I would like to help in contributing to this feature.

On Wed, Sep 21, 2016 at 12:26 AM, Sasha Parfenov <sas...@apache.org> wrote:
> +1 on both executing Python code in an operator and a high-level API for
> constructing pipelines in Python.
>
> There is a large user base of engineers and data scientists who use
> Python on a regular basis for crunching through big data. Providing them
> with a powerful new platform for big data processing, wrapped in a familiar
> language, will open Apex to a much broader user base and help grow the
> project.
>
> Given the potentially new user base of Python developers, it may make sense
> to prioritize the high-level API for pipeline construction. This will
> allow users to build simple applications with existing library operators,
> and we can get feedback on what areas they would like to see improved next:
> custom Python operator support or more built-in library operators.
>
> Thanks,
> Sasha
>
> On Thu, Sep 15, 2016 at 2:06 PM, Thomas Weise <t...@apache.org> wrote:
>
> > Hi,
> >
> > Python (not Jython) seems to be a popular language and is frequently used
> > for data analysis, especially where flexibility matters. It has a
> > comprehensive library and is generally considered to have a low barrier
> > to entry. I have also seen Python used in critical back-end components,
> > although that's probably not very common?
> >
> > I think Python support could potentially expand the user base for Apex.
> > There are 2 main areas that can be considered:
> >
> > 1) Support to execute Python code through an operator
> > 2) A client API that lets users construct pipelines in Python
> >
> > The former can exist without the latter, and it would enable users to
> > leverage existing code that otherwise would have to be rewritten in a JVM
> > language. The engine could ship scripts/packages so they are automatically
> > distributed on the cluster.
> >
> > A useful client API probably requires back-end support for lambda
> > functions and more complex UDFs.
> >
> > Would be great to get some feedback, especially from those that have
> > experience with Python, on how an integration could potentially open up
> > new use cases for Apex.
> >
> > Thanks,
> > Thomas