+1 on this feature. We could use py4j, or communicate with a Python process through pipes, to run Python code from the JVM.
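A minimal sketch of the pipe option, under stated assumptions: tuples are newline-delimited text, the worker applies a placeholder `process` function, and the parent side (here written in Python for brevity) stands in for what a JVM operator would do with `java.lang.ProcessBuilder`. The names `WORKER_SOURCE`, `PipeBridge`, `process`, and `run_demo` are hypothetical and not part of Apex or Malhar.

```python
import subprocess
import sys

# Hypothetical worker script: reads one tuple per line from stdin,
# applies the user logic, and writes one result per line to stdout.
WORKER_SOURCE = r'''
import sys

def process(line):
    # Stand-in for user business logic written in Python.
    return line.upper()

for raw in sys.stdin:
    sys.stdout.write(process(raw.rstrip("\n")) + "\n")
    sys.stdout.flush()  # flush per tuple so the parent is never left waiting
'''


class PipeBridge:
    """Parent side of the pipe; a Java operator would play this role."""

    def __init__(self):
        # Start the Python worker with pipes attached to stdin and stdout.
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered in text mode
        )

    def send(self, tuple_text):
        # Write one tuple, then block until the worker returns one result line.
        self.proc.stdin.write(tuple_text + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing stdin signals end of input; the worker loop then exits.
        self.proc.stdin.close()
        code = self.proc.wait()
        self.proc.stdout.close()
        return code


def run_demo():
    bridge = PipeBridge()
    try:
        return [bridge.send(t) for t in ["apex", "malhar"]]
    finally:
        bridge.close()


if __name__ == "__main__":
    print(run_demo())
```

The one-line-in, one-line-out protocol is an assumption made to keep the sketch short. Real tuples would likely need a framed or binary encoding, and the per-tuple round trip shown here is the kind of cross-language cost that would have to be measured.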
- Tushar.

On Fri, Sep 16, 2016 at 12:10 PM, Thomas Weise <t...@apache.org> wrote:

> Jython is not a replacement for Python; it seems to be fairly limited. We
> would need the ability to run Python with all its libraries.
>
> Thomas
>
> On Thu, Sep 15, 2016 at 11:25 PM, David Yan <da...@datatorrent.com> wrote:
>
>> On a very high level, we can build a Python framework in Apex by having a
>> Python binding on our high level API that generates Jython operators with
>> the business logic written by users in Python, along with existing
>> connectors.
>>
>> David
>>
>> On Sep 15, 2016 11:00 PM, "Chinmay Kolhatkar" <chin...@datatorrent.com>
>> wrote:
>>
>> > Strongly +1 on this. One thing that proves this is useful for Apex is
>> > Hadoop streaming, where Python is used to write map-reduce jobs. This
>> > will not only increase the reach in the development world but would
>> > also appeal to administrators who want to write an app, as they are
>> > usually familiar with Python.
>> >
>> > A few suggestions (in no specific order):
>> >
>> > 1. As part of supporting Python execution in operator code, we should
>> > provide a way to specify the complete lifecycle of an operator from
>> > Python.
>> >
>> > 2. I would personally not worry about providing a Python binding for
>> > low-level Apex client APIs like addOperator, addStream, etc. If one has
>> > to do it, I think it's best to use the Java API, as most of the power of
>> > those low-level APIs can be leveraged there.
>> >
>> > 3. For client APIs, I would rather suggest we focus on high-level APIs
>> > like the Apex stream API (malhar-stream). We should provide a complete
>> > Python binding for them. Python is very useful when it comes to
>> > functional programming, and the Stream API provides exactly that.
>> >
>> > 4. Thinking at a very high level, I don't think we need any change in
>> > apex-core for this. This could be another project in Malhar itself.
>> > There are Python libraries like py4j, pyjnius, or JPype which allow
>> > access to Java objects from Python.
>> > Basically, we just need to establish the right bridge between the Java
>> > and Python VMs. We need to be thoughtful about performance, as these
>> > bridges across programming languages are costly.
>> >
>> > 5. We need to decide what the code execution will look like.
>> > For example, should a py file be an alternative to Application.java in
>> > the package? This means the starting point is the Apex CLI, i.e. Java.
>> > Hence, instead of finding classes implementing StreamingApplication,
>> > apexcli needs to find the py file which defines the DAG.
>> > Or should the flow start with "__main__" of the Python file and end up
>> > in Java?
>> >
>> > 6. This might be too early, but it is important to emphasize that we
>> > need to plan for writing examples and documentation for the Python
>> > binding.
>> >
>> > -Chinmay.
>> >
>> >
>> > On Fri, Sep 16, 2016 at 2:36 AM, Thomas Weise <t...@apache.org> wrote:
>> >
>> > > Hi,
>> > >
>> > > Python (not Jython) seems to be a popular language and is frequently
>> > > used for data analysis, especially where flexibility matters. It has
>> > > a comprehensive library and is generally considered to have a low
>> > > barrier to entry. I have also seen Python used in critical back-end
>> > > components, although that's probably not very common?
>> > >
>> > > I think Python support could potentially expand the user base for
>> > > Apex. There are 2 main areas that can be considered:
>> > >
>> > > 1) Support to execute Python code through an operator
>> > > 2) A client API that lets users construct pipelines in Python
>> > >
>> > > The former can exist without the latter. And it would enable users to
>> > > leverage existing code that otherwise would have to be rewritten in a
>> > > JVM language. The engine could ship scripts/packages so they are
>> > > automatically distributed on the cluster.
>> > >
>> > > A useful client API probably requires back-end support for lambda
>> > > functions and more complex UDFs.
>> > >
>> > > Would be great to get some feedback, especially from those that have
>> > > experience with Python, on how an integration could potentially open
>> > > up new use cases for Apex.
>> > >
>> > > Thanks,
>> > > Thomas