Hi Thomas,

I have looked at APEXMALHAR-2260 as well, and it will also be part of this development. Although Apex provides a Python script operator, it is a very limited script implementation. Lambda functions or custom Python functions that have to run as scripts in the Python operator can be serialised using CloudPickle and run on various nodes.
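As a rough sketch of the serialisation idea only (not the actual operator code): the standard pickle module cannot serialise a lambda, while cloudpickle serialises the function by value so that another process can rebuild it. The helper names stdlib_can_pickle, serialize_udf and run_udf are hypothetical and exist only for this illustration. cloudpickle is a third-party package and is assumed to be installed on both the client and the worker nodes.

```python
import pickle

try:
    import cloudpickle  # third-party package: pip install cloudpickle
except ImportError:
    cloudpickle = None


def stdlib_can_pickle(func):
    """Return True if the standard pickle module can serialise func."""
    try:
        pickle.dumps(func)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def serialize_udf(func):
    """Client side (hypothetical helper): serialise a user function by value."""
    if cloudpickle is None:
        raise RuntimeError("cloudpickle is not installed")
    return cloudpickle.dumps(func)


def run_udf(payload, record):
    """Worker side (hypothetical helper): rebuild the function, apply it once.

    cloudpickle must also be importable here, because the payload refers
    to its reconstruction helpers.
    """
    func = pickle.loads(payload)
    return func(record)


if __name__ == "__main__":
    double = lambda x: x * 2
    # Plain pickle stores functions by reference, so a lambda is rejected.
    print(stdlib_can_pickle(double))
    if cloudpickle is not None:
        print(run_udf(serialize_udf(double), 21))
```

Note that this covers only the function itself. Any library the function imports still has to be present on the node that runs it, which is the open question raised in the next paragraph.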
I am still investigating how to ensure that all libraries required by the Python code are made available to operators running on different nodes. One approach, suggested by Cloudera for PySpark jobs, is to make sure all libraries are installed on each node of the cluster. Please do suggest a better alternative for making the Python environment available as required in a cluster environment.

Thanks & Regards,
Vikram

On Sun, Jan 29, 2017 at 1:11 AM, Thomas Weise <t...@apache.org> wrote:

> Hi,
>
> Python support would be great to have. Users look for the ability to use
> Python with its library ecosystem. How will that be possible with this API
> proposal?
>
> I suspect that just being able to wire operators in Python is of limited
> impact when operators cannot execute Python. Have you looked
> at APEXMALHAR-2260 as well?
>
> Thanks
>
>
> On Fri, Jan 27, 2017 at 11:39 PM, vikram patil <patilvik...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I would like to take up development of the Python binding implementation
> > for the high-level APIs (APEXMALHAR-2261). I went over the High-Level APIs
> > from the Apache Malhar Stream API project. It can be initiated as a
> > separate project in the Apache Malhar project, just like the sql or
> > stream project.
> >
> > In the first phase I would like to focus on providing Python bindings for
> > the following APIs:
> >
> > 1) StreamFactory.fromFolder
> > 2) StreamFactory.fromKafka*
> > 3) StreamFactory.fromLocal
> > 4) StreamFactory.fromInput
> > 5) ApexStream.map
> > 6) ApexStream.flatMap
> > 7) ApexStream.filter
> > 9) ApexStream.endWith
> > 11) ApexStream.setGlobalAttribute
> > 12) Custom functions in Python.
> >
> >
> > The rest of the Apex High-Level APIs, such as addStream and addOperator,
> > can be implemented as part of phase II.
> >
> >
> > For this integration, I would like to use py4j as the Python-Java
> > binding due to its wide acceptance and very good community support.
> > py4j also allows callbacks to Python code from Java, which can make
> > certain functionality easier to implement.
> >
> > Py4j Version: 0.10.4
> >
> > Please share your suggestions about this implementation.
> >
> > Thanks & Regards,
> > Vikram