+1 on this feature.

We could use py4j, or communicate with a Python process through pipes,
to run Python code from the JVM.
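
A minimal sketch of the pipe option, to make it concrete. It assumes a
line-oriented protocol (one tuple per line in, one result per line out). The
names PipedPythonWorker, WORKER_SOURCE and process are hypothetical, and the
parent side is written in Python here only so the sketch is self-contained; in
Apex that role would be played by a Java operator.

```python
import subprocess
import sys

# Hypothetical worker script: reads one tuple per line from stdin, applies the
# user's logic, and writes one result per line to stdout.
WORKER_SOURCE = r"""
import sys

def process(line):
    # Placeholder business logic: upper-case the incoming tuple.
    return line.upper()

for raw in sys.stdin:
    sys.stdout.write(process(raw.rstrip("\n")) + "\n")
    sys.stdout.flush()  # flush per tuple so the parent is never left waiting
"""


class PipedPythonWorker:
    """Parent side of the pipe (stands in for the JVM operator in this sketch)."""

    def __init__(self):
        # Start a CPython child process and connect to its stdin/stdout.
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,
        )

    def process(self, tuple_str):
        # Send one tuple, then block until the matching result line arrives.
        self.proc.stdin.write(tuple_str + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing stdin ends the worker's read loop, so the child exits cleanly.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


if __name__ == "__main__":
    worker = PipedPythonWorker()
    print(worker.process("hello"))
    worker.close()
```

Each tuple costs a synchronous round trip across the pipe in this sketch, so
batching would probably be needed for real throughput.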

- Tushar.



On Fri, Sep 16, 2016 at 12:10 PM, Thomas Weise <t...@apache.org> wrote:
> Jython is not a replacement for Python; it seems to be fairly limited. We
> would need the ability to run Python with all its libraries.
>
> Thomas
>
> On Thu, Sep 15, 2016 at 11:25 PM, David Yan <da...@datatorrent.com> wrote:
>
>> On a very high level, we can build a Python framework in Apex by having a
>> Python binding on our high level API that generates Jython operators with
>> the business logic written by users in Python, along with existing
>> connectors.
>>
>> David
>>
>> On Sep 15, 2016 11:00 PM, "Chinmay Kolhatkar" <chin...@datatorrent.com>
>> wrote:
>>
>> > Strongly +1 on this. One thing that proves this is useful for Apex is
>> > Hadoop streaming, where Python is used to write map-reduce jobs. This will
>> > not only increase Apex's reach in the development world but would also
>> > appeal to administrators who want to write an app, as they usually know
>> > Python.
>> >
>> >
>> > A few suggestions (in no particular order):
>> > 1. As part of supporting Python execution in operator code, we should
>> > allow the complete lifecycle of an operator to be specified from Python.
>> >
>> > 2. I would personally not worry about providing a Python binding for
>> > low-level Apex client APIs like addOperator, addStream, etc. If one has to
>> > use them, I think it's best to use the Java API, as most of the power of
>> > those low-level APIs can be leveraged there.
>> >
>> > 3. For client APIs, I would rather suggest we focus on high-level APIs like
>> > the Apex stream API (malhar-stream). We should provide a complete Python
>> > binding for them. Python is very useful when it comes to functional
>> > programming, and the Stream API provides exactly that.
>> >
>> > 4. Thinking at a very high level, I don't think we need any change in
>> > apex-core for this. This could be another project in malhar itself. There
>> > are Python libraries like py4j, pyjnius, or JPype which allow access to
>> > Java objects from Python.
>> > Basically, we just need to establish the right bridge between the Java and
>> > Python VMs. We need to be thoughtful about performance, as these bridges
>> > across programming languages are costly.
>> >
>> > 5. We need to decide what the code execution will look like for this. For
>> > example, should a py file be an alternative to Application.java in the
>> > package? This means the starting point is the apex cli, i.e. Java. Hence,
>> > instead of finding classes implementing StreamingApplication, apexcli
>> > needs to find the py file which defines the DAG.
>> > Or should the flow start with "__main__" of the Python file and end up in
>> > Java?
>> >
>> > 6. This might be too early, but it is important to emphasize that we need
>> > to plan for writing examples and documentation for the Python binding.
>> >
>> > -Chinmay.
>> >
>> >
>> >
>> > On Fri, Sep 16, 2016 at 2:36 AM, Thomas Weise <t...@apache.org> wrote:
>> >
>> > > Hi,
>> > >
>> > > Python (not Jython) seems to be a popular language and is frequently used
>> > > for data analysis, especially where flexibility matters. It has a
>> > > comprehensive library and is generally considered to have a low barrier
>> > > to entry. I have also seen Python used in critical back-end components,
>> > > although that's probably not very common?
>> > >
>> > > I think Python support could potentially expand the user base for Apex.
>> > > There are 2 main areas that can be considered:
>> > >
>> > > 1) Support to execute Python code through an operator
>> > > 2) A client API that lets users construct pipelines in Python
>> > >
>> > > The former can exist without the latter. And it would enable users to
>> > > leverage existing code that otherwise would have to be rewritten in a JVM
>> > > language. The engine could ship scripts/packages so they are
>> > > automatically distributed on the cluster.
>> > >
>> > > A useful client API probably requires back-end support for lambda
>> > > functions and more complex UDFs.
>> > >
>> > > Would be great to get some feedback, especially from those that have
>> > > experience with Python, on how an integration could potentially open up
>> > > new use cases for Apex.
>> > >
>> > > Thanks,
>> > > Thomas
>> > >
>> >
>>
