Ismael - thanks, adding scripting language support to Beam is an awesome idea and we should absolutely do it.
However, I think the current proposal can be made significantly more general, and it would benefit from a formal design discussion. E.g. a few points I can think of that seem very important but currently aren't covered by the PR:
- Having the script return multiple values per element
- Scripting arbitrary user-code callbacks rather than a whole PTransform, e.g. writing the various lambdas of FileIO.writeDynamic() in a scripting language
- Integration with Beam SQL
- Specifying dependencies (does this require anything special?)

And less critical, but also important or potentially very useful, points:
- Support for side inputs and for multiple output tags
- Supporting asynchronous API calls from the script
- Supporting batching multiple elements together

On Fri, Mar 23, 2018 at 12:09 PM Tyler Akidau <taki...@google.com> wrote:

> +1, I like it. Thanks!
>
> On Fri, Mar 23, 2018 at 9:03 AM Ahmet Altay <al...@google.com> wrote:
>
>> Thank you Ismaël, this looks really cool.
>>
>> On Fri, Mar 23, 2018 at 5:33 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>> Hi,
>>>
>>> it sounds like a very good extension mechanism to PTransform.
>>>
>>> +1
>>>
>>> Regards
>>> JB
>>>
>>> On 03/23/2018 12:03 PM, Ismaël Mejía wrote:
>>> > This is a really simple proposal to add an extension with transforms
>>> > that package the Java Scripting API (JSR-223) [1] to allow users to
>>> > specialize some transforms via a scripting language. This work was
>>> > initially created by Romain [2] and I just took it with his
>>> > authorization and refined it to make it pass all the Beam validations
>>> > + style. I also added ValueProviders that allow users to template now
>>> > scripts also in Dataflow.
>>> >
>>> > Notice that Dataflow recently added something similar to create really
>>> > simple data movement pipelines [3], so maybe the rest of the community
>>> > can benefit of a similar extension (and eventually dataflow may
>>> > converge to this implementation).
>>> >
>>> > I hope there is interest in this extension, so far we have a
>>> > ScriptingParDo transform to show the idea, hopefully we can expand
>>> > this to other transforms.
>>> >
>>> > For those interested in more details you can check the Jira issue [4]
>>> > and the PR [5].
>>> >
>>> > [1] https://www.jcp.org/en/jsr/detail?id=223
>>> > [2] https://github.com/rmannibucau/beam-jsr223
>>> > [3] https://cloud.google.com/blog/big-data/2018/03/pre-built-cloud-dataflow-templates-kiss-for-data-movement
>>> > [4] https://issues.apache.org/jira/browse/BEAM-3921
>>> > [5] https://github.com/apache/beam/pull/4944
>>> >
>>>
>>> --
>>> Jean-Baptiste Onofré
>>> jbono...@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com