Hi JB,

Thanks a lot for your kind attention. I would be very happy to take your advice on this implementation. :)

I am planning to do this for GSoC 2016, since it has been published as a project idea this year. Here is the plan in brief.

Users should be able to build pipelines in a Zeppelin notebook using the commands provided by the Beam SDK (Dataflow SDK). The Beam interpreter should then interpret and execute those SDK commands at the back end and return the output. Since Beam currently provides an SDK only for Java, I am going to use Java-REPL <https://github.com/albertlatacz/java-repl> to interpret the Java commands at the Zeppelin back end.

I will create a draft proposal for this implementation and share it with you. I would like to have your comments on it.

Thanks and regards,
Minudika

Minudika Malshan
Undergraduate
Department of Computer Science and Engineering
University of Moratuwa
Sri Lanka.

On Thu, Mar 10, 2016 at 2:39 PM, Jean-Baptiste Onofré <[email protected]> wrote:

> Hi Minudika,
>
> Oh, interesting for Zeppelin. What do you plan to do ? Implement the
> zeppelin notebook backend with Beam (the zeppelin analytics would be
> implemented as beam pipelines) ? I would be happy to help if you need.
>
> Regards
> JB
>
> On 03/10/2016 09:47 AM, Minudika Malshan wrote:
>
>> Hi,
>>
>> This is related with the implementation of a beam interpreter for Apache
>> zeppelin. I think for the first phase, DirectPipelineRunner will do the
>> job :)
>> Please let me know if there is anything which can be helpful.
>>
>> Thanks and regards.
>> Minudika
>>
>> Minudika Malshan
>> Undergraduate
>> Department of Computer Science and Engineering
>> University of Moratuwa
>> Sri Lanka.
>>
>> On Thu, Mar 10, 2016 at 12:11 PM, Jean-Baptiste Onofré <[email protected]> wrote:
>>
>>> By the way, on my side, I will work on a Karaf/OSGi
>>> (http://karaf.apache.org) runner for Beam (with shell commands, features,
>>> etc).
>>> I will start it just after the work on new IOs.
>>>
>>> Regards
>>> JB
>>>
>>> On 03/09/2016 08:01 PM, Minudika Malshan wrote:
>>>
>>>> Hi,
>>>>
>>>> Thanks a lot for your quick responses.
>>>> I will refer those resources.
>>>>
>>>> Regards,
>>>> Minudika
>>>>
>>>> Minudika Malshan
>>>> Undergraduate
>>>> Department of Computer Science and Engineering
>>>> University of Moratuwa
>>>> Sri Lanka.
>>>>
>>>> On Thu, Mar 10, 2016 at 12:24 AM, Lukasz Cwik <[email protected]> wrote:
>>>>
>>>>> There are currently two implementations which do not require the cloud:
>>>>>
>>>>> The DirectPipelineRunner
>>>>> <https://github.com/apache/incubator-beam/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/DirectPipelineRunner.java>
>>>>> which is mainly used for testing and local development. This runner has
>>>>> several limits (data size, no support for unbounded collections, ...)
>>>>> and is being expanded to support more use cases, for example adding
>>>>> unbounded PCollection support
>>>>> <https://issues.apache.org/jira/browse/BEAM-22>.
>>>>>
>>>>> The FlinkPipelineRunner
>>>>> <https://github.com/apache/incubator-beam/tree/master/runners/flink>
>>>>> which can be used to execute locally or on a Flink cluster.
>>>>>
>>>>> There is also ongoing work to bring Spark
>>>>> <https://issues.apache.org/jira/browse/BEAM-6> into the mix as a runner
>>>>> and suggestions for other runners such as GearPump
>>>>> <https://github.com/gearpump/gearpump>.
>>>>>
>>>>> On Wed, Mar 9, 2016 at 10:37 AM, Minudika Malshan <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> As per my knowledge about Apache beam and data flow sdk, at the first
>>>>>> data flow sdk has been developed targeting google cloud platform.
>>>>>> So we have to deploy pipelines in the cloud.
>>>>>>
>>>>>> But my question is, can not we use this sdk for standalone
>>>>>> implementations without cloud. If so, I would love to have a look at
>>>>>> some examples of such implementations.
>>>>>> Your kind help is much appreciated.
>>>>>>
>>>>>> Regards,
>>>>>> Minudika
>>>>>>
>>>>>> Minudika Malshan
>>>>>> Undergraduate
>>>>>> Department of Computer Science and Engineering
>>>>>> University of Moratuwa
>>>>>> Sri Lanka.
>>>>>>
>>> --
>>> Jean-Baptiste Onofré
>>> [email protected]
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com
>>>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
