The biggest appeal of Spark in Zeppelin is its interactivity, i.e. the ability to pull data from RDDs back to the driver/web UI via actions (take, collect, top). There is no equivalent of actions in Beam/Dataflow, only transformations (apply(transform)). How's that going to work with Spark?
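
To make the contrast concrete, here is a rough toy sketch in plain Python. It uses neither the Spark nor the Beam API; EagerDataset, DeferredPipeline and all of their methods are hypothetical names made up for illustration. The first class has an action that hands elements back to the caller, while the second only records steps and can only emit results into a sink when the whole job runs.

```python
from itertools import islice


class EagerDataset:
    """Toy stand-in for an RDD-like dataset (hypothetical, not the Spark API)."""

    def __init__(self, make_iter):
        # Zero-argument callable that returns a fresh iterator each time.
        self._make_iter = make_iter

    @classmethod
    def of(cls, items):
        items = list(items)
        return cls(lambda: iter(items))

    def map(self, fn):
        # Transformation: builds a new dataset, computes nothing yet.
        return EagerDataset(lambda: (fn(x) for x in self._make_iter()))

    def take(self, n):
        # Action: pulls up to n elements back to the caller (the "driver").
        return list(islice(self._make_iter(), n))


class DeferredPipeline:
    """Toy stand-in for a transform-only pipeline (hypothetical, not the Beam API)."""

    def __init__(self, source):
        self._source = list(source)
        self._steps = []

    def apply(self, fn):
        # Only records the step; there is no method that returns elements.
        self._steps.append(fn)
        return self

    def run(self, sink):
        # Results become visible only by writing to a sink when the job runs.
        for x in self._source:
            for fn in self._steps:
                x = fn(x)
            sink.append(x)


# Interactive style: ask for a few elements and get them back immediately.
first_three = EagerDataset.of(range(10)).map(lambda x: x * x).take(3)

# Transform-only style: nothing is observable until the whole job has run.
sink = []
pipeline = DeferredPipeline(range(10)).apply(lambda x: x * x)
sink_before_run = list(sink)
pipeline.run(sink)
```

In a notebook, the first style is what lets you poke at intermediate data cell by cell; with the second, every peek means running a complete job and reading its output back.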
In scio-repl we have semi-interactivity: each context corresponds to a Dataflow job, but you have to close the context before collecting data back into the REPL via a Future.

On Tue, May 17, 2016 at 9:03 AM Ismaël Mejía <[email protected]> wrote:

> Last week during the Apache Big Data / ApacheCon conference I attended
> some presentations, and one aspect that surprised me is how Apache
> Zeppelin was used by many presenters to show their data processing code
> (mostly in Python/Scala).
>
> I consider that even if this integration is not critical for Apache Beam,
> it is important to support it, and I intend to collaborate on this task.
> I just created an issue on JIRA for the people interested:
> https://issues.apache.org/jira/browse/BEAM-290
>
> I briefly discussed with Alexander Bezzubov from Zeppelin an initial
> plan to support Beam in three phases:
>
> 1. Support the Scala SDK (scio) + Scala runners (Spark):
>
> This is first since most of the pieces exist already; we just need to put
> things together.
>
> 2. Integrate the Java SDK
>
> The big issue here is that there is not (yet) a decent Java REPL tool, and
> support for such a REPL in Zeppelin is ongoing work.
>
> 3. Integrate the Python SDK
>
> This one depends on the release of the Python SDK in the upcoming weeks,
> and its priority can change if integration is easier than the other two
> tasks.
>
> Of course this message is a call to other interested parties to
> contribute, e.g. ideas, an agenda to prioritize certain runners, or other
> complementary tasks to achieve the goals, like integrating scio or
> supporting the Google storage backend for the notebooks (to make a nicer
> integration for users of the runner in the Google cloud), etc.
>
> Ismaël Mejía
