The biggest appeal of Spark in Zeppelin is its interactivity, i.e. the
ability to pull data from RDDs to the driver/web UI via actions (take,
collect, top).
There is no equivalent of actions in Beam/Dataflow, only transformations
(apply(transform)). How's that gonna work with Spark?
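
To make the gap concrete, here is a toy sketch in plain Python. It is not the Spark or Beam API: EagerDataset, DeferredPipeline and their methods are hypothetical names made up for illustration. It only shows the difference between an action that hands data straight back to the caller and an apply() that merely records a step until the pipeline is run.

```python
# Toy model only: these classes are hypothetical and are NOT the Spark or Beam APIs.


class EagerDataset:
    """Spark-like side: an action returns data to the caller right away."""

    def __init__(self, items):
        self._items = list(items)

    def map(self, fn):
        return EagerDataset(fn(x) for x in self._items)

    def take(self, n):
        # Action: the first n results come straight back to the notebook.
        return self._items[:n]


class DeferredPipeline:
    """Beam-like side: apply() only records a step; nothing runs until run()."""

    def __init__(self, items):
        self._items = list(items)
        self._steps = []

    def apply(self, fn):
        self._steps.append(fn)
        return self

    def run(self, sink):
        # Results leave the pipeline only through an explicit sink.
        for x in self._items:
            for fn in self._steps:
                x = fn(x)
            sink.append(x)


# Interactive style: ask for a preview and get it immediately.
preview = EagerDataset(range(5)).map(lambda x: x * 2).take(3)

# Pipeline style: building the graph produces no data at all.
sink = []
pipeline = DeferredPipeline(range(5)).apply(lambda x: x * 2)
before_run = list(sink)
pipeline.run(sink)
```

In the second half there is nothing to show in a notebook cell until run() has finished and something has read the sink, which is the part that worries me.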

In scio-repl we have semi-interactivity: each context corresponds to a
Dataflow job, but you have to close the context before you can collect data
back into the REPL through a Future.
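
Roughly, the close-then-collect flow looks like the sketch below. This is a toy model in plain Python, not the scio API: ToyContext, ToyCollection, parallelize, materialize and close are hypothetical names, and "running the job" is just a local function call here. The only real library piece is concurrent.futures.Future from the standard library.

```python
from concurrent.futures import Future


class ToyContext:
    """Hypothetical stand-in for a job context; NOT the scio API."""

    def __init__(self):
        self._pending = []  # (future, compute) pairs waiting for the job to run

    def parallelize(self, items):
        return ToyCollection(self, lambda: list(items))

    def close(self):
        # Closing the context "runs the job" and fulfils every pending future.
        for future, compute in self._pending:
            future.set_result(compute())


class ToyCollection:
    """Hypothetical deferred collection tied to one ToyContext."""

    def __init__(self, context, compute):
        self._context = context
        self._compute = compute

    def map(self, fn):
        # Still deferred: only wraps the previous computation.
        return ToyCollection(
            self._context, lambda: [fn(x) for x in self._compute()]
        )

    def materialize(self):
        # Register a future that is only completed when the context closes.
        future = Future()
        self._context._pending.append((future, self._compute))
        return future


ctx = ToyContext()
squares = ctx.parallelize(range(4)).map(lambda x: x * x).materialize()
ready_before_close = squares.done()  # the job has not run yet
ctx.close()                          # one context == one job
result = squares.result()            # now the data is back in the REPL
```

Every further exploration step needs a fresh context, i.e. another job, which is why I call it only semi-interactive.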

On Tue, May 17, 2016 at 9:03 AM Ismaël Mejía <[email protected]> wrote:

> Last week during the Apache Big Data / ApacheCon conference I attended some
> presentations, and one aspect that surprised me is how Apache Zeppelin was
> used by many presenters to show their data processing code (mostly in
> Python/Scala).
>
> I consider that even if this integration is not critical for Apache Beam, it
> is important to support it, and I intend to collaborate on this task. I
> just created an issue on JIRA for the people interested:
> https://issues.apache.org/jira/browse/BEAM-290
>
> I briefly discussed with Alexander Bezzubov from Zeppelin an initial plan
> to support Beam in three phases:
>
> 1. Support the Scala SDK (scio) + Scala runners (Spark):
>
> This is first since most of the pieces already exist; we just need to put
> things together.
>
> 2. Integrate the Java SDK
>
> The big issue here is that there is not (yet) a decent Java REPL tool, and
> support for such a REPL in Zeppelin is ongoing work.
>
> 3. Integrate the Python SDK
>
> This one depends on the release of the Python SDK in the upcoming weeks, and
> its priority can change if integration is easier than the other two tasks.
>
> Of course this message is a call to other interested parties to contribute,
> e.g. ideas, an agenda to prioritize certain runners, or other complementary
> tasks to achieve the goals, like integrating scio, supporting the Google
> Storage backend for the notebooks (to make a nicer integration for users of
> the runner in the Google cloud), etc.
>
> Ismaël Mejía
>
