Thanks Luciano. IIRC, what users want is to run the whole Spark app, and
they don't care whether it runs in Zeppelin or as a standard Spark
application jar. I know Zeppelin currently doesn't do well at converting a
note into a production Spark app, as Lee mentioned. But exporting a note as
a jar seems like a short-term solution, not a long-term one. I suspect that
once Zeppelin improves in this area, users might abandon this solution and
move back to Zeppelin. Here are some disadvantages I can see with this
approach.

1.  If users want to change the code during iterative development, they have
to repeat the whole pipeline (change code in Zeppelin -> export it to a
Spark jar -> redeploy the jar). This process is painful and wastes time.
2.  It is hard to debug and diagnose, because the code is
changed/restructured when it is exported to a jar.
3.  Users have to use several distinct tools for the whole development
cycle (Zeppelin, a Spark job server, and maybe a cron job).

Besides, the OOM issue of the Spark REPL that Lee mentioned might not be a
problem, because we can shut down the app (close the interpreter) after the
app is done.
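
To make the last point concrete, here is a rough sketch of how an external
scheduler could drive that cycle over Zeppelin's REST API: run the note,
check its status, then restart the interpreter so the REPL's memory is
released. I'm assuming the endpoint paths from the Zeppelin REST API docs as
I remember them (/api/notebook/job/{noteId} and
/api/interpreter/setting/restart/{settingId}); please check them against the
version you run. The function name, server address, and IDs are made up for
illustration, and the code only builds the requests, it does not send them.

```python
import urllib.request


def zeppelin_requests(base_url, note_id, interpreter_setting_id):
    """Build (run, status, restart) requests for one note.

    Hypothetical helper: the endpoint paths are assumptions taken from the
    Zeppelin REST API documentation and may differ between versions.
    """
    base = base_url.rstrip("/")
    job_url = f"{base}/api/notebook/job/{note_id}"

    # Ask Zeppelin to run all paragraphs of the note.
    run = urllib.request.Request(job_url, data=b"", method="POST")

    # Ask for the status of the paragraphs in the same note.
    status = urllib.request.Request(job_url, method="GET")

    # Restart the interpreter setting, which shuts down the running REPL.
    restart = urllib.request.Request(
        f"{base}/api/interpreter/setting/restart/{interpreter_setting_id}",
        data=b"",
        method="PUT",
    )
    return run, status, restart


# Made-up server address and IDs, for illustration only.
run, status, restart = zeppelin_requests(
    "http://localhost:8080", "NOTE_ID", "SETTING_ID"
)
print(run.get_method(), run.full_url)
print(status.get_method(), status.full_url)
print(restart.get_method(), restart.full_url)
```

Each request can be sent with urllib.request.urlopen(req). As far as I know
the run call can return before the paragraphs finish, so a real scheduler
would poll the status request until the note is done before issuing the
restart.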





On Thu, Jan 5, 2017 at 3:59 PM, Luciano Resende <luckbr1...@gmail.com> wrote:

Some use cases discussed earlier on this thread:

https://www.mail-archive.com/dev@zeppelin.apache.org/msg06323.html

https://www.mail-archive.com/dev@zeppelin.apache.org/msg06332.html

On Wed, Jan 4, 2017 at 4:51 PM, Jianfeng (Jeff) Zhang <
jzh...@hortonworks.com> wrote:

>
> I don't understand why users want to export a Zeppelin note as a Spark
> application.
>
> If they want to trigger a run of the Spark app, why not use Zeppelin's
> REST API for that? Even if users export it as a Spark application, most of
> the time in reality they need to submit it through a Spark job server, so
> why not use Zeppelin as the Spark job server?
> And if the Spark app fails, it is pretty hard to debug, because the
> exporting tool has changed/restructured the source code.
>
>
> If this is a pretty large and complicated Spark application, I don't think
> Zeppelin is the proper tool for it; they'd better use an IDE for that
> project.
>
> BTW, after https://github.com/apache/zeppelin/pull/1799, users can define
> dependencies between paragraphs, and they can run a whole note that
> contains different interpreters.
>
>
>
> Best Regards,
> Jeff Zhang
>
>
>
>
>
> On 1/5/17, 2:25 AM, "Luciano Resende" <luckbr1...@gmail.com> wrote:
>
> >I have made some progress with a tool to handle the points discussed in
> >this thread. It's currently a command-line tool: given a Zeppelin
> >notebook (note.json), it generates a Spark Scala application, compiles it
> >using the compiler embedded in the Scala SDK, and then packages all these
> >resources into a jar that works with the spark-submit command.
> >
> >I would like to start prototyping the integration into the Zeppelin UI,
> >and I was wondering if it would be OK to use the above jar as a dependency
> >(e.g. from a Maven release) and integrate it into Zeppelin...
> >
> >Thoughts ?
> >
> >
> >On Mon, Sep 19, 2016 at 7:47 AM, Sourav Mazumder <
> >sourav.mazumde...@gmail.com> wrote:
> >
> >> To Moon's point, this is my vision for this feature:
> >>
> >> 1. The user should be able to package one, several, or all of the
> >> paragraphs in a notebook into a jar file that can be used with
> >> spark-submit.
> >>
> >> 2. The tool should automatically remove all the interactive statements,
> >> such as print, show, etc.
> >>
> >> 3. The tool should automatically create a Main class in addition to the
> >> jar file(s), which will internally call the respective jar. The user can
> >> then change this Main class if needed for parameterization through args.
> >>
> >> Regards,
> >> Sourav
> >>
> >> On Mon, Sep 19, 2016 at 7:33 AM, Sourav Mazumder <
> >> sourav.mazumde...@gmail.com> wrote:
> >>
> >> > I am also very much in favor of this.
> >> >
> >> > I have gotten a similar request from every person/group to whom I
> >> > have showcased Zeppelin.
> >> >
> >> > Regards,
> >> > Sourav
> >> >
> >> > On Fri, Sep 16, 2016 at 8:06 PM, moon soo Lee <m...@apache.org> wrote:
> >> >
> >> >> Hi Luciano,
> >> >>
> >> >> I've also gotten a lot of questions about "productizing the notebook"
> >> >> every time I meet users who use Zeppelin in their work.
> >> >>
> >> >> I think it's actually about two different problems that Zeppelin
> >> >> needs to address.
> >> >>
> >> >> *1) Provide a way for an interactive notebook to become part of a
> >> >> production data pipeline.*
> >> >>
> >> >> Although Zeppelin does have a quite convenient cron-like scheduler for
> >> >> each Note, the built-in cron scheduler is not ready for serious
> >> >> production use, because it lacks features such as actions after
> >> >> success/failure, fault tolerance, history, and so on. I think the
> >> >> community is working on improving it, and it's going to take some time.
> >> >> Meanwhile, any external enterprise-level job scheduler can run a Note
> >> >> or Paragraph via the REST API. But we don't have any guide or examples
> >> >> covering which REST APIs users can use for this purpose and how to use
> >> >> them in various cases (e.g. with authentication on, dynamic form
> >> >> parameters, etc.). I think a lot of things need to be improved to make
> >> >> it easier for Zeppelin to be part of a production pipeline.
> >> >>
> >> >> *2) Provide a stable way to run Spark paragraphs.*
> >> >>
> >> >> Another barrier to using a notebook in a production pipeline is the
> >> >> Scala REPL in SparkInterpreter. SparkInterpreter uses the Scala REPL to
> >> >> provide an interactive Scala session, and the Scala REPL will
> >> >> eventually hit an OOME as it compiles and runs statements. The current
> >> >> workaround in Zeppelin is that the cron scheduler inside the notebook
> >> >> has a checkbox that can restart the Note after the scheduler runs it.
> >> >> Of course, that option does not apply when an external scheduler runs
> >> >> the job through the REST API.
> >> >>
> >> >> I think what Luciano is suggesting, "Export Spark Paragraph as Spark
> >> >> application", is interesting. If Spark paragraphs can be easily
> >> >> packaged into a jar (Spark application), that can be one way to
> >> >> address 1) and 2), in case the user already has a stable way to
> >> >> schedule a Spark application jar.
> >> >>
> >> >> Actually, the Flink interactive shell works in a similar way
> >> >> internally, as far as I know, i.e. it packages compiled classes into a
> >> >> jar and submits it.
> >> >>
> >> >> One idea for prototyping:
> >> >> how about making an interpreter inside the Spark interpreter group,
> >> >> say %spark.build or some better name?
> >> >>
> >> >> And if user runs some command like
> >> >>
> >> >> %spark.build
> >> >> package
> >> >>
> >> >> then it builds a Spark application jar based on the Spark paragraphs in
> >> >> the Note. I think it can be the simplest user interface for the
> >> >> prototype.
> >> >>
> >> >> Thanks,
> >> >> moon
> >> >>
> >> >> On Fri, Sep 16, 2016 at 1:11 PM Jeremy Anderson
> >> >> <jer...@objectadjective.com> wrote:
> >> >>
> >> >> > Luciano, I think this would be a terrific feature. I've heard the
> >> >> > exact same workflow you've described in all of the research we've
> >> >> > done.
> >> >> >
> >> >> > ...........................
> >> >> >
> >> >> > Jeremy Anderson
> >> >> > Founder, Object Adjective
> >> >> > 415.493.8489
> >> >> > jer...@objectadjective.com
> >> >> > objectadjective.com <http://about.me/jeremyanderson>
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >> > On 16 September 2016 at 12:19, Luciano Resende
> >> >> > <luckbr1...@gmail.com> wrote:
> >> >> >
> >> >> > > While talking with a few different users, I have very often seen
> >> >> > > the use case of developing iteratively in notebooks or the Spark
> >> >> > > shell and then copying and pasting the final solution into a
> >> >> > > formal application.
> >> >> > >
> >> >> > > I was wondering if an "Export Spark Paragraphs as a Spark
> >> >> > > Application (jar)" feature is something the Zeppelin community
> >> >> > > would find useful. Keep in mind there are some limitations here:
> >> >> > > we would be constrained to Spark-related paragraphs, etc. Even so,
> >> >> > > I think there are multiple scenarios where the ability to have an
> >> >> > > application that runs directly on Spark would be very useful.
> >> >> > >
> >> >> > > If the community is interested, let's use this thread to discuss
> >> >> > > any specific requirements or suggestions that others might have,
> >> >> > > and after a few days I would like to start prototyping this
> >> >> > > functionality.
> >> >> > >
> >> >> > > Thoughts ?
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > --
> >> >> > > Luciano Resende
> >> >> > > http://twitter.com/lresende1975
> >> >> > > http://lresende.blogspot.com/
> >> >> > >
> >> >> >
> >> >>
> >> >
> >> >
> >>
> >
> >
> >
> >--
> >Luciano Resende
> >http://twitter.com/lresende1975
> >http://lresende.blogspot.com/
>
>


--
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/
