Hi Luciano, maybe I am wrong, just my two cents for your consideration.


On Fri, Jan 6, 2017 at 8:32 AM, Jeff Zhang <zjf...@gmail.com> wrote:

>
> Thanks Luciano. I am not saying the community doesn't think this is a good
> idea. It's just my personal opinion (maybe with some bias, since I haven't
> talked with as many customers as you have). I just feel you might spend the
> time on improving Zeppelin so that it can do the job itself, rather than
> exporting the jar and leveraging other tools to deploy it. I don't want you
> to waste time only to find out that customers are happy to do all of this
> in one central place: Zeppelin. Anyway, this is just my personal thinking;
> you can talk with your customers to hear their feedback.
>
>
> On Fri, Jan 6, 2017 at 5:01 AM, Luciano Resende <luckbr1...@gmail.com> wrote:
>
> Hi Jeff,
>
> I agree that what you mentioned is completely acceptable for some users,
> particularly the data science personas. Having said that, I have been
> talking with multiple enterprise companies that have their own scheduler
> infrastructure with different quality of service, or that just want to
> deploy this as an app into their production environment, which has much
> more resources for running these apps with complete data sets. Currently
> they finish the experiment/development of the application in an
> interactive environment and then move their final code into a native Spark
> application.
>
> Zeppelin is evolving quickly in this area, and I think that export as an
> application might be a good option for users that want to actually deploy
> their notebooks as native applications into their own Spark cluster.
>
> Having said that, if the community feels that this is not a required
> function in Zeppelin anymore, then I can continue with the development of
> the tool as a standalone command line tool. I was even thinking about
> expanding the functionality and implementing what is described in
> ZEPPELIN-1793.
>
> Thoughts ?
>
> On Thu, Jan 5, 2017 at 12:38 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>
> > Thanks Luciano. IIRC, what users want is to run the whole Spark app, and
> > they don't care whether it runs in Zeppelin or through a standard Spark
> > app jar. I know Zeppelin currently doesn't do well at converting a note to
> > a production Spark app, as Lee mentioned. But exporting a note as a jar
> > seems a short-term solution, not a long-term one. I just feel that when
> > Zeppelin improves in this field, users might abandon this solution and move
> > back to Zeppelin. Here are some disadvantages I can see in this approach.
> >
> > 1.  If users want to change the code during iterative development, they
> > have to repeat the whole pipeline (change code in Zeppelin -> export it to
> > a Spark jar -> redeploy this jar). This process is painful and wastes time.
> > 2.  It is hard to debug and diagnose, as the code is changed/restructured
> > when exporting to a jar.
> > 3.  Users have to leverage several distinct tools for the whole
> > development cycle (Zeppelin, Spark job server, and maybe a cron job).
> >
> > Besides, the OOM issue of the Spark REPL that Lee mentioned might not be a
> > problem, because we can shut down the app (close the interpreter) after the
> > app is done.
> >
> >
> >
> >
> >
> > On Thu, Jan 5, 2017 at 3:59 PM, Luciano Resende <luckbr1...@gmail.com> wrote:
> >
> > Some use cases discussed earlier on this thread:
> >
> > https://www.mail-archive.com/dev@zeppelin.apache.org/msg06323.html
> >
> > https://www.mail-archive.com/dev@zeppelin.apache.org/msg06332.html
> >
> > On Wed, Jan 4, 2017 at 4:51 PM, Jianfeng (Jeff) Zhang <
> > jzh...@hortonworks.com> wrote:
> >
> > >
> > > I don't understand why users want to export a Zeppelin note as a Spark
> > > application.
> > >
> > > If they want to trigger the running of a Spark app, why not use Zeppelin's
> > > REST API for that? Even if users export it as a Spark application, most of
> > > the time in reality they need to submit it through a Spark job server, so
> > > why not use Zeppelin as a Spark job server?
> > > And if the Spark app fails, it is pretty hard to debug, because the
> > > exporting tool has changed/restructured the source code.
> > >
> > >
> > > If this is a pretty large and complicated Spark application, I don't think
> > > Zeppelin is the proper tool for that; they'd better use an IDE for that
> > > project.
> > >
> > > BTW, after https://github.com/apache/zeppelin/pull/1799, users can define
> > > the dependencies between paragraphs, and they can run one whole note that
> > > contains different interpreters.
> > >
> > >
> > >
> > > Best Regards,
> > > Jeff Zhang
> > >
> > >
> > >
> > >
> > >
> > > On 1/5/17, 2:25 AM, "Luciano Resende" <luckbr1...@gmail.com> wrote:
> > >
> > > >I have made some progress with a tool to handle the points discussed in
> > > >this thread. It's currently a command line tool: given a Zeppelin
> > > >notebook (note.json), it generates a Spark Scala application, compiles it
> > > >using the compiler embedded in the Scala SDK, and then packages all these
> > > >resources into a jar that works with the spark-submit command.
> > > >
> > > >I would like to start prototyping the integration into the Zeppelin UI,
> > > >and I was wondering if it would be OK to use the above jar as a dependency
> > > >(e.g. from a Maven release) and integrate it into Zeppelin...
> > > >
> > > >Thoughts ?
> > > >
> > > >
> > > >On Mon, Sep 19, 2016 at 7:47 AM, Sourav Mazumder <
> > > >sourav.mazumde...@gmail.com> wrote:
> > > >
> > > >> To Moon's point, this is my vision for this feature:
> > > >>
> > > >> 1. Users should be able to package one, several, or all of the
> > > >> paragraphs in a notebook into a jar file that can be used with
> > > >> spark-submit.
> > > >>
> > > >> 2. The tool should automatically remove all the interactive statements
> > > >> like print, show, etc.
> > > >>
> > > >> 3. The tool should automatically create a Main class in addition to the
> > > >> jar file(s), which will internally call the respective jar. Users can
> > > >> then change this Main class if needed for parameterization through args.
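
A rough sketch of points 1 to 3 above, written in Python only for brevity. It is not the tool discussed in this thread and does not reflect its implementation. It assumes a note.json whose top-level "paragraphs" list holds entries with a "text" field that starts with an interpreter directive such as %spark. The function names, the regex for "interactive" statements, and the generated object name NoteApp are all hypothetical.

```python
import json
import re

# Assumption: what counts as an "interactive" statement. Only whole lines that
# are a println(...) call or end in a .show(...) call are dropped.
INTERACTIVE = re.compile(r"^\s*(println\(.*\)|.*\.show\(.*\))\s*$")


def spark_paragraphs(note):
    """Yield the Scala code of paragraphs explicitly bound to %spark."""
    for paragraph in note.get("paragraphs", []):
        parts = (paragraph.get("text") or "").split(None, 1)
        # %spark.sql, %md, etc. are skipped; only plain %spark is kept.
        if len(parts) == 2 and parts[0] == "%spark":
            yield parts[1]


def to_scala_app(note, object_name="NoteApp"):
    """Return Scala source for a Main-style object wrapping the Spark paragraphs."""
    body = []
    for code in spark_paragraphs(note):
        body.extend(line for line in code.splitlines()
                    if not INTERACTIVE.match(line))
    indented = "\n".join("    " + line if line.strip() else "" for line in body)
    return (
        "import org.apache.spark.sql.SparkSession\n\n"
        f"object {object_name} {{\n"
        "  // args is available for parameterization\n"
        "  def main(args: Array[String]): Unit = {\n"
        "    val spark = SparkSession.builder().getOrCreate()\n"
        "    val sc = spark.sparkContext\n"
        f"{indented}\n"
        "    spark.stop()\n"
        "  }\n"
        "}\n"
    )


def export_note(path, object_name="NoteApp"):
    """Read a note.json file from disk and return the generated Scala source."""
    with open(path, encoding="utf-8") as handle:
        return to_scala_app(json.load(handle), object_name)
```

The generated source would still have to be compiled and packaged into a jar before it could be handed to spark-submit. The line-based filter is crude: it misses multi-line statements and would also drop a line that assigns the result of a show call.
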
> > > >>
> > > >> Regards,
> > > >> Sourav
> > > >>
> > > >> On Mon, Sep 19, 2016 at 7:33 AM, Sourav Mazumder <
> > > >> sourav.mazumde...@gmail.com> wrote:
> > > >>
> > > >> > I am also very much for this.
> > > >> >
> > > >> > I have received similar requests from every person and group to whom
> > > >> > I have showcased Zeppelin.
> > > >> >
> > > >> > Regards,
> > > >> > Sourav
> > > >> >
> > > >> > On Fri, Sep 16, 2016 at 8:06 PM, moon soo Lee <m...@apache.org> wrote:
> > > >> >
> > > >> >> Hi Luciano,
> > > >> >>
> > > >> >> I've also gotten a lot of questions about "productizing the notebook"
> > > >> >> every time I meet users who use Zeppelin in their work.
> > > >> >>
> > > >> >> I think it's actually about two different problems that Zeppelin
> > > >> >> needs to address.
> > > >> >>
> > > >> >> *1) Provide a way for an interactive notebook to become part of a
> > > >> >> production data pipeline.*
> > > >> >>
> > > >> >> Although Zeppelin does have a quite convenient cron-like scheduler
> > > >> >> for each Note, the built-in cron scheduler is not ready for serious
> > > >> >> use in production, because it lacks features like actions after
> > > >> >> success/failure, fault tolerance, history, and so on. I think the
> > > >> >> community is working on improving it, and it's going to take some
> > > >> >> time.
> > > >> >> Meanwhile, any external enterprise-level job scheduler can run a Note
> > > >> >> or Paragraph via the REST API. But we don't have any guide or examples
> > > >> >> for it: which REST APIs users can use for this purpose, and how to use
> > > >> >> them in various cases (e.g. with authentication on, dynamic form
> > > >> >> parameters, etc.). I think a lot of things need to be improved to make
> > > >> >> it easier for Zeppelin to be part of a production pipeline.
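
As a starting point for such a guide, here is a minimal sketch of how an external scheduler might trigger a run over REST. It assumes the `/api/notebook/job/[noteId]` and `/api/notebook/job/[noteId]/[paragraphId]` endpoints and the `{"params": ...}` body for dynamic forms, as described in the Zeppelin REST API documentation; please verify them against your Zeppelin version. The helper names, host, and ids are hypothetical, and authentication is not handled.

```python
import json
import urllib.request


def build_run_request(base_url, note_id, paragraph_id=None, params=None):
    """Build (but do not send) a POST request that runs a note or one paragraph."""
    path = f"/api/notebook/job/{note_id}"
    if paragraph_id is not None:
        # Assumption: appending a paragraph id runs only that paragraph.
        path += f"/{paragraph_id}"
    body = None
    headers = {}
    if params:
        # Assumption: dynamic form values are passed under a "params" key.
        body = json.dumps({"params": params}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=body, headers=headers, method="POST"
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A scheduler job could then call something like `send(build_run_request("http://localhost:8080", "<noteId>"))` and inspect the returned JSON to decide whether the run was accepted.
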
> > > >> >>
> > > >> >> *2) Provide a stable way to run Spark paragraphs.*
> > > >> >>
> > > >> >> Another barrier to using a notebook in a production pipeline is the
> > > >> >> Scala REPL in SparkInterpreter. SparkInterpreter uses the Scala REPL
> > > >> >> to provide an interactive Scala session, and the Scala REPL will
> > > >> >> eventually hit an OOME as it compiles and runs statements. The current
> > > >> >> workaround in Zeppelin is that the cron scheduler inside the notebook
> > > >> >> has a checkbox that can restart the Note after the scheduler runs it.
> > > >> >> Of course, that option does not apply when an external scheduler runs
> > > >> >> the job through the REST API.
> > > >> >>
> > > >> >> I think what Luciano is suggesting, "Export Spark Paragraph as Spark
> > > >> >> application", is interesting. If Spark paragraphs can be easily
> > > >> >> packaged into a jar (Spark application), that can be one way to
> > > >> >> address 1) and 2), in case the user already has a stable way to
> > > >> >> schedule a Spark application jar.
> > > >> >>
> > > >> >> Actually, the Flink interactive shell works in a similar way
> > > >> >> internally, as far as I know, i.e. it packages compiled classes into
> > > >> >> a jar and submits it.
> > > >> >>
> > > >> >> One idea for prototyping:
> > > >> >> how about making an interpreter inside the Spark interpreter group,
> > > >> >> say %spark.build or some better name?
> > > >> >>
> > > >> >> And if the user runs a command like
> > > >> >>
> > > >> >> %spark.build
> > > >> >> package
> > > >> >>
> > > >> >> then it builds a Spark application jar based on the Spark paragraphs
> > > >> >> in the Note. I think it could be the simplest user interface for the
> > > >> >> prototype.
> > > >> >>
> > > >> >> Thanks,
> > > >> >> moon
> > > >> >>
> > > >> >> On Fri, Sep 16, 2016 at 1:11 PM Jeremy Anderson <
> > > >> >> jer...@objectadjective.com>
> > > >> >> wrote:
> > > >> >>
> > > >> >> > Luciano, I think this would be a terrific feature. I've heard the
> > > >> >> > exact same workflow you've described in all of the research we've
> > > >> >> > done.
> > > >> >> >
> > > >> >> > ...........................
> > > >> >> >
> > > >> >> > Jeremy Anderson
> > > >> >> > Founder, Object Adjective
> > > >> >> > 415.493.8489
> > > >> >> > jer...@objectadjective.com
> > > >> >> > objectadjective.com <http://about.me/jeremyanderson>
> > > >> >> >
> > > >> >> >
> > > >> >> >
> > > >> >> > This email and any files transmitted with it are confidential and
> > > >> >> > intended solely for the use of the individual or entity to whom
> > > >> >> > they are addressed.
> > > >> >> >
> > > >> >> > On 16 September 2016 at 12:19, Luciano Resende <luckbr1...@gmail.com> wrote:
> > > >> >> >
> > > >> >> > > While talking with a few different users, I have very often seen
> > > >> >> > > the use case of iterative development in notebooks or the Spark
> > > >> >> > > shell, followed by copying and pasting the final solution into a
> > > >> >> > > formal application.
> > > >> >> > >
> > > >> >> > > I was wondering if an "Export Spark Paragraphs as a Spark
> > > >> >> > > Application (jar)" feature is something the Zeppelin community
> > > >> >> > > would find useful. Keep in mind there are some limitations here:
> > > >> >> > > we would be constrained to Spark-related paragraphs, etc. Even so,
> > > >> >> > > I think there are multiple scenarios where the ability to have an
> > > >> >> > > application that runs directly on Spark would be very useful.
> > > >> >> > >
> > > >> >> > > If the community is interested, let's use this thread to discuss
> > > >> >> > > any specific requirements or suggestions that others might have,
> > > >> >> > > and after a few days I would like to start prototyping this
> > > >> >> > > functionality.
> > > >> >> > >
> > > >> >> > > Thoughts ?
> > > >> >> > >
> > > >> >> > >
> > > >> >> > >
> > > >> >> > > --
> > > >> >> > > Luciano Resende
> > > >> >> > > http://twitter.com/lresende1975
> > > >> >> > > http://lresende.blogspot.com/
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >> >
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > >--
> > > >Luciano Resende
> > > >http://twitter.com/lresende1975
> > > >http://lresende.blogspot.com/
> > >
> > >
> >
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
> >
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>
>
