Hi Pranav,

I think we need to consider the Scala compiler and the SparkContext
separately. If a Scala compiler is dedicated to each notebook, running
paragraphs in different notebooks in parallel will not be a problem, even
if the SparkContext is not dedicated to a notebook: SparkContext is
already thread safe and has a fair scheduler inside.

So I think a dedicated Scala compiler per notebook, with a shared
SparkContext (we can still use the fair scheduler), would help.
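
The idea above can be sketched with plain JVM executors. This is a
minimal, hypothetical illustration (none of these names are Zeppelin's
actual scheduler classes) of why a per-notebook FIFO scheduler keeps
paragraph order inside a notebook while separate notebooks still run in
parallel against a shared backend:

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, Executors, TimeUnit}

// Hypothetical sketch, not Zeppelin's real implementation: one
// single-threaded executor per notebook gives FIFO ordering of
// paragraphs within that notebook, while different notebooks can
// still run concurrently.
object FifoPerNotebookSketch {
  def main(args: Array[String]): Unit = {
    val log = new ConcurrentLinkedQueue[String]()

    // One single-threaded executor per notebook.
    val notebookA = Executors.newSingleThreadExecutor()
    val notebookB = Executors.newSingleThreadExecutor()

    // Paragraph 1 of notebook A defines `a`; paragraph 2 uses it,
    // so the two must run in order.
    notebookA.submit(new Runnable { def run(): Unit = log.add("A: val a = 1") })
    notebookA.submit(new Runnable { def run(): Unit = log.add("A: print(a)") })
    // Notebook B's paragraph may run concurrently with notebook A.
    notebookB.submit(new Runnable { def run(): Unit = log.add("B: job") })

    notebookA.shutdown(); notebookB.shutdown()
    notebookA.awaitTermination(5, TimeUnit.SECONDS)
    notebookB.awaitTermination(5, TimeUnit.SECONDS)

    // Order within notebook A is preserved regardless of what B did.
    val aEvents = log.toArray.map(_.toString).filter(_.startsWith("A"))
    assert(aEvents.sameElements(Array("A: val a = 1", "A: print(a)")))
    println("paragraph order within a notebook is preserved")
  }
}
```

The same reasoning carries over to a shared SparkContext: jobs submitted
from different notebook threads are isolated by Spark's scheduler, and
setting the scheduler mode to FAIR keeps one notebook's long job from
starving the others.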

Thanks,
moon

On Thu, Jul 30, 2015 at 8:53 PM Pranav Kumar Agarwal <[email protected]>
wrote:

> Hi Moon,
>
> How about tracking a dedicated SparkContext for each notebook in Spark's
> remote interpreter? This would allow multiple users to run their spark
> paragraphs in parallel. Also, within a notebook only one paragraph is
> executed at a time.
>
> Regards,
> -Pranav.
>
>
> On 15/07/15 7:15 pm, moon soo Lee wrote:
> > Hi,
> >
> > Thanks for asking question.
> >
> > The reason is simply that it is running code statements. Statements
> > can have ordering and dependencies. Imagine I have two paragraphs:
> >
> > %spark
> > val a = 1
> >
> > %spark
> > print(a)
> >
> > If they're not run one by one, they may run in random order and the
> > output will differ from run to run: either '1' or 'val a cannot be
> > found'.
> >
> > This is the reason why. But if there is a nice idea for handling this
> > problem, I agree that using a parallel scheduler would help a lot.
> >
> > Thanks,
> > moon
> > On Tue, Jul 14, 2015 at 7:59 PM linxi zeng
> > <[email protected]> wrote:
> >
> >     Anyone who has the same question as me? Or is this not a
> >     question?
> >
> >     2015-07-14 11:47 GMT+08:00 linxi zeng <[email protected]>:
> >
> >         hi, Moon:
> >            I notice that the getScheduler function in
> >         SparkInterpreter.java returns a FIFOScheduler, which makes the
> >         spark interpreter run spark jobs one by one. It's not a good
> >         experience when a couple of users do some work on zeppelin at
> >         the same time, because they have to wait for each other.
> >         Meanwhile, SparkSqlInterpreter can choose which scheduler to
> >         use via "zeppelin.spark.concurrentSQL".
> >         My question is: what considerations is this decision based on?
> >
> >
>
>
