We may introduce a paragraph dependency system to remove this limitation.

Indeed, in the InterpreterContext object we could introduce dependency links
between paragraphs so that we can guarantee a correct execution order.
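As a rough sketch of the idea, a scheduler could run independent paragraphs in parallel while making a paragraph wait for the paragraphs it declares as dependencies. The names below (Paragraph, dependsOn) are purely illustrative, not Zeppelin's actual API:

```java
import java.util.*;
import java.util.concurrent.*;

public class DependencyScheduler {
    static class Paragraph {
        final String id;
        final List<String> dependsOn; // paragraph IDs that must finish first
        final Runnable body;
        Paragraph(String id, List<String> dependsOn, Runnable body) {
            this.id = id; this.dependsOn = dependsOn; this.body = body;
        }
    }

    public static void runAll(List<Paragraph> paragraphs) throws InterruptedException {
        Map<String, CountDownLatch> done = new HashMap<>();
        for (Paragraph p : paragraphs) done.put(p.id, new CountDownLatch(1));
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (Paragraph p : paragraphs) {
            pool.submit(() -> {
                try {
                    // Block until every declared dependency has completed.
                    for (String dep : p.dependsOn) done.get(dep).await();
                    p.body.run();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.get(p.id).countDown();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        // p2 depends on p1 (like "print(a)" needing "val a = 1"); p3 is independent.
        runAll(Arrays.asList(
            new Paragraph("p1", Collections.emptyList(), () -> order.add("p1")),
            new Paragraph("p2", Collections.singletonList("p1"), () -> order.add("p2")),
            new Paragraph("p3", Collections.emptyList(), () -> order.add("p3"))));
        System.out.println(order.indexOf("p1") < order.indexOf("p2")); // prints "true"
    }
}
```

With such links, only the paragraphs that truly depend on each other are serialized; everything else can run concurrently.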

It would imply updating the AngularJS code to add a config section for
paragraph dependencies.

After the refactoring of ZeppelinContext, I might do a POC to see how
easily it can be done.

After that, all interpreters could leverage the ParallelScheduler.
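The difference between the two policies discussed in this thread can be sketched with plain executors (an illustrative model only, not Zeppelin's actual scheduler implementation): a FIFOScheduler behaves like a single-thread executor, while a ParallelScheduler behaves like a bounded thread pool, which is what "zeppelin.spark.concurrentSQL" effectively enables for SparkSqlInterpreter.

```java
import java.util.concurrent.*;

public class SchedulerDemo {
    public static void main(String[] args) throws Exception {
        // FIFO-style: a single worker thread, so every job runs one by one
        // and user B's paragraph always waits for user A's.
        ExecutorService fifo = Executors.newSingleThreadExecutor();
        Future<String> f1 = fifo.submit(() -> Thread.currentThread().getName());
        Future<String> f2 = fifo.submit(() -> Thread.currentThread().getName());
        System.out.println("fifo serial: " + f1.get().equals(f2.get()));

        // Parallel-style: a bounded pool; two jobs can be in flight at once.
        // Both jobs wait at a barrier, which only releases if they overlap.
        ExecutorService parallel = Executors.newFixedThreadPool(2);
        CyclicBarrier both = new CyclicBarrier(2);
        Callable<Boolean> meet = () -> { both.await(5, TimeUnit.SECONDS); return true; };
        Future<Boolean> p1 = parallel.submit(meet);
        Future<Boolean> p2 = parallel.submit(meet);
        System.out.println("parallel concurrent: " + (p1.get() && p2.get()));

        fifo.shutdown();
        parallel.shutdown();
    }
}
```

Submitting the same barrier jobs to the FIFO executor would deadlock until the timeout, which is exactly why unordered statements need either strict FIFO or explicit dependencies before going parallel.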

On Wed, Jul 15, 2015 at 3:45 PM, moon soo Lee <m...@apache.org> wrote:

> Hi,
>
> Thanks for asking question.
>
> The reason is simply that it is running code statements. Statements can
> have an order and dependencies. Imagine I have two paragraphs:
>
> %spark
> val a = 1
>
> %spark
> print(a)
>
> If they're not run one by one, they may run in random order and the
> output will differ from run to run: either '1' or a "value a not found"
> error.
>
> This is the reason why. But if there is a nice idea to handle this
> problem, I agree that using a parallel scheduler would help a lot.
>
> Thanks,
> moon
>
> On Tue, Jul 14, 2015 at 7:59 PM, linxi zeng <linxizeng0...@gmail.com> wrote:
>
>> Anyone who has the same question as me? Or is this not a question?
>>
>> 2015-07-14 11:47 GMT+08:00 linxi zeng <linxizeng0...@gmail.com>:
>>
>>> Hi, Moon:
>>>    I notice that the getScheduler function in SparkInterpreter.java
>>> returns a FIFOScheduler, which makes the Spark interpreter run Spark
>>> jobs one by one. That's not a good experience when a couple of users
>>> work on Zeppelin at the same time, because they have to wait for each
>>> other. Meanwhile, SparkSqlInterpreter can choose which scheduler to
>>> use via "zeppelin.spark.concurrentSQL".
>>> My question is: what considerations is this decision based on?
>>>
>>
>>
