Re: Other paragraphs do not wait for %sh paragraphs to finish.
"Depends on previous paragraph" could be the default behavior if no deps are specified. Specifying dependencies explicitly could benefit performance, e.g. in the Spark tutorial note, the 3 SQL paragraphs could run at the same time, independently.

On Fri, Apr 7, 2017 at 1:09 AM, Ruslan Dautkhanov wrote:
> Apart from introducing a full-blown DAG of dependencies, a simpler
> solution might be introducing a paragraph-level property "depends on
> previous paragraph" (boolean), so that in a run-all-paragraphs run, this
> particular paragraph wouldn't be scheduled until the previous one is
> complete (without errors).
>
> It would be a compromise between a completely sequential run and having
> a way to define a DAG.
Re: Other paragraphs do not wait for %sh paragraphs to finish.
Apart from introducing a full-blown DAG of dependencies, a simpler solution might be introducing a paragraph-level property "depends on previous paragraph" (boolean), so that in a run-all-paragraphs run, this particular paragraph wouldn't be scheduled until the previous one is complete (without errors).

It would be a compromise between a completely sequential run and having a way to define a DAG.

--
Ruslan Dautkhanov

On Thu, Apr 6, 2017 at 1:32 AM, Jeff Zhang wrote:
> That's correct, it needs a dependency definition between paragraphs, e.g.
> %spark(deps=p1), so that we can build a DAG for the whole pipeline.
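The "depends on previous paragraph" flag proposed above could be modelled roughly as follows. This is only a sketch under stated assumptions: the function name `batches_from_flags` and the `(name, depends_on_previous)` tuple shape are hypothetical and not part of Zeppelin. It also takes a conservative reading of the proposal: a flagged paragraph waits for the whole preceding batch, and unflagged paragraphs join the batch that is currently open.

```python
def batches_from_flags(paragraphs):
    """Group paragraphs into batches that run one after another.

    paragraphs: list of (name, depends_on_previous) tuples in note order.
    This shape is an assumption for illustration, not a Zeppelin structure.
    """
    batches = []
    for name, depends_on_previous in paragraphs:
        if depends_on_previous or not batches:
            # A flagged paragraph (or the very first one) opens a new batch,
            # so it only starts once the preceding batch has completed.
            batches.append([name])
        else:
            # An unflagged paragraph may run alongside the current batch.
            batches[-1].append(name)
    return batches


note = [
    ("sh_copy", False),
    ("spark_load", True),   # must wait for the scp to finish
    ("sql_a", True),        # must wait for the load
    ("sql_b", False),       # independent of sql_a
    ("sql_c", False),       # independent of sql_a
]
plan = batches_from_flags(note)
```

With every flag set to true this degenerates into a fully sequential run, which matches the "compromise" described in the message.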
Re: Other paragraphs do not wait for %sh paragraphs to finish.
Filed https://issues.apache.org/jira/browse/ZEPPELIN-2368

We had users asking for the same. It forced them to run paragraphs one by one manually.

--
Ruslan Dautkhanov

On Wed, Apr 5, 2017 at 4:57 PM, moon soo Lee wrote:
> Hi,
>
> That's expected behavior at the moment. The reason is that each
> interpreter has its own scheduler (either FIFO or Parallel), and run-all
> just submits all paragraphs into the target interpreter's scheduler.
>
> I think we can add a feature such as run-all-sequentially.
> Would you mind filing a JIRA issue?
>
> Thanks,
> moon
Re: Other paragraphs do not wait for %sh paragraphs to finish.
That's correct, it needs a dependency definition between paragraphs, e.g. %spark(deps=p1), so that we can build a DAG for the whole pipeline.

On Thu, Apr 6, 2017 at 3:28 PM, Rick Moritz wrote:
> This actually calls for a dependency definition of notes within a
> notebook, so the scheduler can decide which tasks to run simultaneously.
> I suggest a simple counter of dependency levels, which by default
> increases with every new note and can be decremented to allow notes to
> run simultaneously. Run-all then submits each level into the target
> interpreters for this level, awaits termination of all results, and then
> starts the next level's notes.
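To make the DAG idea concrete, here is a minimal sketch of how headers such as `%spark(deps=p1)` could be turned into an execution order. The `deps=` syntax is only the proposal from this message, not an existing Zeppelin feature, and the helper names `parse_deps` and `execution_waves` are hypothetical. The ordering uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+).

```python
import re
from graphlib import TopologicalSorter

# Matches "%interpreter" optionally followed by "(deps=p1, p2)".
# The deps=... syntax is the proposal from the thread, not a real Zeppelin option.
HEADER = re.compile(r"^%(\w+)(?:\(deps=([\w,\s]+)\))?")


def parse_deps(header):
    """Return the paragraph ids listed in a hypothetical deps=... header."""
    match = HEADER.match(header)
    if match is None or match.group(2) is None:
        return []
    return [dep.strip() for dep in match.group(2).split(",") if dep.strip()]


def execution_waves(headers):
    """Return lists of paragraph ids; each list can run concurrently.

    headers: dict mapping a paragraph id to its first line.
    """
    sorter = TopologicalSorter({pid: parse_deps(h) for pid, h in headers.items()})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished to unlock the next
    return waves


headers = {
    "p1": "%sh",
    "p2": "%spark(deps=p1)",
    "p3": "%sql(deps=p2)",
    "p4": "%sql(deps=p2)",
}
waves = execution_waves(headers)
```

Here `p3` and `p4` land in the same wave, so a scheduler built on this ordering could run them at the same time once `p2` has finished.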
Re: Other paragraphs do not wait for %sh paragraphs to finish.
This actually calls for a dependency definition of notes within a notebook, so the scheduler can decide which tasks to run simultaneously. I suggest a simple counter of dependency levels, which by default increases with every new note and can be decremented to allow notes to run simultaneously. Run-all then submits each level into the target interpreters for this level, awaits termination of all results, and then starts the next level's notes.

On Thu, Apr 6, 2017 at 12:57 AM, moon soo Lee wrote:
> Hi,
>
> That's expected behavior at the moment. The reason is that each
> interpreter has its own scheduler (either FIFO or Parallel), and run-all
> just submits all paragraphs into the target interpreter's scheduler.
>
> I think we can add a feature such as run-all-sequentially.
> Would you mind filing a JIRA issue?
>
> Thanks,
> moon
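The level-counter idea could look roughly like the sketch below, treating each "note" in the message as a paragraph. Everything here is an assumption for illustration: `run_all_by_level` and the `(level, name, callable)` tuple shape are hypothetical, and a plain thread pool stands in for Zeppelin's per-interpreter schedulers.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def run_all_by_level(paragraphs):
    """Run paragraphs level by level; paragraphs sharing a level run together.

    paragraphs: list of (level, name, callable) tuples (hypothetical shape).
    Returns (level, name) pairs in the order their completion was awaited.
    """
    finished = []
    ordered = sorted(paragraphs, key=lambda p: p[0])  # stable: keeps note order
    with ThreadPoolExecutor() as pool:
        for level, group in groupby(ordered, key=lambda p: p[0]):
            # Submit the whole level at once ...
            futures = [(name, pool.submit(fn)) for _, name, fn in group]
            # ... then wait for all of it before starting the next level.
            for name, future in futures:
                future.result()  # re-raises if the paragraph failed
                finished.append((level, name))
    return finished


log = []
paragraphs = [
    (0, "sh_copy", lambda: log.append("sh_copy")),
    (1, "sql_a", lambda: log.append("sql_a")),
    (1, "sql_b", lambda: log.append("sql_b")),  # same level as sql_a
]
result = run_all_by_level(paragraphs)
```

Because `future.result()` re-raises exceptions, a failing paragraph stops the run before the next level is submitted.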
Other paragraphs do not wait for %sh paragraphs to finish.
I often have notebooks that have a %sh paragraph as the 1st paragraph. It scps a file from another server, and a number of Spark or Spark SQL paragraphs follow it.

If I click run-all-paragraphs at the top of the notebook, the 1st %sh paragraph kicks off as expected, but the 2nd paragraph (%spark) starts at the same time. The others go into pending state and then start once the Spark one has completed.

Is this a bug? Or am I doing something wrong?

Thanks