Re: Implementing run all paragraphs sequentially

2017-10-06 Thread Jianfeng (Jeff) Zhang

Since almost everyone agrees on running serially by default, we could implement that first. Regarding the parallel mode, we could leave it for the future, although personally I would prefer to define a DAG for the note.


Best Regards,
Jeff Zhang


From: Michael Segel
Reply-To: "users@zeppelin.apache.org"
Date: Friday, October 6, 2017 at 10:08 PM
To: "users@zeppelin.apache.org"
Subject: Re: Implementing run all paragraphs sequentially

Guys…

1) You’re posting this to the user list… Isn’t this a dev question?

2) +1 on the serial run… but doesn’t that already exist with the “run all 
paragraphs” button?

3) -1 on a ‘run all in parallel’ button.  (It’s like putting lipstick on a pig.)

Are you really going to run all of the paragraphs in parallel?  You’re not 
going to have a paragraph that is used to set things up? Import external 
libraries?  Define classes/functions for future paragraphs to use?

IMHO I would much rather see a DAG where each paragraph can set its 
dependency… (this isn’t the right term. I’m trying to think back to how it was 
described in NeXTStep Objective-C code.)
Then you could set your parallel button to run in parallel, but if your 
paragraph is dependent on another, it’s blocked from executing until its 
predecessor completes.

But that’s just my $0.02
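To make that concrete, here is a minimal sketch of dependency-driven paragraph scheduling, under stated assumptions: the Paragraph class, its depends_on field, and run_paragraph are hypothetical illustrations, not Zeppelin APIs. A paragraph becomes ready only once all of its predecessors have finished, so a parallel run only overlaps paragraphs that are independent of each other.

# Hypothetical sketch of DAG-based paragraph scheduling (not Zeppelin code).
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # Python 3.9+


@dataclass
class Paragraph:
    paragraph_id: str
    code: str
    depends_on: list = field(default_factory=list)  # ids of prerequisite paragraphs


def run_with_dependencies(paragraphs, run_paragraph):
    """Run each paragraph only after all of its declared predecessors have completed."""
    by_id = {p.paragraph_id: p for p in paragraphs}
    sorter = TopologicalSorter({p.paragraph_id: set(p.depends_on) for p in paragraphs})
    sorter.prepare()
    while sorter.is_active():
        for pid in sorter.get_ready():   # every paragraph in this batch could run in parallel
            run_paragraph(by_id[pid])    # shown serially here; a thread pool could overlap them
            sorter.done(pid)


# Example: one setup paragraph, then two independent test paragraphs.
notebook = [
    Paragraph("setup", "import libraries, define classes"),
    Paragraph("test_a", "run test A", depends_on=["setup"]),
    Paragraph("test_b", "run test B", depends_on=["setup"]),
]
run_with_dependencies(notebook, lambda p: print("running", p.paragraph_id))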

On Oct 6, 2017, at 2:25 AM, Polyakov Valeriy wrote:

Thank you all for sharing the problem. Naman Mishra has started the 
implementation of serial run in [1], so I propose we come back to the 
discussion of the next step (both Parallel and Serial run buttons) after [1] 
is resolved.

[1] https://issues.apache.org/jira/browse/ZEPPELIN-2368


Valeriy Polyakov

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, October 06, 2017 10:14 AM
To: users@zeppelin.apache.org
Subject: Re: Implementing run all paragraphs sequentially


+1 for serial run by default.  Let's leave the others for the future.

Mohit Jaggi wrote on Friday, October 6, 2017 at 7:48 AM:
+1 for serial run by default.

Sent from my iPhone

On Oct 5, 2017, at 3:36 PM, moon soo Lee wrote:
I'd like us to also consider simplicity of use.

We can have two different modes, or two different run buttons, for Serial and 
Parallel run. This gives the flexibility of choosing between two different 
schedulers as a benefit, but to make users understand the difference between 
the two run buttons, there must be really good UI treatment.

I see there is high user demand for running a notebook sequentially. And I think 
there are 3 action items in this discussion thread.

1. Change the current run-all button behavior from Parallel to Serial
2. Provide both Parallel and Serial run buttons with really good UI treatment.
3. Provide a DAG

I think 1) does not stop 2) and 3) in the future. 2) also does not stop 3) in 
the future.

So, why don't we try 1) first and keep discussing and polishing the ideas for 2) and 3)?


Thanks,
moon

On Mon, Oct 2, 2017 at 10:22 AM Michael Segel wrote:
Whoa!
Seems I walked in to something.

Herval,

What do you suggest?  A simple switch that runs everything in serial, or 
everything in parallel?
That would be a very bad idea.

I gave you an example of a class of solutions where you don’t want that 
behavior.
E.g. unit testing, where you have one setup and then run several unit tests in 
parallel.

If that’s not enough for you… how about if you want to test producer/consumer 
problems?

Or if you want to define classes in one paragraph but then call on them in 
later paragraphs. If everything runs in parallel from the start of time 0, you 
can’t do this.


So, if you want to do it right the first time… you need to establish a way to 
control the dependency of paragraphs. This isn’t rocket science.
And frankly not that complex.

BTW, this is the user list not the dev list…

Just saying…  ;-)


On Oct 2, 2017, at 11:24 AM, Herval Freire wrote:

 "nice to have" isn't a very strong requirement. I strongly uggest you really, 
really think about this before you start pounding an overengineered solution to 
a non-issue :-)

h

On Mon, Oct 2, 2017 at 9:12 AM, Michael Segel wrote:
Yes…
You have a bunch of unit tests you can run in parallel where you only need one 
constructor and one cleanup.

I would strongly suggest that you really, really think about this long and hard 
before you start to pound code.
It’s going to be harder to back out and fix than if you take the time to think 
through the problem and not make a dumb mistake.

On Oct 2, 2017, at 

Re: pyspark run a specific paragraph

2017-10-06 Thread tbuenger
PyZeppelinContext has an additional member "z" that holds the py4j wrapper of
the SparkZeppelinContext.
At least on the latest 0.8 Zeppelin version, z.z.run(...) works perfectly fine in
a pyspark paragraph.
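For example, inside a %pyspark paragraph (the paragraph id below is only a placeholder; substitute a real id from your own note):

%pyspark
# z is the PyZeppelinContext; z.z is the py4j handle to the underlying SparkZeppelinContext
print(type(z))     # PyZeppelinContext
print(type(z.z))   # py4j JavaObject wrapping SparkZeppelinContext
z.z.run("20171006-120000_1234567890")  # placeholder paragraph id: runs that paragraph of the note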



--
Sent from: 
http://apache-zeppelin-users-incubating-mailing-list.75479.x6.nabble.com/


Re: Trying to get 0.7.3 running with Spark

2017-10-06 Thread Michael Segel
I know.

If it’s not working… what do you see in the spark interpreter?

Do you see %spark, %spark.sql  or do you see just %sql ?

I’m sorry, I fuzzed with it and don’t remember what I did to change it.

On Oct 6, 2017, at 9:47 AM, Terry Healy wrote:


Hi Michael-

I'm not sure what the values are supposed to be in order to compare. The 
interpreter is running; a save and restart still gives the same result.


On 10/06/2017 10:41 AM, Michael Segel wrote:
What do you see when you check out the spark interpreter?  Something with 
%spark, or %spark.sql   (Sorry, going from memory. ) I think it may also have 
to do with not having the spark interpreter running, so if you manually restart 
the interpreter then re-run the notebook… it should work…

HTH


On Oct 6, 2017, at 9:35 AM, Terry Healy wrote:

Using Zeppelin 0.7.3, Spark 2.1.0-mapr-1703 / Scala 2.11.8

I had previously run the demo and successfully set up MongoDB and the JDBC 
interpreter for Impala under v0.7.2. Since I upgraded to 0.7.3, everything 
broke. I am down to a complete re-install (several, in fact) and get a 
response like the one below for most everything I try. (Focusing just on %spark for 
now.) I apparently have something very basic wrong, but I'll be damned if I can 
find it. The same example works fine in spark-shell.

Any suggestions for a new guy very much appreciated.

I found [ZEPPELIN-2475] and [ZEPPELIN-1560], which seem to be the same, or 
similar, but I did not understand what to change where.

This is from "Zeppelin Tutorial/Basic Features (Spark)".

java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at 
org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:398)
at 
org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:387)
at 
org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:843)
at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)





Re: Trying to get 0.7.3 running with Spark

2017-10-06 Thread Terry Healy

Hi Michael-

I'm not sure what the values are supposed to be in order to compare. The 
interpreter is running; a save and restart still gives the same result.




On 10/06/2017 10:41 AM, Michael Segel wrote:
What do you see when you check out the spark interpreter?  Something 
with %spark, or %spark.sql   (Sorry, going from memory. ) I think it 
may also have to do with not having the spark interpreter running, so 
if you manually restart the interpreter then re-run the notebook… it 
should work…


HTH


On Oct 6, 2017, at 9:35 AM, Terry Healy wrote:


Using Zeppelin 0.7.3, Spark 2.1.0-mapr-1703 / Scala 2.11.8

I had previously run the demo and successfully set up MongoDB and 
the JDBC interpreter for Impala under v0.7.2. Since I upgraded to 
0.7.3, everything broke. I am down to a complete re-install 
(several, in fact) and get a response like the one below for most everything 
I try. (Focusing just on %spark for now.) I apparently have something 
very basic wrong, but I'll be damned if I can find it. The same 
example works fine in spark-shell.


Any suggestions for a new guy very much appreciated.

I found [ZEPPELIN-2475] and [ZEPPELIN-1560], which seem to be the same, or 
similar, but I did not understand what to change where.



This is from "Zeppelin Tutorial/Basic Features (Spark)".

java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at 
org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:398)
at 
org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:387)
at 
org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at 
org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:843)
at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)

at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at 
org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)






Re: Trying to get 0.7.3 running with Spark

2017-10-06 Thread Michael Segel
What do you see when you check out the spark interpreter?  Something with 
%spark, or %spark.sql   (Sorry, going from memory. ) I think it may also have 
to do with not having the spark interpreter running, so if you manually restart 
the interpreter then re-run the notebook… it should work…

HTH


On Oct 6, 2017, at 9:35 AM, Terry Healy wrote:

Using Zeppelin 0.7.3, Spark 2.1.0-mapr-1703 / Scala 2.11.8

I had previously run the demo and successfully set up MongoDB and the JDBC 
interpreter for Impala under v0.7.2. Since I upgraded to 0.7.3, everything 
broke. I am down to a complete re-install (several, in fact) and get a 
response like the one below for most everything I try. (Focusing just on %spark for 
now.) I apparently have something very basic wrong, but I'll be damned if I can 
find it. The same example works fine in spark-shell.

Any suggestions for a new guy very much appreciated.

I found [ZEPPELIN-2475] and [ZEPPELIN-1560], which seem to be the same, or 
similar, but I did not understand what to change where.

This is from "Zeppelin Tutorial/Basic Features (Spark)".

java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at 
org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:398)
at 
org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:387)
at 
org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:843)
at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)



Trying to get 0.7.3 running with Spark

2017-10-06 Thread Terry Healy

Using Zeppelin 0.7.3, Spark 2.1.0-mapr-1703 / Scala 2.11.8

I had previously run the demo and successfully set up MongoDB and the JDBC 
interpreter for Impala under v0.7.2. Since I upgraded to 0.7.3, 
everything broke. I am down to a complete re-install (several, in fact) 
and get a response like the one below for most everything I try. (Focusing just 
on %spark for now.) I apparently have something very basic wrong, but I'll 
be damned if I can find it. The same example works fine in spark-shell.


Any suggestions for a new guy very much appreciated.

I found [ZEPPELIN-2475] and [ZEPPELIN-1560], which seem to be the same, or 
similar, but I did not understand what to change where.



This is from "Zeppelin Tutorial/Basic Features (Spark)".

java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at 
org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:398)
at 
org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:387)
at 
org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at 
org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:843)
at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)

at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)


Re: Implementing run all paragraphs sequentially

2017-10-06 Thread Michael Segel
Guys…

1) You’re posting this to the user list… Isn’t this a dev question?

2) +1 on the serial run… but doesn’t that already exist with the “run all 
paragraphs” button?

3) -1 on a ‘run all in parallel’ button.  (It’s like putting lipstick on a pig.)

Are you really going to run all of the paragraphs in parallel?  You’re not 
going to have a paragraph that is used to set things up? Import external 
libraries?  Define classes/functions for future paragraphs to use?

IMHO I would much rather see a DAG where each paragraph can set its 
dependency… (this isn’t the right term. I’m trying to think back to how it was 
described in NeXTStep Objective-C code.)
Then you could set your parallel button to run in parallel, but if your 
paragraph is dependent on another, it’s blocked from executing until its 
predecessor completes.

But that’s just my $0.02

On Oct 6, 2017, at 2:25 AM, Polyakov Valeriy wrote:

Thank you all for sharing the problem. Naman Mishra has started the 
implementation of serial run in [1], so I propose we come back to the 
discussion of the next step (both Parallel and Serial run buttons) after [1] 
is resolved.

[1] https://issues.apache.org/jira/browse/ZEPPELIN-2368


Valeriy Polyakov

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, October 06, 2017 10:14 AM
To: users@zeppelin.apache.org
Subject: Re: Implementing run all paragraphs sequentially


+1 for serial run by default.  Let's leave the others for the future.

Mohit Jaggi wrote on Friday, October 6, 2017 at 7:48 AM:
+1 for serial run by default.

Sent from my iPhone

On Oct 5, 2017, at 3:36 PM, moon soo Lee wrote:
I'd like us to also consider simplicity of use.

We can have two different modes, or two different run buttons, for Serial and 
Parallel run. This gives the flexibility of choosing between two different 
schedulers as a benefit, but to make users understand the difference between 
the two run buttons, there must be really good UI treatment.

I see there is high user demand for running a notebook sequentially. And I think 
there are 3 action items in this discussion thread.

1. Change the current run-all button behavior from Parallel to Serial
2. Provide both Parallel and Serial run buttons with really good UI treatment.
3. Provide a DAG

I think 1) does not stop 2) and 3) in the future. 2) also does not stop 3) in 
the future.

So, why don't we try 1) first and keep discussing and polishing the ideas for 2) and 3)?


Thanks,
moon

On Mon, Oct 2, 2017 at 10:22 AM Michael Segel wrote:
Whoa!
Seems I walked in to something.

Herval,

What do you suggest?  A simple switch that runs everything in serial, or 
everything in parallel?
That would be a very bad idea.

I gave you an example of a class of solutions where you don’t want that 
behavior.
E.g. unit testing, where you have one setup and then run several unit tests in 
parallel.

If that’s not enough for you… how about if you want to test producer/consumer 
problems?

Or if you want to define classes in one paragraph but then call on them in 
later paragraphs. If everything runs in parallel from the start of time 0, you 
can’t do this.


So, if you want to do it right the first time… you need to establish a way to 
control the dependency of paragraphs. This isn’t rocket science.
And frankly not that complex.

BTW, this is the user list not the dev list…

Just saying…  ;-)


On Oct 2, 2017, at 11:24 AM, Herval Freire wrote:

 "nice to have" isn't a very strong requirement. I strongly uggest you really, 
really think about this before you start pounding an overengineered solution to 
a non-issue :-)

h

On Mon, Oct 2, 2017 at 9:12 AM, Michael Segel wrote:
Yes…
You have a bunch of unit tests you can run in parallel where you only need one 
constructor and one cleanup.

I would strongly suggest that you really, really think about this long and hard 
before you start to pound code.
It’s going to be harder to back out and fix than if you take the time to think 
through the problem and not make a dumb mistake.

On Oct 2, 2017, at 11:02 AM, Herval Freire wrote:

Did anyone request such a case ("running some in parallel and some in 
sequence")? I haven't seen any requests for this in the wild (nor on this 
thread), other than theoretical "what ifs" - which is totally fine, when it 
doesn't introduce a lot of unnecessary complexity for little to no gain (which 
seems to be the case here)

h

On Mon, Oct 2, 2017 at 8:48 AM, Michael Segel wrote:
Because that simplicity doesn’t work.

You will want to run some things serial and some things in parallel.

Which is why you will need a dependency 

Re: java.lang.NoClassDefFoundError: Could not initialize class org.apache.zeppelin.cassandra.DisplaySystem

2017-10-06 Thread DuyHai Doan
Maybe, but then it will impact ALL interpreters, not just cassandra.

We need to fully test to ensure that changing this file will not break
anything.

On Fri, Oct 6, 2017 at 10:19 AM, Patrick Brunmayr <
patrick.brunm...@kpibench.com> wrote:

> What about fixing interpreter.cmd ?
>
> There is something like this
>
> call "%bin%\functions.cmd" ADDJARINDIR "%ZEPPELIN_HOME%\zeppelin-
> interpreter\target\lib"
> call "%bin%\functions.cmd" ADDJARINDIR "%ZEPPELIN_HOME%\lib\interpreter"
> call "%bin%\functions.cmd" ADDJARINDIR "%INTERPRETER_DIR%"
>
> And in functions.cmd ADDJARINDIR is defined as
>
> :ADDJARINDIR
> if exist "%~2" (
> set ZEPPELIN_CLASSPATH="%~2\*";%ZEPPELIN_CLASSPATH%
> )
> exit /b
>
>
> This probably adds the wildcard entry. Why not switch to
>
>
> call "%bin%\functions.cmd" ADDEACHJARINDIR"%ZEPPELIN_
> HOME%\zeppelin-interpreter\target\lib"
> call "%bin%\functions.cmd" ADDEACHJARINDIR"%ZEPPELIN_
> HOME%\lib\interpreter"
> call "%bin%\functions.cmd" ADDEACHJARINDIR"%INTERPRETER_DIR%"
>
>
> ?
>
>
>
>
>
2017-10-05 14:43 GMT+02:00 DuyHai Doan:
>
>> Thank you Patrick for your in depth analysis
>>
>> It seems to come from the Scalate library itself: https://github.com/scalate/scalate/blob/master/scalate-util/src/main/scala/org/fusesource/scalate/util/ClassPathBuilder.scala#L148
>>
>> Unfortunately I have checked their GitHub and Maven Central; we're
>> already using the latest version of Scalate.
>>
>> I guess this issue only occurs on Windows platform.
>>
>> Now as a fix we have a choice between:
>>
>> 1) patch the Scalate library ourselves
>> 2) change the templating engine completely, which is kind of heavy work
>>
>> I'm open to discussion, but I don't see an easy work-around here sadly :(
>>
>>
>>
>> On Thu, Oct 5, 2017 at 1:51 PM, Patrick Brunmayr <
>> patrick.brunm...@kpibench.com> wrote:
>>
>>> So I have some more information for you
>>>
>>> It's a two-phase exception
>>>
>>> When I restart the interpreter I get this exception
>>>
>>>
>>> java.io.IOException: Invalid argument
>>> at java.io.WinNTFileSystem.canonicalize0(Native Method)
>>> at java.io.WinNTFileSystem.canonicalize(WinNTFileSystem.java:428)
>>> at java.io.File.getCanonicalPath(File.java:618)
>>> at org.fusesource.scalate.util.ClassPathBuilder$$anonfun$getCla
>>> ssPathFrom$3.apply(ClassPathBuilder.scala:147)
>>> at org.fusesource.scalate.util.ClassPathBuilder$$anonfun$getCla
>>> ssPathFrom$3.apply(ClassPathBuilder.scala:142)
>>> at scala.collection.TraversableLike$WithFilter$$anonfun$map$2.a
>>> pply(TraversableLike.scala:728)
>>> at scala.collection.immutable.List.foreach(List.scala:381)
>>> at scala.collection.TraversableLike$WithFilter.map(TraversableL
>>> ike.scala:727)
>>> at org.fusesource.scalate.util.ClassPathBuilder$.getClassPathFr
>>> om(ClassPathBuilder.scala:142)
>>> at org.fusesource.scalate.util.ClassPathBuilder.addPathFrom(Cla
>>> ssPathBuilder.scala:68)
>>> at org.fusesource.scalate.util.ClassPathBuilder.addPathFromCont
>>> extClassLoader(ClassPathBuilder.scala:73)
>>> at org.fusesource.scalate.support.ScalaCompiler.generateSetting
>>> s(ScalaCompiler.scala:121)
>>> at org.fusesource.scalate.support.ScalaCompiler.(ScalaCom
>>> piler.scala:59)
>>> at org.fusesource.scalate.support.ScalaCompiler$.create(ScalaCo
>>> mpiler.scala:42)
>>> at org.fusesource.scalate.TemplateEngine.createCompiler(Templat
>>> eEngine.scala:231)
>>> at org.fusesource.scalate.TemplateEngine.compiler$lzycompute(Te
>>> mplateEngine.scala:221)
>>> at org.fusesource.scalate.TemplateEngine.compiler(TemplateEngin
>>> e.scala:221)
>>> at org.fusesource.scalate.TemplateEngine.compileAndLoad(Templat
>>> eEngine.scala:757)
>>> at org.fusesource.scalate.TemplateEngine.compileAndLoadEntry(Te
>>> mplateEngine.scala:699)
>>> at org.fusesource.scalate.TemplateEngine.liftedTree1$1(Template
>>> Engine.scala:419)
>>> at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:413)
>>> at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:471)
>>> at org.fusesource.scalate.TemplateEngine.layout(TemplateEngine.
>>> scala:573)
>>> at org.apache.zeppelin.cassandra.DisplaySystem$NoResultDisplay$
>>> .(DisplaySystem.scala:369)
>>> at org.apache.zeppelin.cassandra.DisplaySystem$NoResultDisplay$
>>> .(DisplaySystem.scala)
>>> at org.apache.zeppelin.cassandra.EnhancedSession.(Enhance
>>> dSession.scala:40)
>>> at org.apache.zeppelin.cassandra.InterpreterLogic.(Interp
>>> reterLogic.scala:98)
>>> at org.apache.zeppelin.cassandra.CassandraInterpreter.open(Cass
>>> andraInterpreter.java:231)
>>> at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(Laz
>>> yOpenInterpreter.java:70)
>>> at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServ
>>> er$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
>>> at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.ru
>>> n(ParallelScheduler.java:162)
>>> at 

Re: java.lang.NoClassDefFoundError: Could not initialize class org.apache.zeppelin.cassandra.DisplaySystem

2017-10-06 Thread Patrick Brunmayr
What about fixing interpreter.cmd ?

There is something like this

call "%bin%\functions.cmd" ADDJARINDIR
"%ZEPPELIN_HOME%\zeppelin-interpreter\target\lib"
call "%bin%\functions.cmd" ADDJARINDIR "%ZEPPELIN_HOME%\lib\interpreter"
call "%bin%\functions.cmd" ADDJARINDIR "%INTERPRETER_DIR%"

And in functions.cmd ADDJARINDIR is defined as

:ADDJARINDIR
if exist "%~2" (
set ZEPPELIN_CLASSPATH="%~2\*";%ZEPPELIN_CLASSPATH%
)
exit /b


This probably adds the wildcard entry. Why not switch to


call "%bin%\functions.cmd"
ADDEACHJARINDIR"%ZEPPELIN_HOME%\zeppelin-interpreter\target\lib"
call "%bin%\functions.cmd" ADDEACHJARINDIR"%ZEPPELIN_HOME%\lib\interpreter"
call "%bin%\functions.cmd" ADDEACHJARINDIR"%INTERPRETER_DIR%"


?
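To illustrate the difference between the two styles, here is a sketch only (the paths are hypothetical and this is not part of functions.cmd): a wildcard entry keeps a literal '*' inside the classpath element, which is presumably what trips up code that later canonicalizes each element as a file path (the WinNTFileSystem.canonicalize frame in the trace quoted below), while enumerating the jars up front yields only real file paths.

# Sketch (hypothetical paths): wildcard classpath entry vs. per-jar entries.
import glob
import os

interpreter_dir = r"C:\zeppelin\lib\interpreter"  # hypothetical install location

# ADDJARINDIR-style: one literal wildcard element. The JVM launcher accepts it,
# but canonicalizing it as a file path later fails on Windows because of the '*'.
wildcard_entry = os.path.join(interpreter_dir, "*")

# ADDEACHJARINDIR-style: expand the directory up front so every element is a real jar path.
per_jar_entries = glob.glob(os.path.join(interpreter_dir, "*.jar"))

print(wildcard_entry)
print(os.pathsep.join(per_jar_entries))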





2017-10-05 14:43 GMT+02:00 DuyHai Doan:

> Thank you Patrick for your in depth analysis
>
> It seems to come from the Scalate library itself: https://github.com/scalate/scalate/blob/master/scalate-util/src/main/scala/org/fusesource/scalate/util/ClassPathBuilder.scala#L148
>
> Unfortunately I have checked their GitHub and Maven Central; we're
> already using the latest version of Scalate.
>
> I guess this issue only occurs on Windows platform.
>
> Now as a fix we have a choice between:
>
> 1) patch the Scalate library ourselves
> 2) change the templating engine completely, which is kind of heavy work
>
> I'm open to discussion, but I don't see an easy work-around here sadly :(
>
>
>
> On Thu, Oct 5, 2017 at 1:51 PM, Patrick Brunmayr <
> patrick.brunm...@kpibench.com> wrote:
>
>> So I have some more information for you
>>
>> It's a two-phase exception
>>
>> When I restart the interpreter I get this exception
>>
>>
>> java.io.IOException: Invalid argument
>> at java.io.WinNTFileSystem.canonicalize0(Native Method)
>> at java.io.WinNTFileSystem.canonicalize(WinNTFileSystem.java:428)
>> at java.io.File.getCanonicalPath(File.java:618)
>> at org.fusesource.scalate.util.ClassPathBuilder$$anonfun$getCla
>> ssPathFrom$3.apply(ClassPathBuilder.scala:147)
>> at org.fusesource.scalate.util.ClassPathBuilder$$anonfun$getCla
>> ssPathFrom$3.apply(ClassPathBuilder.scala:142)
>> at scala.collection.TraversableLike$WithFilter$$anonfun$map$2.
>> apply(TraversableLike.scala:728)
>> at scala.collection.immutable.List.foreach(List.scala:381)
>> at scala.collection.TraversableLike$WithFilter.map(
>> TraversableLike.scala:727)
>> at org.fusesource.scalate.util.ClassPathBuilder$.getClassPathFr
>> om(ClassPathBuilder.scala:142)
>> at org.fusesource.scalate.util.ClassPathBuilder.addPathFrom(Cla
>> ssPathBuilder.scala:68)
>> at org.fusesource.scalate.util.ClassPathBuilder.addPathFromCont
>> extClassLoader(ClassPathBuilder.scala:73)
>> at org.fusesource.scalate.support.ScalaCompiler.generateSetting
>> s(ScalaCompiler.scala:121)
>> at org.fusesource.scalate.support.ScalaCompiler.(ScalaCom
>> piler.scala:59)
>> at org.fusesource.scalate.support.ScalaCompiler$.create(ScalaCo
>> mpiler.scala:42)
>> at org.fusesource.scalate.TemplateEngine.createCompiler(Templat
>> eEngine.scala:231)
>> at org.fusesource.scalate.TemplateEngine.compiler$lzycompute(
>> TemplateEngine.scala:221)
>> at org.fusesource.scalate.TemplateEngine.compiler(TemplateEngin
>> e.scala:221)
>> at org.fusesource.scalate.TemplateEngine.compileAndLoad(Templat
>> eEngine.scala:757)
>> at org.fusesource.scalate.TemplateEngine.compileAndLoadEntry(Te
>> mplateEngine.scala:699)
>> at org.fusesource.scalate.TemplateEngine.liftedTree1$1(Template
>> Engine.scala:419)
>> at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:413)
>> at org.fusesource.scalate.TemplateEngine.load(TemplateEngine.scala:471)
>> at org.fusesource.scalate.TemplateEngine.layout(TemplateEngine.scala:573)
>> at org.apache.zeppelin.cassandra.DisplaySystem$NoResultDisplay$
>> .(DisplaySystem.scala:369)
>> at org.apache.zeppelin.cassandra.DisplaySystem$NoResultDisplay$
>> .(DisplaySystem.scala)
>> at org.apache.zeppelin.cassandra.EnhancedSession.(Enhance
>> dSession.scala:40)
>> at org.apache.zeppelin.cassandra.InterpreterLogic.(Interp
>> reterLogic.scala:98)
>> at org.apache.zeppelin.cassandra.CassandraInterpreter.open(Cass
>> andraInterpreter.java:231)
>> at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(Laz
>> yOpenInterpreter.java:70)
>> at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServ
>> er$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
>> at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
>> at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.ru
>> n(ParallelScheduler.java:162)
>> at java.util.concurrent.Executors$RunnableAdapter.call(
>> Executors.java:511)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFu
>> tureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFu
>> tureTask.run(ScheduledThreadPoolExecutor.java:293)
>> at 

RE: Implementing run all paragraphs sequentially

2017-10-06 Thread Polyakov Valeriy
Thank you all for sharing the problem. Naman Mishra has started the 
implementation of serial run in [1], so I propose we come back to the 
discussion of the next step (both Parallel and Serial run buttons) after [1] 
is resolved.

[1] https://issues.apache.org/jira/browse/ZEPPELIN-2368


Valeriy Polyakov

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, October 06, 2017 10:14 AM
To: users@zeppelin.apache.org
Subject: Re: Implementing run all paragraphs sequentially


+1 for serial run by default.  Let's leave the others for the future.

Mohit Jaggi wrote on Friday, October 6, 2017 at 7:48 AM:
+1 for serial run by default.

Sent from my iPhone

On Oct 5, 2017, at 3:36 PM, moon soo Lee wrote:
I'd like us to also consider simplicity of use.

We can have two different modes, or two different run buttons, for Serial and 
Parallel run. This gives the flexibility of choosing between two different 
schedulers as a benefit, but to make users understand the difference between 
the two run buttons, there must be really good UI treatment.

I see there is high user demand for running a notebook sequentially. And I think 
there are 3 action items in this discussion thread.

1. Change the current run-all button behavior from Parallel to Serial
2. Provide both Parallel and Serial run buttons with really good UI treatment.
3. Provide a DAG

I think 1) does not stop 2) and 3) in the future. 2) also does not stop 3) in 
the future.

So, why don't we try 1) first and keep discussing and polishing the ideas for 2) and 3)?


Thanks,
moon

On Mon, Oct 2, 2017 at 10:22 AM Michael Segel wrote:
Whoa!
Seems I walked in to something.

Herval,

What do you suggest?  A simple switch that runs everything in serial, or 
everything in parallel?
That would be a very bad idea.

I gave you an example of a class of solutions where you don’t want that 
behavior.
E.g. unit testing, where you have one setup and then run several unit tests in 
parallel.

If that’s not enough for you… how about if you want to test producer/consumer 
problems?

Or if you want to define classes in one paragraph but then call on them in 
later paragraphs. If everything runs in parallel from the start of time 0, you 
can’t do this.


So, if you want to do it right the first time… you need to establish a way to 
control the dependency of paragraphs. This isn’t rocket science.
And frankly not that complex.

BTW, this is the user list not the dev list…

Just saying…  ;-)


On Oct 2, 2017, at 11:24 AM, Herval Freire wrote:

 "nice to have" isn't a very strong requirement. I strongly uggest you really, 
really think about this before you start pounding an overengineered solution to 
a non-issue :-)

h

On Mon, Oct 2, 2017 at 9:12 AM, Michael Segel wrote:
Yes…
You have a bunch of unit tests you can run in parallel where you only need one 
constructor and one cleanup.

I would strongly suggest that you really, really think about this long and hard 
before you start to pound code.
It’s going to be harder to back out and fix than if you take the time to think 
through the problem and not make a dumb mistake.

On Oct 2, 2017, at 11:02 AM, Herval Freire wrote:

Did anyone request such a case ("running some in parallel and some in 
sequence")? I haven't seen any requests for this in the wild (nor on this 
thread), other than theoretical "what ifs" - which is totally fine, when it 
doesn't introduce a lot of unnecessary complexity for little to no gain (which 
seems to be the case here)

h

On Mon, Oct 2, 2017 at 8:48 AM, Michael Segel wrote:
Because that simplicity doesn’t work.

You will want to run some things serial and some things in parallel.

Which is why you will need a dependency graph.

On Oct 2, 2017, at 10:40 AM, Herval Freire wrote:

Why do you need rules and graphs and any of that to support running everything 
sequentially or everything in parallel?

3) add a “run mode” to the note. If it’s “sequential”, run the paragraphs one 
at a time, in the order they’re defined. If parallel, run using current scheme 
(as many at the same time as the threadpool permits)

Simpler and covers all cases, imo
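A minimal sketch of that run-mode idea, under stated assumptions: the run_paragraph callable and the list of paragraphs are hypothetical placeholders, and this is an illustration rather than Zeppelin's actual scheduler.

# Hypothetical sketch of a per-note "run mode" (not Zeppelin code).
from concurrent.futures import ThreadPoolExecutor


def run_all(paragraphs, run_paragraph, mode="sequential", max_workers=4):
    """Run a note's paragraphs either one at a time, in order, or through a thread pool."""
    if mode == "sequential":
        for paragraph in paragraphs:          # strictly in the order they are defined
            run_paragraph(paragraph)
    elif mode == "parallel":
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            list(pool.map(run_paragraph, paragraphs))  # as many at once as the pool permits
    else:
        raise ValueError("mode must be 'sequential' or 'parallel'")


# Example with placeholder paragraphs:
run_all(["p1", "p2", "p3"], lambda p: print("running", p), mode="sequential")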


From: Polyakov Valeriy
Sent: Monday, October 2, 2017 8:24:35 AM
To: users@zeppelin.apache.org
Subject: RE: Implementing run all paragraphs sequentially

Let me try to summarize the discussion. Evidently, the current behavior of running 
notes does not meet actual requirements. The most important thing we need 
is the ability to run sequentially. However, at the same time we want to keep 
the functionality 

Re: Implementing run all paragraphs sequentially

2017-10-06 Thread Jeff Zhang
+1 for serial run by default.  Let's leave the others for the future.

Mohit Jaggi wrote on Friday, October 6, 2017 at 7:48 AM:

> +1 for serial run by default.
>
> Sent from my iPhone
>
> On Oct 5, 2017, at 3:36 PM, moon soo Lee  wrote:
>
> I'd like us to also consider simplicity of use.
>
> We can have two different modes, or two different run buttons, for Serial
> and Parallel run. This gives the flexibility of choosing between two different
> schedulers as a benefit, but to make users understand the difference between
> the two run buttons, there must be really good UI treatment.
>
> I see there is high user demand for running a notebook sequentially. And I
> think there are 3 action items in this discussion thread.
>
> 1. Change the current run-all button behavior from Parallel to Serial
> 2. Provide both Parallel and Serial run buttons with really good UI
> treatment.
> 3. Provide a DAG
>
> I think 1) does not stop 2) and 3) in the future. 2) also does not stop 3)
> in the future.
>
> So, why don't we try 1) first and keep discussing and polishing the ideas
> for 2) and 3)?
>
>
> Thanks,
> moon
>
> On Mon, Oct 2, 2017 at 10:22 AM Michael Segel 
> wrote:
>
>> Whoa!
>> Seems I walked in to something.
>>
>> Herval,
>>
>> What do you suggest?  A simple switch that runs everything in serial, or
>> everything in parallel?
>> That would be a very bad idea.
>>
>> I gave you an example of a class of solutions where you don’t want that
>> behavior.
>> E.g. unit testing, where you have one setup and then run several unit tests
>> in parallel.
>>
>> If that’s not enough for you… how about if you want to test
>> producer/consumer problems?
>>
>> Or if you want to define classes in one paragraph but then call on them
>> in later paragraphs. If everything runs in parallel from the start of time
>> 0, you can’t do this.
>>
>>
>> So, if you want to do it right the first time… you need to establish a
>> way to control the dependency of paragraphs. This isn’t rocket science.
>> And frankly not that complex.
>>
>> BTW, this is the user list not the dev list…
>>
>> Just saying…  ;-)
>>
>>
>> On Oct 2, 2017, at 11:24 AM, Herval Freire  wrote:
>>
>> "nice to have" isn't a very strong requirement. I strongly suggest you
>> really, really think about this before you start pounding out an overengineered
>> solution to a non-issue :-)
>>
>> h
>>
>> On Mon, Oct 2, 2017 at 9:12 AM, Michael Segel 
>> wrote:
>>
>>> Yes…
>>> You have a bunch of unit tests you can run in parallel where you only
>>> need one constructor and one cleanup.
>>>
>>> I would strongly suggest that you really, really think about this long
>>> and hard before you start to pound code.
>>> It's going to be harder to back out and fix than if you take the time to
>>> think through the problem and not make a dumb mistake.
>>>
>>> On Oct 2, 2017, at 11:02 AM, Herval Freire  wrote:
>>>
>>> Did anyone request such a case ("running some in parallel and some in
>>> sequence")? I haven't seen any requests for this in the wild (nor on this
>>> thread), other than theoretical "what ifs" - which is totally fine, when it
>>> doesn't introduce a lot of unnecessary complexity for little to no gain
>>> (which seems to be the case here)
>>>
>>> h
>>>
>>> On Mon, Oct 2, 2017 at 8:48 AM, Michael Segel >> > wrote:
>>>
 Because that simplicity doesn’t work.

 You will want to run some things serial and some things in parallel.

 Which is why you will need a dependency graph.

 On Oct 2, 2017, at 10:40 AM, Herval Freire  wrote:

 Why do you need rules and graphs and any of that to support running
 everything sequentially or everything in parallel?

 3) add a “run mode” to the note. If it’s “sequential”, run the
 paragraphs one at a time, in the order they’re defined. If parallel, run
 using current scheme (as many at the same time as the threadpool permits)

 Simpler and covers all cases, imo

 --
 *From:* Polyakov Valeriy 
 *Sent:* Monday, October 2, 2017 8:24:35 AM
 *To:* users@zeppelin.apache.org
 *Subject:* RE: Implementing run all paragraphs sequentially

 Let me try to summarize the discussion. Evidently, the current behavior of
 running notes does not meet actual requirements. The most important thing
 we need is the ability to run sequentially. However, at the same time we
 want to keep the functionality of parallel running. We discussed that the
 most suitable solution for building paragraphs' dependencies is a DAG
 (directed acyclic graph). Therefore, surely, this kind of dependency should
 be defined in the note, and the running order should not depend on how we
 launch it (button / scheduler / API). In this way, our objectives are to
 implement a “dependency definition engine” and use it in the “run engine”.