Re: NewSparkInterpreter fails on yarn-cluster

2018-06-07 Thread Jeff Zhang
Hi Thomas,

I tried the latest branch-0.8 and it works for me. Could you try again to
verify?


Re: NewSparkInterpreter fails on yarn-cluster

2018-06-07 Thread Thomas Bünger
I specifically mean visualisation via ZeppelinContext inside a Spark
interpreter (e.g. "z.show(...)").
The visualisation of SparkSQL results inside a SparkSQLInterpreter works
fine, also in yarn-cluster mode.
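
For reference, a minimal sketch of the kind of %spark paragraph at issue
(the DataFrame is just an illustrative placeholder):

%spark
val df = spark.range(10).toDF("n")  // any small DataFrame will do
z.show(df)                          // "z" is the ZeppelinContext; this is the call that fails in yarn-cluster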


Re: NewSparkInterpreter fails on yarn-cluster

2018-06-07 Thread Thomas Bünger
Hey Jeff,

I tried your changes and now it works nicely. Thank you very much!

> But I still can't use any of the forms and visualizations in yarn-cluster
> mode. I was hoping this had been resolved with the new SparkInterpreter so
> that I could switch from yarn-client to yarn-cluster mode in 0.8, but I'm
> still getting errors like
> "error: not found: value z"

> Was this not in scope of that change? Is this a bug? Or is it a known
> limitation that is also not supported in 0.8?

Best regards,
 Thomas


Re: NewSparkInterpreter fails on yarn-cluster

2018-06-05 Thread Jeff Zhang
I can confirm that this is a bug, and have created
https://issues.apache.org/jira/browse/ZEPPELIN-3531

Will fix it soon.


Re: NewSparkInterpreter fails on yarn-cluster

2018-06-05 Thread Jeff Zhang
Hmm, it looks like a bug. I will check it tomorrow.


Re: NewSparkInterpreter fails on yarn-cluster

2018-06-05 Thread Thomas Bünger
$ ls /usr/lib/spark/python/lib
py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip

So the folder exists and contains both necessary zips. Please note that in
local or yarn-client mode the files are properly picked up from that very
same location.

How does yarn-cluster work under the hood? Could it be that environment
variables (like SPARK_HOME) are lost because they are only available in my
local shell and the Zeppelin daemon process? Do I need to tell YARN about
SPARK_HOME somehow?
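
One way to check would be to print the environment from inside the
interpreter process itself; a diagnostic sketch:

%spark
// In yarn-cluster mode this runs inside a YARN container, not in the shell
// that launched the Zeppelin daemon, so the environment can differ.
println(sys.env.get("SPARK_HOME"))
println(sys.env.get("PYTHONPATH"))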


Re: NewSparkInterpreter fails on yarn-cluster

2018-06-05 Thread Jeff Zhang
Could you check whether there's a folder /usr/lib/spark/python/lib?


Re: NewSparkInterpreter fails on yarn-cluster

2018-06-05 Thread Thomas Bünger
sys.env

java.lang.NullPointerException
    at org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
    at org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
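
The NPE at setupConfForPySpark would be consistent with a lookup under
SPARK_HOME coming back null; a hypothetical reconstruction of the failure
mode (not the actual Zeppelin source):

// Sketch only: if SPARK_HOME is unset in the container, the lib dir lookup fails.
val sparkHome = sys.env.getOrElse("SPARK_HOME", null)     // null inside the YARN container
val libDir = new java.io.File(sparkHome + "/python/lib")  // resolves to the bogus path "null/python/lib"
val zips = libDir.listFiles()                             // listFiles() returns null for a missing directory
zips.find(_.getName.startsWith("py4j"))                   // NullPointerException when iterating null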


Re: NewSparkInterpreter fails on yarn-cluster

2018-06-05 Thread Jeff Zhang
Could you paste the full stacktrace?


Thomas Bünger wrote on Tue, Jun 5, 2018 at 8:21 PM:

> I've tried the 0.8.0-rc4 on my EMR cluster using the preinstalled version
> of Spark under /usr/lib/spark.
>
> This works fine in local or yarn-client mode, but in yarn-cluster mode I
> just get a
>
> java.lang.NullPointerException at
> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>
> Seems to be caused by an unsuccessful search for the py4j libraries.
> I've made sure that SPARK_HOME is actually set in .bashrc, in
> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
> interpreter, something odd is going on.
>
> Best regards,
>  Thomas
>
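
A %spark.conf paragraph of the kind mentioned above would look roughly like
this (one property per line; the path follows the EMR layout described
earlier and is illustrative):

%spark.conf
SPARK_HOME /usr/lib/spark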