Chris Penny created ZEPPELIN-3633:
-------------------------------------

             Summary: ZeppelinContext Not Found in yarn-cluster Mode
                 Key: ZEPPELIN-3633
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3633
             Project: Zeppelin
          Issue Type: Bug
          Components: zeppelin-interpreter
    Affects Versions: 0.8.0
            Reporter: Chris Penny


Hi all, thanks for the 0.8.0 release!

We’re keen to take advantage of the yarn-cluster support to take the pressure 
off our Zeppelin host. However, I am having some trouble with it. The first 
problem was in following the documentation here:

[https://zeppelin.apache.org/docs/0.8.0/interpreter/spark.html]

This suggests that we need to modify the master configuration from 
“yarn-client” to “yarn-cluster”.

However, doing so results in the following error:
Warning: Master yarn-cluster is deprecated since 2.0. Please use master “yarn” 
with specified deploy mode instead.
Error: Client deploy mode is not compatible with master “yarn-cluster”
Run with --help for usage help or --verbose for debug output
<stacktrace>


I got past this error with the following settings: 
master = yarn 
spark.submit.deployMode = cluster 

I’m unsure whether I’m straying from the correct (documented) configuration 
or whether the documentation needs an update. Either way, these settings 
appear to work for everything except the ZeppelinContext, which is missing.

Code: 
%spark 
z

Output: 
<console>:24: error: not found: value z

In yarn-client mode, I can see that z is meant to be an instance of 
org.apache.zeppelin.spark.SparkZeppelinContext:

Code: 
%spark
z

Output: 
res4: org.apache.zeppelin.spark.SparkZeppelinContext = 
org.apache.zeppelin.spark.SparkZeppelinContext@5b9282e1

However, this class is absent in cluster mode:

Code: 
%spark 
org.apache.zeppelin.spark.SparkZeppelinContext

Output:
<console>:24: error: object zeppelin is not a member of package org.apache 
org.apache.zeppelin.spark.SparkZeppelinContext 
                   ^
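
A possible follow-up check, not run as part of this report: the "not a member 
of package" message comes from the REPL compiler, so it may be worth testing 
separately whether the class can be loaded at runtime. The sketch below assumes 
it is pasted into a %spark paragraph (without the %spark line it is plain 
Scala). isLoadable is a hypothetical helper name, and using the thread context 
class loader is an assumption about where the interpreter jars would be visible.

```scala
// Returns true if the named class can be found by the given class loader.
// `initialize = false` avoids running static initializers during the probe.
def isLoadable(className: String, loader: ClassLoader): Boolean =
  try {
    Class.forName(className, false, loader)
    true
  } catch {
    case _: ClassNotFoundException | _: LinkageError => false
  }

val target = "org.apache.zeppelin.spark.SparkZeppelinContext"
// Assumption: the context class loader is the relevant one inside the interpreter.
val contextLoader = Thread.currentThread().getContextClassLoader

println(s"$target loadable at runtime: ${isLoadable(target, contextLoader)}")
```

If this prints false, the jar is probably not reaching the driver JVM at all. 
If it prints true, the class is loadable but the REPL compiler does not see it.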

Looking through the Zeppelin installation, I located this class in 
${ZEPPELIN_INSTALL_DIR}/interpreter/spark/spark-interpreter-0.8.0.jar. I then 
uploaded this jar to HDFS and added it to both spark.jars and 
spark.driver.extraClassPath. Relevant entries in the driver log:
… 
Added JAR hdfs:/spark-interpreter-0.8.0.jar at 
hdfs:/tmp/zeppelin/spark-interpreter-0.8.0.jar with timestamp 1531732774379 
… 
CLASSPATH -> …:hdfs:/tmp/zeppelin/spark-interpreter-0.8.0.jar …
 … 
command: … file:$PWD/spark-interpreter-0.8.0.jar \ 
etc.
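
One way to inspect what the driver JVM actually received, offered as a sketch 
rather than something run for this report: list the classpath entries that 
mention the interpreter jar. entriesMentioning is a hypothetical helper name. 
The separator defaults to the platform value, and reading java.class.path 
assumes the snippet runs in the driver (for example in a %spark paragraph).

```scala
import java.io.File

// Split a classpath string on the separator and keep the entries
// whose text contains the given jar name fragment.
def entriesMentioning(
    classpath: String,
    jarName: String,
    separator: String = File.pathSeparator): Seq[String] =
  classpath.split(separator).toSeq.filter(_.contains(jarName))

// Classpath of the JVM this snippet runs in.
val jvmClasspath = System.getProperty("java.class.path", "")

entriesMentioning(jvmClasspath, "spark-interpreter").foreach(println)
```

Because ":" is the classpath separator on Unix-like systems, an entry written 
as hdfs:/tmp/zeppelin/spark-interpreter-0.8.0.jar is split at the colon by this 
helper, which may hint at how a plain JVM classpath would treat the hdfs: form.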

However, I still can’t use either the ZeppelinContext or the 
org.apache.zeppelin.spark.SparkZeppelinContext class. At this point I’ve run 
out of ideas and would appreciate some help.

Does anyone have thoughts on how I could use the ZeppelinContext in yarn 
cluster mode?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
