Hi all,

Thanks for the 0.8.0 release!
We’re keen to take advantage of the yarn-cluster support to take some pressure off our Zeppelin host, but I’m having some trouble with it.

The first problem was in following the documentation here:

    https://zeppelin.apache.org/docs/0.8.0/interpreter/spark.html

It suggests that we need to change the master configuration from “yarn-client” to “yarn-cluster”. However, doing so results in the following error:

    Warning: Master yarn-cluster is deprecated since 2.0.
     Please use master “yarn” with specified deploy mode instead.
    Error: Client deploy mode is not compatible with master “yarn-cluster”
    Run with --help for usage help or --verbose for debug output
    <stacktrace>

I got past this error with the following settings:

    master = yarn
    spark.submit.deployMode = cluster

I’m not sure whether I’m straying from the correct (documented) configuration or whether the documentation needs an update.

In any case, these settings appear to work for everything except the ZeppelinContext, which is missing:

    %spark
    z

    <console>:24: error: not found: value z

Using yarn-client mode I can see that z is meant to be an instance of org.apache.zeppelin.spark.SparkZeppelinContext:

    %spark
    z

    res4: org.apache.zeppelin.spark.SparkZeppelinContext = org.apache.zeppelin.spark.SparkZeppelinContext@5b9282e1

However, this class is absent in cluster mode:

    %spark
    org.apache.zeppelin.spark.SparkZeppelinContext

    <console>:24: error: object zeppelin is not a member of package org.apache
           org.apache.zeppelin.spark.SparkZeppelinContext
           ^

Snooping around the Zeppelin installation, I was able to locate this class in ${ZEPPELIN_INSTALL_DIR}/interpreter/spark/spark-interpreter-0.8.0.jar. I then uploaded this jar to HDFS and added it to spark.jars & spark.driver.extraClassPath.
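As a quick sanity check, this is the kind of paragraph I’ve been running to probe whether the jar’s classes are reachable at all (a diagnostic sketch only, not a fix; it assumes Class.forName reflects the classloader the %spark paragraph runs under):

```scala
// Diagnostic sketch: check whether the interpreter jar's classes are
// visible to the current classloader. The class name is the one found
// inside spark-interpreter-0.8.0.jar.
val zeppelinContextVisible: Boolean =
  try {
    Class.forName("org.apache.zeppelin.spark.SparkZeppelinContext")
    true
  } catch {
    case _: ClassNotFoundException => false
  }

println(s"SparkZeppelinContext visible: $zeppelinContextVisible")
```

In yarn-client mode this prints true for me; in yarn-cluster mode it prints false, even with the jar on spark.jars.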
The relevant entries in the driver log suggest the jar is being picked up:

    …
    Added JAR hdfs:/spark-interpreter-0.8.0.jar at hdfs:/tmp/zeppelin/spark-interpreter-0.8.0.jar with timestamp 1531732774379
    …
    CLASSPATH -> …:hdfs:/tmp/zeppelin/spark-interpreter-0.8.0.jar
    …
    command: … file:$PWD/spark-interpreter-0.8.0.jar \
    …

However, I still can’t use the ZeppelinContext or the org.apache.zeppelin.spark.SparkZeppelinContext class.

At this point I’ve run out of ideas and would appreciate any help. Does anyone have thoughts on how I could use the ZeppelinContext in yarn-cluster mode?

Regards,
Chris.