Hi,
I am trying to get Zeppelin to work in yarn-client and yarn-submit mode. There
are some conflicting notes on the mailing list about how to get Zeppelin
to work in yarn-client mode. I have tried a few different things, but none
have worked for me so far.
I am running CDH 5.3. Here is what I have done so far:
1. Built Zeppelin locally (OS X) using:
   mvn clean package -Pspark-1.2 -Dhadoop.version=2.5.0-cdh5.3.0 -Phadoop-2.4 -DskipTests -Pyarn
2. Generated a distribution package and deployed it to an edge node of my
   cluster using:
   mvn clean package -P build-distr -DskipTests
3. At this point, local mode works fine.
4. To get YARN mode working:
   1. I set the Hadoop conf dir in zeppelin-env.sh and set
      Master=yarn-client. I get this exception in my logs:
      ERROR ({pool-2-thread-5} Job.java[run]:165) - Job failed
      com.nflabs.zeppelin.interpreter.InterpreterException:
      org.apache.thrift.TApplicationException: Internal error processing open
   2. Set Spark_Yarn_Jar, but got the same exception as above.
   3. Copied my Spark assembly jar into the interpreter/spark directory, but
      that did not work either.
   4. Also set Spark Yarn Jar and Spark Home in the interpreter UI, but that
      did not work.
   5. I took Spark Yarn Jar out and let Zeppelin use the Spark that comes
      bundled with it.
      1. This seemed to work initially, but as soon as I call a Spark
         action (collect, count, etc.) I get this exception:
         java.lang.RuntimeException: Error in configuring object
           at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
           at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
           at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
           at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:182)
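
The settings described in step 4.1 amount to roughly the following in conf/zeppelin-env.sh. This is only a minimal sketch: it assumes zeppelin-env.sh reads the MASTER and HADOOP_CONF_DIR variables, and /etc/hadoop/conf is an assumed path (the usual CDH client config location), not necessarily the value used on this cluster.

```shell
# Sketch of conf/zeppelin-env.sh for yarn-client mode.
# Assumption: /etc/hadoop/conf is where the cluster's client configs
# (core-site.xml, yarn-site.xml, ...) live; adjust to the real path.
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Run the Spark interpreter against YARN in client mode.
export MASTER=yarn-client
```

Zeppelin has to be restarted after editing this file for the values to be picked up.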
Any pointers on what I might have missed in configuring Zeppelin to work in
yarn-client mode?
Thanks,
Charmee