Rahul Jain created ZEPPELIN-3830:
------------------------------------
Summary: Zeppelin loading of jars for yarn execution using
zeppelin-env.sh does not work as documented
Key: ZEPPELIN-3830
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3830
Project: Zeppelin
Issue Type: Bug
Reporter: Rahul Jain
The documentation under
[https://zeppelin.apache.org/docs/0.8.0/interpreter/spark.html#2-loading-spark-properties]
states that the "--jars" option should be sufficient to load the jars for both
the Spark driver and the executors.
However, when executing under yarn mode, we see the following:
We added a very basic (one-class) jar via the "--jars" option in
SPARK_SUBMIT_OPTIONS under conf/zeppelin-env.sh to preload the required class
"com.mycompany.app.App".
The jar seems to be distributed correctly to the Spark/YARN cluster and is
available on the classpath.
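For reference, the zeppelin-env.sh setting we are describing looks roughly like this; the jar path below is illustrative only, not the exact path from our cluster:

```shell
# conf/zeppelin-env.sh -- sketch; jar path is illustrative
export SPARK_SUBMIT_OPTIONS="--jars /home/hadoop/my-app-1.0-SNAPSHOT.jar"
```

spark-submit does pass such a jar to the YARN containers, which matches what we observed: the jar is distributed, yet the Scala import in the notebook still fails.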
However, the following statement in a Zeppelin Spark notebook still fails:
import com.mycompany.app.App
<console>:23: error: object mycompany is not a member of package com
       import com.mycompany.app.App
To make the above work, we had to open the Zeppelin interpreter configuration
in the UI and add key=spark.jars, value=/home/hadoop/my-app-1.0-SNAPSHOT.jar.
The import worked fine after that.
We also compared using an explicit
z.load("/home/hadoop/my-app-1.0-SNAPSHOT.jar") statement against the
pre-loading method above. Explicit loading of additional jars through z.load()
does work properly under YARN mode. Again, the difference between the pre-load
method through the "--jars" option and z.load() appears to be the spark.jars
setting.
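For comparison, the z.load() approach that did work is run in a dependency-loader paragraph before the Spark interpreter starts; a sketch, assuming the default %spark.dep interpreter binding:

```scala
%spark.dep
// Must run before the Spark interpreter has started in this note.
z.reset()                                        // optional: clear previously loaded artifacts
z.load("/home/hadoop/my-app-1.0-SNAPSHOT.jar")   // local path to the jar
```

After this paragraph runs, `import com.mycompany.app.App` succeeds in a subsequent %spark paragraph.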
I'd assume the fix for jar pre-loading to work properly would be to provide a
different mechanism in Zeppelin, as the SPARK_SUBMIT_OPTIONS method does not
seem to be sufficient.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)