[
https://issues.apache.org/jira/browse/SPARK-13202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134332#comment-15134332
]
Michael Schmitz edited comment on SPARK-13202 at 2/5/16 3:40 PM:
-----------------------------------------------------------------
[~srowen] in that case the documentation is confusing, as it claims
spark.executor.extraClassPath is deprecated (specifically "This exists
primarily for backwards-compatibility with older versions of Spark. Users
typically should not need to set this option."). It also means that the
`--jars` option isn't as useful as it could be: it copies the jars to the
worker nodes, but I cannot use those jars for Kryo serialization because I
don't know where they will end up.
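To make the failure mode concrete, here is a minimal sketch of the kind of registrator involved (the class name is illustrative and the config keys are the standard Kryo settings; only org.allenai.common.Enum is taken from the stack trace in the issue below):

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Illustrative registrator. Spark instantiates the class named by
// spark.kryo.registrator on each executor, so the jar containing it (and the
// classes it registers) must be visible on the executor classpath, not merely
// copied into the worker's per-application directory.
class MyKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // Resolving this class is where the ClassNotFoundException surfaces when
    // the jar shipped with --jars is not on the executor classpath.
    kryo.register(Class.forName("org.allenai.common.Enum"))
  }
}

// Corresponding settings, e.g. on the spark-shell command line:
//   --conf spark.serializer=org.apache.spark.serializer.KryoSerializer
//   --conf spark.kryo.registrator=MyKryoRegistrator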
> Jars specified with --jars do not exist on the worker classpath.
> ----------------------------------------------------------------
>
> Key: SPARK-13202
> URL: https://issues.apache.org/jira/browse/SPARK-13202
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell
> Affects Versions: 1.5.2, 1.6.0
> Reporter: Michael Schmitz
>
> I have a Spark Scala 2.11 application. To deploy it to the cluster, I create
> a jar of the dependencies and a jar of the project (although this problem
> still manifests if I create a single jar with everything). I will focus on
> problems specific to Spark Shell, but I'm pretty sure they also apply to
> Spark Submit.
> I can get Spark Shell to work with my application; however, I need to set
> spark.executor.extraClassPath. From reading the documentation
> (http://spark.apache.org/docs/latest/configuration.html#runtime-environment),
> it sounds like I shouldn't need to set this option ("Users typically should
> not need to set this option."). After reading about --jars, I understand that
> it should put the jars it syncs to the worker machines onto the executor
> classpath.
> When I don't set spark.executor.extraClassPath, I get a Kryo registrator
> exception whose root cause is a ClassNotFoundException:
> java.io.IOException: org.apache.spark.SparkException: Failed to register
> classes with Kryo
> java.lang.ClassNotFoundException: org.allenai.common.Enum
> If I SSH into the workers, I can see that directories containing the jars
> specified by --jars were indeed created:
> /opt/data/spark/worker/app-20160204212742-0002/0
> /opt/data/spark/worker/app-20160204212742-0002/1
> Now, if I re-run spark-shell but with `--conf
> spark.executor.extraClassPath=/opt/data/spark/worker/app-20160204212742-0002/0/myjar.jar`,
> my job will succeed. In other words, if I put my jars at a location that is
> available to all the workers and specify that as an extra executor class
> path, the job succeeds.
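> For reference, the same workaround expressed in code (a sketch only: this
> form applies to a standalone application where the conf is set before the
> SparkContext is created; with spark-shell it has to go on the command line as
> above, and the worker app directory name is only known after the application
> has started):
>
> import org.apache.spark.{SparkConf, SparkContext}
>
> // Sketch: point the executors at the jar that --jars already copied into
> // this application's worker directory. The directory name embeds the
> // application id (app-20160204212742-0002), so it cannot be predicted in
> // advance.
> val conf = new SparkConf()
>   .set("spark.executor.extraClassPath",
>     "/opt/data/spark/worker/app-20160204212742-0002/0/myjar.jar")
> val sc = new SparkContext(conf)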
> Unfortunately, this means the jars are being copied to the workers for no
> reason. How can I get --jars to add the jars it copies to the workers to the
> executor classpath?