[
https://issues.apache.org/jira/browse/SPARK-10713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271613#comment-15271613
]
Dara Adib edited comment on SPARK-10713 at 5/4/16 11:18 PM:
------------------------------------------------------------
[~devaraj.k] Thanks for trying to reproduce. I'm not using the Hadoop-free
builds anymore, so I tried testing with a random jar (in this case
spark-streaming-kafka-assembly) on Spark 1.6.1.
I'm using PySpark, but here is a Scala example that seems to show the same behavior:
{code}
// Get classpath, taken from https://gist.github.com/jessitron/8376139.
def urlses(cl: ClassLoader): Array[java.net.URL] = cl match {
  case null => Array()
  case u: java.net.URLClassLoader => u.getURLs() ++ urlses(cl.getParent)
  case _ => urlses(cl.getParent)
}

// driver
println(sys.env.get("SPARK_DIST_CLASSPATH"))
println(urlses(getClass.getClassLoader).mkString(":"))

// executor
println(sc.parallelize(Vector(0)).map(_ =>
  sys.env.get("SPARK_DIST_CLASSPATH")).collect()(0))
println(sc.parallelize(Vector(0)).map(_ =>
  urlses(getClass.getClassLoader).mkString(":")).collect()(0))
{code}
On both Mesos and YARN, SPARK_DIST_CLASSPATH is defined on the driver and the jar is included in the classpath. On Mesos, however, SPARK_DIST_CLASSPATH is missing from the executors and the jar is not in their classpath, while on YARN it is present. Am I missing something? Do you see different behavior?
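For reference, the PySpark side of the same check is just a sketch along these lines (the executor-side part assumes an active SparkContext {{sc}} on a running cluster, so it is commented out here):
{code}
import os

# driver-side check: print the env var as seen by the driver process
print(os.environ.get("SPARK_DIST_CLASSPATH"))

# executor-side check needs a live SparkContext, e.g.:
#   print(sc.parallelize([0]).map(
#       lambda _: os.environ.get("SPARK_DIST_CLASSPATH")).collect()[0])
{code}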
> SPARK_DIST_CLASSPATH ignored on Mesos executors
> -----------------------------------------------
>
> Key: SPARK-10713
> URL: https://issues.apache.org/jira/browse/SPARK-10713
> Project: Spark
> Issue Type: Bug
> Components: Deploy, Mesos
> Affects Versions: 1.5.0
> Reporter: Dara Adib
> Priority: Minor
>
> If I set the environment variable SPARK_DIST_CLASSPATH, the jars are included
> on the driver, but not on Mesos executors. Docs:
> https://spark.apache.org/docs/latest/hadoop-provided.html
> I see SPARK_DIST_CLASSPATH mentioned in these files:
> launcher/src/main/java/org/apache/spark/launcher/AbstractCommandBuilder.java
> project/SparkBuild.scala
> yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
> But not the Mesos executor (or should it be included by the launcher
> library?):
> spark/core/src/main/scala/org/apache/spark/executor/Executor.scala