Posted on JIRA: https://spark-project.atlassian.net/browse/SPARK-896

I slightly messed up the markup, but I do not have edit permissions on issues I report. :(
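For anyone who finds this thread later: the combination below is what
resolved it for us, following Matei's suggestion further down. The jar
path is the one from our logs; substitute your own assembly jar.

    # Ships the jar to the executors (this alone was not enough for us).
    export ADD_JARS=/opt/spark/mx-lib/verrazano_2.9.3-0.1-SNAPSHOT-assembly.jar
    # Also puts the jar on the driver's own classpath.
    export SPARK_CLASSPATH=/opt/spark/mx-lib/verrazano_2.9.3-0.1-SNAPSHOT-assembly.jar
    ./spark-shell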
On Mon, Sep 9, 2013 at 1:27 PM, Gary Malouf <[email protected]> wrote:

> That resolves the issue. What is strange to me is that with just
> ADD_JARS set, before the map is called I am able to import my main
> package object and use functions defined in that jar. The failure comes
> when the mapper function starts to run.
>
> On Mon, Sep 9, 2013 at 1:13 PM, Gary Malouf <[email protected]> wrote:
>
>> Will test and report back. Is this the same issue as
>> https://groups.google.com/forum/#!topic/spark-users/fEcgIrL-gII ?
>>
>> On Mon, Sep 9, 2013 at 1:03 PM, Matei Zaharia <[email protected]> wrote:
>>
>>> No, I think this might be an actual bug. The problem seems to be with
>>> the classpath on the driver program, actually, not on the executors.
>>> You might be able to fix it as follows: export SPARK_CLASSPATH=<your JAR>
>>> before running spark-shell, in addition to setting ADD_JARS. But even
>>> if that fixes it, you should report the issue.
>>>
>>> Matei
>>>
>>> On Sep 9, 2013, at 5:18 AM, Gary Malouf <[email protected]> wrote:
>>>
>>> Any other checks I should do before filing this as an issue? I know
>>> for my team it's a significant blocker right now.
>>>
>>> On Sun, Sep 8, 2013 at 7:59 PM, Gary Malouf <[email protected]> wrote:
>>>
>>>> Hi Matei,
>>>>
>>>> We are using Spark 0.7.3 on a Mesos cluster.
>>>>
>>>> The logs when I start spark-shell include:
>>>>
>>>> 13/09/08 23:44:17 INFO spark.SparkContext: Added JAR
>>>> /opt/spark/mx-lib/verrazano_2.9.3-0.1-SNAPSHOT-assembly.jar at
>>>> http://10.236.136.202:31658/jars/verrazano_2.9.3-0.1-SNAPSHOT-assembly.jar
>>>> with timestamp 1378683857701
>>>>
>>>> I can also confirm that the 'verrazano' jar (my custom one) is in a
>>>> Mesos slave temp directory on all of the slave nodes.
>>>>
>>>> On Sun, Sep 8, 2013 at 7:01 PM, Matei Zaharia <[email protected]> wrote:
>>>>
>>>>> Which version of Spark is this with? Did the logs print something
>>>>> about sending the JAR you added with ADD_JARS to the cluster?
>>>>>
>>>>> Matei
>>>>>
>>>>> On Sep 8, 2013, at 8:56 AM, Gary Malouf <[email protected]> wrote:
>>>>>
>>>>> > I built a custom jar with, among other things, nscala-time and
>>>>> > Joda-Time packed inside of it. Using the ADD_JARS variable, I have
>>>>> > added this super jar to my classpath on the scheduler when running
>>>>> > spark-shell. I wrote a function that grabs protobuf data, filters
>>>>> > it, and then maps each message to a (LocalDate, Option[String])
>>>>> > pair.
>>>>> > Unfortunately, this does not run and I get the following:
>>>>> >
>>>>> > 13/09/08 15:50:43 INFO cluster.TaskSetManager: Finished TID 6 in 348 ms (progress: 7/576)
>>>>> > Exception in thread "Thread-159" java.lang.ClassNotFoundException: org.joda.time.LocalDate
>>>>> >   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>> >   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>> >   at java.security.AccessController.doPrivileged(Native Method)
>>>>> >   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.scala$tools$nsc$util$ScalaClassLoader$$super$findClass(ScalaClassLoader.scala:88)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$class.findClass(ScalaClassLoader.scala:44)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.findClass(ScalaClassLoader.scala:88)
>>>>> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.scala$tools$nsc$util$ScalaClassLoader$$super$loadClass(ScalaClassLoader.scala:88)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$class.loadClass(ScalaClassLoader.scala:50)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.loadClass(ScalaClassLoader.scala:88)
>>>>> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>> >   at java.lang.Class.forName0(Native Method)
>>>>> >   at java.lang.Class.forName(Class.java:266)
>>>>> >   at spark.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:20)
>>>>> >   at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1610)
>>>>> >   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515)
>>>>> >   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1769)
>>>>> >   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>> >   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>> >   at it.unimi.dsi.fastutil.objects.Object2LongOpenHashMap.readObject(Object2LongOpenHashMap.java:757)
>>>>> >   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> >   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>> >   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>> >   at java.lang.reflect.Method.invoke(Method.java:601)
>>>>> >   at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1004)
>>>>> >   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1891)
>>>>> >   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>> >   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>> >   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>> >   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>> >   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>> >   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>> >   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>> >   at spark.scheduler.TaskResult.readExternal(TaskResult.scala:26)
>>>>> >   at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1835)
>>>>> >   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1794)
>>>>> >   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>> >   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>> >   at spark.JavaDeserializationStream.readObject(JavaSerializer.scala:23)
>>>>> >   at spark.JavaSerializerInstance.deserialize(JavaSerializer.scala:45)
>>>>> >   at spark.scheduler.cluster.TaskSetManager.taskFinished(TaskSetManager.scala:261)
>>>>> >   at spark.scheduler.cluster.TaskSetManager.statusUpdate(TaskSetManager.scala:236)
>>>>> >   at spark.scheduler.cluster.ClusterScheduler.statusUpdate(ClusterScheduler.scala:219)
>>>>> >   at spark.scheduler.mesos.MesosSchedulerBackend.statusUpdate(MesosSchedulerBackend.scala:264)
>>>>> > 13/09/08 15:50:43 INFO mesos.MesosSchedulerBackend: driver.run() returned with code DRIVER_ABORTED
>>>>> >
>>>>> > The code definitely compiles in the interpreter and the executors
>>>>> > seem to find the protobuf messages which are in the same jar - any
>>>>> > idea what could be causing the problem?
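The stack trace above is also consistent with Matei's diagnosis: the
ClassNotFoundException is raised on the driver, while it deserializes a
task result (spark.scheduler.TaskResult.readExternal via
spark.JavaSerializerInstance.deserialize). That is why imports from the
jar work fine in the shell and the job only dies once tasks start
returning values that reference org.joda.time.LocalDate. A minimal
sketch that should reproduce the same failure in spark-shell; the Event
type and field names are made up for illustration, not our actual code:

    import org.joda.time.LocalDate

    // Stand-in for our parsed protobuf messages.
    case class Event(timestamp: Long, userId: String)

    val events = sc.parallelize(Seq(Event(1378683857701L, "u1"),
                                    Event(1378683857702L, "u2")))

    // The task results contain LocalDate instances, so the driver must be
    // able to load org.joda.time.LocalDate when it deserializes them.
    val byDay = events.map(e => (new LocalDate(e.timestamp), Option(e.userId)))
    byDay.collect()   // fails with ClassNotFoundException unless the jar
                      // is also on the driver's classpath (SPARK_CLASSPATH)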
