Posted on JIRA: https://spark-project.atlassian.net/browse/SPARK-896

I slightly messed up the markup, but I do not have edit permissions on issues I report. :(
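For anyone who finds this thread later: the combination below is what
resolved it for us, following Matei's suggestion further down. The jar
path is the one from our logs; substitute your own assembly jar.

    # Ships the jar to the executors (this alone was not enough for us).
    export ADD_JARS=/opt/spark/mx-lib/verrazano_2.9.3-0.1-SNAPSHOT-assembly.jar
    # Also puts the jar on the driver's own classpath.
    export SPARK_CLASSPATH=/opt/spark/mx-lib/verrazano_2.9.3-0.1-SNAPSHOT-assembly.jar
    ./spark-shell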
On Mon, Sep 9, 2013 at 1:27 PM, Gary Malouf <[email protected]> wrote:

> That resolves the issue. What is strange to me is that with just
> ADD_JARS set, before the map is called I am able to import my main
> package object and use functions defined in that jar. The failure comes
> when the mapper function starts to run.
>
> On Mon, Sep 9, 2013 at 1:13 PM, Gary Malouf <[email protected]> wrote:
>
>> Will test and report back. Is this the same issue as
>> https://groups.google.com/forum/#!topic/spark-users/fEcgIrL-gII ?
>>
>> On Mon, Sep 9, 2013 at 1:03 PM, Matei Zaharia <[email protected]> wrote:
>>
>>> No, I think this might be an actual bug. The problem seems to be with
>>> the classpath on the driver program, actually, not on the executors.
>>> You might be able to fix it as follows: export SPARK_CLASSPATH=<your JAR>
>>> before running spark-shell, in addition to setting ADD_JARS. But even
>>> if that fixes it, you should report the issue.
>>>
>>> Matei
>>>
>>> On Sep 9, 2013, at 5:18 AM, Gary Malouf <[email protected]> wrote:
>>>
>>> Any other checks I should do before filing this as an issue? I know
>>> for my team it's a significant blocker right now.
>>>
>>> On Sun, Sep 8, 2013 at 7:59 PM, Gary Malouf <[email protected]> wrote:
>>>
>>>> Hi Matei,
>>>>
>>>> We are using Spark 0.7.3 on a Mesos cluster.
>>>>
>>>> The logs when I start spark-shell include:
>>>>
>>>> 13/09/08 23:44:17 INFO spark.SparkContext: Added JAR
>>>> /opt/spark/mx-lib/verrazano_2.9.3-0.1-SNAPSHOT-assembly.jar at
>>>> http://10.236.136.202:31658/jars/verrazano_2.9.3-0.1-SNAPSHOT-assembly.jar
>>>> with timestamp 1378683857701
>>>>
>>>> I can also confirm that the 'verrazano' jar (my custom one) is in a
>>>> Mesos slave temp directory on all of the slave nodes.
>>>>
>>>> On Sun, Sep 8, 2013 at 7:01 PM, Matei Zaharia <[email protected]> wrote:
>>>>
>>>>> Which version of Spark is this with? Did the logs print something
>>>>> about sending the JAR you added with ADD_JARS to the cluster?
>>>>>
>>>>> Matei
>>>>>
>>>>> On Sep 8, 2013, at 8:56 AM, Gary Malouf <[email protected]> wrote:
>>>>>
>>>>> > I built a custom jar with, among other things, nscala-time and
>>>>> > Joda-Time packed inside of it. Using the ADD_JARS variable, I have
>>>>> > added this super jar to my classpath on the scheduler when running
>>>>> > spark-shell. I wrote a function that grabs protobuf data, filters
>>>>> > it, and then maps each message to a (LocalDate, Option[String])
>>>>> > pair.
>>>>> > Unfortunately, this does not run and I get the following:
>>>>> >
>>>>> > 13/09/08 15:50:43 INFO cluster.TaskSetManager: Finished TID 6 in 348 ms (progress: 7/576)
>>>>> > Exception in thread "Thread-159" java.lang.ClassNotFoundException: org.joda.time.LocalDate
>>>>> >   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>> >   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>> >   at java.security.AccessController.doPrivileged(Native Method)
>>>>> >   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.scala$tools$nsc$util$ScalaClassLoader$$super$findClass(ScalaClassLoader.scala:88)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$class.findClass(ScalaClassLoader.scala:44)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.findClass(ScalaClassLoader.scala:88)
>>>>> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.scala$tools$nsc$util$ScalaClassLoader$$super$loadClass(ScalaClassLoader.scala:88)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$class.loadClass(ScalaClassLoader.scala:50)
>>>>> >   at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.loadClass(ScalaClassLoader.scala:88)
>>>>> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>>>>> >   at java.lang.Class.forName0(Native Method)
>>>>> >   at java.lang.Class.forName(Class.java:266)
>>>>> >   at spark.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:20)
>>>>> >   at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1610)
>>>>> >   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1515)
>>>>> >   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1769)
>>>>> >   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>> >   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>> >   at it.unimi.dsi.fastutil.objects.Object2LongOpenHashMap.readObject(Object2LongOpenHashMap.java:757)
>>>>> >   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> >   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>> >   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>> >   at java.lang.reflect.Method.invoke(Method.java:601)
>>>>> >   at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1004)
>>>>> >   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1891)
>>>>> >   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>> >   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>> >   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1989)
>>>>> >   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1913)
>>>>> >   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>>>> >   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>> >   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>> >   at spark.scheduler.TaskResult.readExternal(TaskResult.scala:26)
>>>>> >   at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1835)
>>>>> >   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1794)
>>>>> >   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
>>>>> >   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>>> >   at spark.JavaDeserializationStream.readObject(JavaSerializer.scala:23)
>>>>> >   at spark.JavaSerializerInstance.deserialize(JavaSerializer.scala:45)
>>>>> >   at spark.scheduler.cluster.TaskSetManager.taskFinished(TaskSetManager.scala:261)
>>>>> >   at spark.scheduler.cluster.TaskSetManager.statusUpdate(TaskSetManager.scala:236)
>>>>> >   at spark.scheduler.cluster.ClusterScheduler.statusUpdate(ClusterScheduler.scala:219)
>>>>> >   at spark.scheduler.mesos.MesosSchedulerBackend.statusUpdate(MesosSchedulerBackend.scala:264)
>>>>> > 13/09/08 15:50:43 INFO mesos.MesosSchedulerBackend: driver.run() returned with code DRIVER_ABORTED
>>>>> >
>>>>> > The code definitely compiles in the interpreter and the executors
>>>>> > seem to find the protobuf messages which are in the same jar - any
>>>>> > idea what could be causing the problem?
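The stack trace above is also consistent with Matei's diagnosis: the
ClassNotFoundException is raised on the driver, while it deserializes a
task result (spark.scheduler.TaskResult.readExternal via
spark.JavaSerializerInstance.deserialize). That is why imports from the
jar work fine in the shell and the job only dies once tasks start
returning values that reference org.joda.time.LocalDate. A minimal
sketch that should reproduce the same failure in spark-shell; the Event
type and field names are made up for illustration, not our actual code:

    import org.joda.time.LocalDate

    // Stand-in for our parsed protobuf messages.
    case class Event(timestamp: Long, userId: String)

    val events = sc.parallelize(Seq(Event(1378683857701L, "u1"),
                                    Event(1378683857702L, "u2")))

    // The task results contain LocalDate instances, so the driver must be
    // able to load org.joda.time.LocalDate when it deserializes them.
    val byDay = events.map(e => (new LocalDate(e.timestamp), Option(e.userId)))
    byDay.collect()   // fails with ClassNotFoundException unless the jar
                      // is also on the driver's classpath (SPARK_CLASSPATH)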
