We're running into the following error with standalone Spark clusters
<http://spark.apache.org/docs/1.5.2/spark-standalone.html>:

```
WARN  org.apache.spark.scheduler.TaskSetManager  - Lost task 0.0 in stage 0.0 (TID 0, 192.168.14.103): java.io.InvalidClassException: org.apache.spark.rdd.RDD; local class incompatible: stream classdesc serialVersionUID = -3343649307726848892, local class serialVersionUID = -3996494161745401652
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:64)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
```
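My read of the two numbers in the message: the "stream classdesc" UID is what the driver-side Spark (the one bundled with TinkerPop) computed for `org.apache.spark.rdd.RDD` when it serialized the task, and the "local class" UID is what the executor's installed Spark computes for the same class. Since `RDD` apparently relies on the default, JVM-computed serialVersionUID, the 1.5.1 and 1.5.2 builds end up with different values and the executor refuses the stream. A minimal diagnostic sketch (plain Java, not TinkerPop code; run it once against each set of Spark jars to see the two numbers):

```java
import java.io.ObjectStreamClass;

// Prints the serialVersionUID that the current classpath yields for Spark's RDD class.
// Running this against the Spark jars bundled with TinkerPop and again against the jars
// installed on the standalone cluster should reproduce the two mismatched numbers from
// the stack trace above.
public class CheckRddSerialVersionUid {
    public static void main(String[] args) throws Exception {
        Class<?> rddClass = Class.forName("org.apache.spark.rdd.RDD");
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(rddClass);
        if (descriptor == null) {
            System.out.println("org.apache.spark.rdd.RDD is not Serializable on this classpath");
        } else {
            System.out.println("serialVersionUID = " + descriptor.getSerialVersionUID());
        }
    }
}
```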

You can reproduce this error in two ways (a rough sketch of the job submission follows the list):
* Run a SparkGraphComputer from TinkerPop 3.1.0-incubating against a Spark
1.5.2 standalone cluster
* Run a SparkGraphComputer from TinkerPop 3.1.1-SNAPSHOT against a Spark
1.5.1 standalone cluster
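
For reference, this is roughly how the job gets submitted; the properties file path and the vertex program are placeholders for illustration, not the exact job we ran:

```java
import java.util.concurrent.Future;

import org.apache.tinkerpop.gremlin.process.computer.ComputerResult;
import org.apache.tinkerpop.gremlin.process.computer.ranking.pagerank.PageRankVertexProgram;
import org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;

public class StandaloneRepro {
    public static void main(String[] args) throws Exception {
        // The properties file configures HadoopGraph and points spark.master at the
        // standalone cluster, e.g. spark.master=spark://spark-master:7077 (placeholder host).
        Graph graph = GraphFactory.open("conf/hadoop-gryo.properties");

        // Any OLAP job will do; PageRank is used here purely as an example program.
        Future<ComputerResult> result = graph.compute(SparkGraphComputer.class)
                .program(PageRankVertexProgram.build().create(graph))
                .submit();

        // With mismatched Spark versions the executors fail with the
        // InvalidClassException above while deserializing the task.
        System.out.println(result.get().memory().getRuntime());
    }
}
```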

Only standalone Spark clusters are affected -- the Spark version on the cluster
must exactly match the version TinkerPop is built against, presumably because
standalone executors load Spark classes from the cluster's own installation
rather than from jars shipped with the job.

This commit
<https://github.com/apache/incubator-tinkerpop/commit/78b10569755070b088c460341bb473112dfe3ffe#diff-402e09222db9327564f28924e1b39d0c>
bumped the Spark dependency from 1.5.1 to 1.5.2. As Marko mentioned, it does
pass the unit tests, but those run with `spark.master=local`. I've also
verified that it works with `spark.master=yarn-client`.
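
If it helps to take TinkerPop out of the picture, a bare Spark job that forces task serialization can be pointed at each master URL in turn; a minimal sketch (master URL and host are placeholders):

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// A bare-bones Spark job that forces tasks (and the RDD lineage) to be serialized
// to executors. Point the master at "local[*]", "yarn-client", or the standalone URL
// (e.g. "spark://spark-master:7077", placeholder host) to compare behaviour while
// mixing Spark 1.5.1 client jars with a 1.5.2 cluster, or vice versa.
public class MasterSmokeTest {
    public static void main(String[] args) {
        String master = args.length > 0 ? args[0] : "local[*]";
        SparkConf conf = new SparkConf().setAppName("serialVersionUID-smoke-test").setMaster(master);
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            long count = sc.parallelize(Arrays.asList(1, 2, 3, 4))
                           .map(x -> x * 2)   // task closure is sent to the executors
                           .count();
            System.out.println("count = " + count);
        }
    }
}
```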

What is -- or rather, what should be -- the direction/policy for dependency
version upgrades in TinkerPop?

-- Jason
