They came back with https://issues.apache.org/jira/browse/SPARK-13084

RDD <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L74> is declared as Serializable, but it doesn't define a serialVersionUID, so the JVM computes one from the class's structure at load time, and two different Spark builds can end up with different values.
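A quick way to see the effect from the JVM side is the sketch below. The class names are just illustrative (not Spark's); the only point is that an undeclared UID is derived from the class shape, while a declared one is pinned:

```
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionUidCheck {

    // Relies on the JVM-computed serialVersionUID -- the same situation RDD is in.
    static class Implicit implements Serializable {
        int field;
    }

    // Declares the UID explicitly, so recompiled builds stay stream-compatible
    // as long as the actual fields still line up.
    static class Explicit implements Serializable {
        private static final long serialVersionUID = 1L;
        int field;
    }

    public static void main(String[] args) {
        // Implicit's value is derived from the class's structure and can change
        // between builds; Explicit's is always 1. Running the same lookup against
        // org.apache.spark.rdd.RDD.class with spark-core 1.5.1 vs 1.5.2 on the
        // classpath yields the two mismatched UIDs shown in the stack trace below.
        System.out.println(ObjectStreamClass.lookup(Implicit.class).getSerialVersionUID());
        System.out.println(ObjectStreamClass.lookup(Explicit.class).getSerialVersionUID());
    }
}
```

That lookup is effectively what ObjectInputStream does before throwing the InvalidClassException, which is presumably why declaring the UID (as SPARK-13084 asks) would let the two builds deserialize each other.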
In the meantime, it sounds like you have to match the compiled Spark version with the runtime. I saw a bunch of posts and a couple of JIRAs where they always came back to that as the solution. Wonder how exposed TinkerPop is with Serializable and serialVersionUIDs.

On Thu, Jan 28, 2016 at 4:10 PM, Jason Plurad <plur...@gmail.com> wrote:

> Yeah, I was surprised about the incompatibility. It seems contained to the
> standalone Spark server deployment only.
>
> You can reproduce the same stack trace with their Spark Pi example on
> standalone Spark servers (try to run Pi from 1.5.2 on a 1.5.1 standalone,
> or Pi 1.5.1 on a 1.5.2 standalone).
>
> yarn-client and local tested out fine.
>
> I'll post out on the Spark list and see what they come back with.
>
> On Thu, Jan 28, 2016 at 3:51 PM, Marko Rodriguez <okramma...@gmail.com> wrote:
>
>> Hello,
>>
>> This is odd. We are currently doing TinkerPop 3.1.1-SNAPSHOT + Spark 1.5.2
>> 2-billion edge benchmarking (against SparkServer) and all is good.
>>
>> Are you saying that Spark 1.5.1 and Spark 1.5.2 are incompatible? That's a
>> bummer.
>>
>> I don't think there is an "official policy," but I always bump minor
>> release versions with minor release versions. That is, I didn't bump to
>> Spark 1.6.0 (we will do that for TinkerPop 3.2.0), but since 1.5.1 is minor
>> to 1.5.2, I bumped. We have always done that -- e.g. Neo4j, Hadoop, various
>> Java libraries…
>>
>> Thoughts?
>> Marko.
>>
>> http://markorodriguez.com
>>
>> On Jan 28, 2016, at 1:48 PM, Jason Plurad <plur...@gmail.com> wrote:
>>
>> > We're running into this error with standalone Spark clusters
>> > <http://spark.apache.org/docs/1.5.2/spark-standalone.html>.
>> >
>> > ```
>> > WARN org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage
>> > 0.0 (TID 0, 192.168.14.103): java.io.InvalidClassException:
>> > org.apache.spark.rdd.RDD; local class incompatible: stream classdesc
>> > serialVersionUID = -3343649307726848892, local class serialVersionUID =
>> > -3996494161745401652
>> > at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:621)
>> > at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
>> > at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>> > at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
>> > at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>> > at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> > at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>> > at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>> > at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>> > at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>> > at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>> > at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:64)
>> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>> > at org.apache.spark.scheduler.Task.run(Task.scala:88)
>> > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> > at java.lang.Thread.run(Thread.java:745)
>> > ```
>> >
>> > You can reproduce this error in 2 ways:
>> > * Run a SparkGraphComputer from TinkerPop 3.1.0-incubating against a
>> >   Spark 1.5.2 standalone cluster
>> > * Run a SparkGraphComputer from TinkerPop 3.1.1-SNAPSHOT against a
>> >   Spark 1.5.1 standalone cluster
>> >
>> > Only the standalone Spark cluster deployment breaks -- the Spark cluster
>> > version must match exactly what TinkerPop is built against.
>> >
>> > This commit
>> > <https://github.com/apache/incubator-tinkerpop/commit/78b10569755070b088c460341bb473112dfe3ffe#diff-402e09222db9327564f28924e1b39d0c>
>> > bumped up the Spark version from 1.5.1 to 1.5.2. As Marko mentioned, it
>> > does pass the unit tests, but the unit tests are run with
>> > `spark.master=local`. I've tested that it also works with
>> > `spark.master=yarn-client` (a configuration sketch for the standalone
>> > case follows after this quoted thread).
>> >
>> > What is -- or rather, what should be -- the direction/policy for
>> > dependency version upgrades in TinkerPop?
>> >
>> > -- Jason
>>
>
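For anyone wanting to reproduce the SparkGraphComputer case above, here is a rough sketch of the kind of setup involved. The input location, the standalone master URL, and the choice of PageRank as the vertex program are illustrative placeholders, not taken from the original report:

```
import org.apache.commons.configuration.BaseConfiguration;
import org.apache.commons.configuration.Configuration;
import org.apache.tinkerpop.gremlin.process.computer.ranking.pagerank.PageRankVertexProgram;
import org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;

public class StandaloneRepro {
    public static void main(String[] args) throws Exception {
        Configuration conf = new BaseConfiguration();
        conf.setProperty("gremlin.graph",
                "org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph");
        conf.setProperty("gremlin.hadoop.graphInputFormat",
                "org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat");
        conf.setProperty("gremlin.hadoop.graphOutputFormat",
                "org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat");
        conf.setProperty("gremlin.hadoop.inputLocation", "data/tinkerpop-modern.kryo"); // placeholder input
        conf.setProperty("gremlin.hadoop.outputLocation", "output");

        // spark.* keys are handed through to the SparkConf. "local[4]" and
        // "yarn-client" work regardless; a standalone master URL is the case
        // where the cluster's Spark version must match the spark-core version
        // TinkerPop was built against.
        conf.setProperty("spark.master", "spark://spark-master:7077"); // placeholder master URL

        Graph graph = GraphFactory.open(conf);
        graph.compute(SparkGraphComputer.class)
                .program(PageRankVertexProgram.build().create(graph))
                .submit()
                .get();
    }
}
```

With `spark.master` set to `local[4]` or `yarn-client` the same sketch goes through fine; only the standalone `spark://...` URL trips the serialVersionUID check, because that is the one deployment where the executors deserialize task objects compiled against a different spark-core build.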