[ https://issues.apache.org/jira/browse/SPARK-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14059419#comment-14059419 ]
Ankur Dave commented on SPARK-2347:
-----------------------------------
VertexPartition is actually supposed to be Serializable; it was an oversight
not to mark it as such. A workaround is to use Kryo serialization instead of
Java serialization, and I'm submitting a fix as well.
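
As a rough sketch of the Kryo workaround, assuming a standalone application
launched with spark-submit (so the master URL is supplied externally): the
object name and application name below are hypothetical, and the registrator
class is the one the GraphX documentation for Spark 1.0 recommends, so check
it against the version you are running.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver program; only the two .set(...) calls matter here.
object KryoWorkaround {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("graphx-kryo-workaround") // hypothetical name
      // Switch from the default Java serializer to Kryo.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Register the GraphX classes with Kryo (registrator shipped with GraphX in 1.0).
      .set("spark.kryo.registrator", "org.apache.spark.graphx.GraphKryoRegistrator")

    val sc = new SparkContext(conf)
    // ... build the Graph here as in the report below ...
    sc.stop()
  }
}
```

The same two properties can also be set in conf/spark-defaults.conf instead
of in code.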
> Graph object can not be set to StorageLevel.MEMORY_ONLY_SER
> -----------------------------------------------------------
>
> Key: SPARK-2347
> URL: https://issues.apache.org/jira/browse/SPARK-2347
> Project: Spark
> Issue Type: Bug
> Components: GraphX
> Affects Versions: 1.0.0
> Environment: Spark standalone with 5 workers and 1 driver
> Reporter: Baoxu Shi
>
> I'm creating a Graph object with
> Graph(vertices, edges, null, StorageLevel.MEMORY_ONLY, StorageLevel.MEMORY_ONLY)
> but this throws a NotSerializableException on both the workers and the
> driver:
> 14/07/02 16:30:26 ERROR BlockManagerWorker: Exception handling buffer message
> java.io.NotSerializableException: org.apache.spark.graphx.impl.VertexPartition
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
> at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
> at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
> at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:42)
> at org.apache.spark.serializer.SerializationStream$class.writeAll(Serializer.scala:106)
> at org.apache.spark.serializer.JavaSerializationStream.writeAll(JavaSerializer.scala:30)
> at org.apache.spark.storage.BlockManager.dataSerializeStream(BlockManager.scala:988)
> at org.apache.spark.storage.BlockManager.dataSerialize(BlockManager.scala:997)
> at org.apache.spark.storage.MemoryStore.getBytes(MemoryStore.scala:102)
> at org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:392)
> at org.apache.spark.storage.BlockManager.getLocalBytes(BlockManager.scala:358)
> at org.apache.spark.storage.BlockManagerWorker.getBlock(BlockManagerWorker.scala:90)
> at org.apache.spark.storage.BlockManagerWorker.processBlockMessage(BlockManagerWorker.scala:69)
> at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
> at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> at org.apache.spark.storage.BlockMessageArray.foreach(BlockMessageArray.scala:28)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at org.apache.spark.storage.BlockMessageArray.map(BlockMessageArray.scala:28)
> at org.apache.spark.storage.BlockManagerWorker.onBlockMessageReceive(BlockManagerWorker.scala:44)
> at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
> at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
> at org.apache.spark.network.ConnectionManager.org$apache$spark$network$ConnectionManager$$handleMessage(ConnectionManager.scala:662)
> at org.apache.spark.network.ConnectionManager$$anon$9.run(ConnectionManager.scala:504)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Even when the driver does not throw this exception, it throws
> java.io.FileNotFoundException: /tmp/spark-local-20140702151845-9620/2a/shuffle_2_25_3 (No such file or directory)
> I understand that VertexPartition is not supposed to be serializable, so is
> there a workaround for this?