[
https://issues.apache.org/jira/browse/SPARK-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guoqiang Li updated SPARK-2520:
-------------------------------
Description:
This issue occurs with very low probability; I cannot reproduce it.
The executor log:
{code}
14/07/15 21:54:50 INFO spark.MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 0 to spark@sanshan:34429
14/07/15 21:54:50 INFO spark.MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 0 to spark@sanshan:31934
14/07/15 21:54:50 INFO spark.MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 0 to spark@sanshan:30557
14/07/15 21:54:50 INFO spark.MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 0 to spark@sanshan:42606
14/07/15 21:54:50 INFO spark.MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 0 to spark@sanshan:37314
14/07/15 21:54:50 INFO scheduler.TaskSetManager: Starting task 0.0:166 as TID 4948 on executor 20: tuan221 (PROCESS_LOCAL)
14/07/15 21:54:50 INFO scheduler.TaskSetManager: Serialized task 0.0:166 as 3129 bytes in 1 ms
14/07/15 21:54:50 WARN scheduler.TaskSetManager: Lost TID 4868 (task 0.0:86)
14/07/15 21:54:50 WARN scheduler.TaskSetManager: Loss was due to java.io.StreamCorruptedException
java.io.StreamCorruptedException: invalid type code: AC
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1377)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
    at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:125)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:30)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
    at org.apache.spark.Aggregator.combineCombinersByKey(Aggregator.scala:87)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$3.apply(PairRDDFunctions.scala:101)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$3.apply(PairRDDFunctions.scala:100)
    at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
    at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
    at org.apache.spark.scheduler.Task.run(Task.scala:51)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
14/07/15 21:54:50 INFO scheduler.TaskSetManager: Starting task 0.0:86 as TID 4949 on executor 20: tuan221 (PROCESS_LOCAL)
14/07/15 21:54:50 INFO scheduler.TaskSetManager: Serialized task 0.0:86 as 3129 bytes in 0 ms
14/07/15 21:54:50 WARN scheduler.TaskSetManager: Lost TID 4785 (task 0.0:3)
{code}
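For context (an editorial note, not part of the original report): "invalid type code: AC" from ObjectInputStream typically means the reader encountered a stray serialization-stream header (the magic bytes 0xAC 0xED) in the middle of the byte stream, for example because a second ObjectOutputStream wrote its header to the same underlying stream, or because shuffle bytes were concatenated or corrupted in transit. A minimal standalone Java sketch of that failure mode (not the Spark code path itself):

```java
import java.io.*;

public class StreamCorruption {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();

        // The first writer emits the stream header (0xAC 0xED) plus one object.
        ObjectOutputStream out1 = new ObjectOutputStream(buf);
        out1.writeObject("first");
        out1.flush();

        // A second ObjectOutputStream over the same underlying stream writes
        // another header in the middle of the byte stream.
        ObjectOutputStream out2 = new ObjectOutputStream(buf);
        out2.writeObject("second");
        out2.flush();

        ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(in.readObject());        // "first" deserializes fine
        try {
            in.readObject();                        // hits the stray header byte 0xAC
        } catch (StreamCorruptedException e) {
            System.out.println(e.getMessage());     // "invalid type code: AC"
        }
    }
}
```

The same symptom appears whenever a single deserialization stream reads bytes produced by more than one serialization stream, which is consistent with this being a rare, data-dependent corruption rather than a deterministic bug in the task code.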
was:
This issue occurs with very low probability; I cannot reproduce it.
The executor log:
{code}
java.io.StreamCorruptedException: invalid type code: AC
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1377)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
    at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:125)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:30)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
    at org.apache.spark.Aggregator.combineCombinersByKey(Aggregator.scala:87)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$3.apply(PairRDDFunctions.scala:101)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$3.apply(PairRDDFunctions.scala:100)
    at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
    at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
    at org.apache.spark.scheduler.Task.run(Task.scala:51)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
{code}
> The executor throws java.io.StreamCorruptedException
> -----------------------------------------------------
>
> Key: SPARK-2520
> URL: https://issues.apache.org/jira/browse/SPARK-2520
> Project: Spark
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: Guoqiang Li
> Priority: Critical
>
--
This message was sent by Atlassian JIRA
(v6.2#6252)