Hi,

I have a weird issue: a Spark Streaming application fails once or twice a day
with java.io.OptionalDataException. It happens while deserializing a task.
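
For what it's worth, the textbook way to hit OptionalDataException is a
mismatched custom writeObject/readObject pair, where readObject asks for an
object while unread primitive data is still sitting in the stream. Here is a
minimal standalone sketch that reproduces it (class names are made up and
nothing here is Spark-specific, it just shows the JDK failure mode):

import java.io.*;

// Hypothetical class whose writeObject and readObject disagree.
class Payload implements Serializable {
    private static final long serialVersionUID = 1L;
    private int count = 1;

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(42); // leaves extra primitive data in the stream
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        in.readObject(); // asks for an object, finds primitive block data
                         // -> java.io.OptionalDataException
    }
}

public class Repro {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new Payload());
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            ois.readObject(); // throws OptionalDataException here
        }
    }
}

That is just the generic JDK failure mode, of course; whether Spark's task
deserialization is hitting the same pattern in my case, I can't tell.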

The problem appeared after migrating from Spark 1.6 to Spark 2.2; the cluster
runs the latest CDH distribution in YARN mode.

I have no clue what to blame or where to look for the problem. Maybe you
have some ideas?

Overall the application runs stably, but once a day it fails with this exception.

The trace:
java.io.OptionalDataException
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1371)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
  .......
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:85)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Full trace:
https://gist.github.com/anonymous/3f62764fb1f438f1bd6e397017d092c0