Re: Driver not able to restart the job automatically after the application of Streaming with Kafka Direct went down
Following is the error that I see when it retries. org.apache.spark.SparkException: Failed to read checkpoint from directory /share/checkpointDir at org.apache.spark.streaming.CheckpointReader$.read(Checkpoint.scala:342) at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:866) at com.walmart.platform.exp.reporting.streaming.ExpoStreamingJob$.main(ExpoStreamingJob.scala:35) at com.walmart.platform.exp.reporting.streaming.ExpoStreamingJob.main(ExpoStreamingJob.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58) at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala) Caused by: java.io.IOException: java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.streaming.dstream.TransformedDStream.parents of type scala.collection.Seq in instance of org.apache.spark.streaming.dstream.TransformedDStream at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163) at org.apache.spark.streaming.DStreamGraph.readObject(DStreamGraph.scala:188) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) at org.apache.spark.streaming.Checkpoint$$anonfun$deserialize$2.apply(Checkpoint.scala:151) at org.apache.spark.streaming.Checkpoint$$anonfun$deserialize$2.apply(Checkpoint.scala:141) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1206) at org.apache.spark.streaming.Checkpoint$.deserialize(Checkpoint.scala:154) at org.apache.spark.streaming.CheckpointReader$$anonfun$read$2.apply(Checkpoint.scala:329) at org.apache.spark.streaming.CheckpointReader$$anonfun$read$2.apply(Checkpoint.scala:325) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35) at org.apache.spark.streaming.CheckpointReader$.read(Checkpoint.scala:325) ... 9 more Caused by: java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.streaming.dstream.TransformedDStream.parents of type scala.collection.Seq in instance of org.apache.spark.streaming.dstream.TransformedDStream at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2089) at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1261) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2006) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900) at
Driver not able to restart the job automatically after the application of Streaming with Kafka Direct went down
Hi, I have the Streaming job running in qa/prod. Due to Kafka issues both the jobs went down. After the Kafka issues got resolved and after the deletion of the checkpoint directory the driver in the qa job restarted the job automatically and the application UI was up. But, in the prod job, the driver did not restart the application. Any idea as to why the prod driver not able to restart the job with everything being same in qa/prod including the --supervise option? Thanks, Swetha -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Driver-not-able-to-restart-the-job-automatically-after-the-application-of-Streaming-with-Kafka-Direcn-tp26155.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org