Re: Spark Streaming and SQL checkpoint error: (java.io.NotSerializableException: org.apache.hadoop.hive.conf.HiveConf)
You probably want to mark the HiveContext as @transient, as it's not valid to use it on the slaves anyway.

On Mon, Feb 16, 2015 at 1:58 AM, Haopu Wang <hw...@qilinsoft.com> wrote:

> I have a streaming application which registers a temp table on a
> HiveContext for each batch duration. The application runs well in Spark
> 1.1.0, but I get the error below from 1.1.1. Do you have any suggestions
> to resolve it? Thank you!
>
> [...]
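For reference, a minimal sketch of one way to apply that suggestion: hold the HiveContext in a lazily-initialized singleton with a @transient field, so it is never captured in the checkpointed DStream graph. The names HiveContextHolder and Record, the socket source, and the checkpoint path are all illustrative, not from this thread.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    case class Record(line: String)

    // Lazily holds one HiveContext on the driver. The @transient field is
    // skipped during Java serialization, so checkpointing the DStream graph
    // no longer tries to write out a HiveConf.
    object HiveContextHolder {
      @transient private var instance: HiveContext = _
      def get(sc: SparkContext): HiveContext = synchronized {
        if (instance == null) instance = new HiveContext(sc)
        instance
      }
    }

    object StreamingTempTableExample {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("StreamingTempTableExample")
        val ssc = new StreamingContext(conf, Seconds(10))
        ssc.checkpoint("hdfs:///tmp/checkpoint") // illustrative path

        val lines = ssc.socketTextStream("localhost", 9999)
        lines.foreachRDD { rdd =>
          // Fetch the context per batch instead of capturing it in the closure.
          val hiveContext = HiveContextHolder.get(rdd.sparkContext)
          import hiveContext.createSchemaRDD // implicit RDD[Product] -> SchemaRDD
          rdd.map(Record(_)).registerTempTable("current_batch")
          hiveContext.sql("SELECT COUNT(*) FROM current_batch").collect()
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }

Because the context is fetched lazily inside foreachRDD, the checkpointed closure carries only a static reference to the singleton, not the non-serializable HiveConf itself.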
Spark Streaming and SQL checkpoint error: (java.io.NotSerializableException: org.apache.hadoop.hive.conf.HiveConf)
I have a streaming application which registers a temp table on a HiveContext for each batch duration. The application runs well in Spark 1.1.0, but I get the error below from 1.1.1. Do you have any suggestions to resolve it? Thank you!

java.io.NotSerializableException: org.apache.hadoop.hive.conf.HiveConf
    - field (class scala.Tuple2, name: _1, type: class java.lang.Object)
    - object (class scala.Tuple2, (Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@2158ce23, org.apache.hadoop.hive.ql.session.SessionState@49b6eef9))
    - field (class org.apache.spark.sql.hive.HiveContext, name: x$3, type: class scala.Tuple2)
    - object (class org.apache.spark.sql.hive.HiveContext, org.apache.spark.sql.hive.HiveContext@4e6e66a4)
    - field (class com.vitria.spark.streaming.api.scala.BaseQueryableDStream$$anonfun$registerTempTable$2, name: sqlContext$1, type: class org.apache.spark.sql.SQLContext)
    - object (class com.vitria.spark.streaming.api.scala.BaseQueryableDStream$$anonfun$registerTempTable$2, <function1>)
    - field (class org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1, name: foreachFunc$1, type: interface scala.Function1)
    - object (class org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1, <function2>)
    - field (class org.apache.spark.streaming.dstream.ForEachDStream, name: org$apache$spark$streaming$dstream$ForEachDStream$$foreachFunc, type: interface scala.Function2)
    - object (class org.apache.spark.streaming.dstream.ForEachDStream, org.apache.spark.streaming.dstream.ForEachDStream@5ccbdc20)
    - element of array (index: 0)
    - array (class [Ljava.lang.Object;, size: 16)
    - field (class scala.collection.mutable.ArrayBuffer, name: array, type: class [Ljava.lang.Object;)
    - object (class scala.collection.mutable.ArrayBuffer, ArrayBuffer(org.apache.spark.streaming.dstream.ForEachDStream@5ccbdc20))
    - field (class org.apache.spark.streaming.DStreamGraph, name: outputStreams, type: class scala.collection.mutable.ArrayBuffer)
    - custom writeObject data (class org.apache.spark.streaming.DStreamGraph)
    - object (class org.apache.spark.streaming.DStreamGraph, org.apache.spark.streaming.DStreamGraph@776ae7da)
    - field (class org.apache.spark.streaming.Checkpoint, name: graph, type: class org.apache.spark.streaming.DStreamGraph)
    - root object (class org.apache.spark.streaming.Checkpoint, org.apache.spark.streaming.Checkpoint@5eade065)
    at java.io.ObjectOutputStream.writeObject0(Unknown Source)
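For context, a hedged sketch of the kind of setup the trace points at, where a foreachRDD closure captures the HiveContext and checkpointing then has to serialize it as part of the DStream graph. The names are illustrative, not the poster's actual code.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    case class Record(line: String)

    object FailingCheckpointExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("FailingCheckpointExample"))
        val ssc = new StreamingContext(sc, Seconds(10))
        // Enabling checkpointing forces the DStream graph, and every closure
        // it holds, through Java serialization.
        ssc.checkpoint("hdfs:///tmp/checkpoint") // illustrative path

        val hiveContext = new HiveContext(sc)
        val lines = ssc.socketTextStream("localhost", 9999)
        lines.foreachRDD { rdd =>
          import hiveContext.createSchemaRDD
          // This closure captures hiveContext, so its HiveConf ends up inside
          // the Checkpoint object and fails to serialize as in the trace above.
          rdd.map(Record(_)).registerTempTable("current_batch")
        }
        ssc.start()
        ssc.awaitTermination()
      }
    }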