Re: Spark Streaming and SQL checkpoint error: (java.io.NotSerializableException: org.apache.hadoop.hive.conf.HiveConf)

2015-02-16 Thread Michael Armbrust
You probably want to mark the HiveContext as @transient, since it's not valid
to use it on the slaves anyway.
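
A minimal sketch of that suggestion (the class and field names here are
hypothetical, not from this thread): hold the HiveContext in a @transient
lazy val so Java serialization skips it when the DStream graph is
checkpointed, and it is re-created on first use after deserialization.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

// Hypothetical driver-side helper; only the pattern matters.
class StreamingJob(@transient val sc: SparkContext) extends Serializable {
  // @transient: the field is excluded from serialization, so the
  // non-serializable HiveConf held inside HiveContext is never written
  // into the streaming checkpoint.
  // lazy val: re-initialized on first access after deserialization --
  // which should only ever happen on the driver, since a HiveContext
  // is not valid on the slaves anyway.
  @transient lazy val hiveContext: HiveContext = new HiveContext(sc)
}
```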

On Mon, Feb 16, 2015 at 1:58 AM, Haopu Wang hw...@qilinsoft.com wrote:

  I have a streaming application which registers a temp table on a
 HiveContext for each batch duration.

 The application runs well on Spark 1.1.0, but I get the error below
 starting with 1.1.1.

 Do you have any suggestions to resolve it? Thank you!



 java.io.NotSerializableException: org.apache.hadoop.hive.conf.HiveConf

 - field (class scala.Tuple2, name: _1, type: class java.lang.Object)

 - object (class scala.Tuple2, (Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@2158ce23, org.apache.hadoop.hive.ql.session.SessionState@49b6eef9))

 - field (class org.apache.spark.sql.hive.HiveContext, name: x$3, type: class scala.Tuple2)

 - object (class org.apache.spark.sql.hive.HiveContext, org.apache.spark.sql.hive.HiveContext@4e6e66a4)

 - field (class com.vitria.spark.streaming.api.scala.BaseQueryableDStream$$anonfun$registerTempTable$2, name: sqlContext$1, type: class org.apache.spark.sql.SQLContext)

 - object (class com.vitria.spark.streaming.api.scala.BaseQueryableDStream$$anonfun$registerTempTable$2, function1)

 - field (class org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1, name: foreachFunc$1, type: interface scala.Function1)

 - object (class org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1, function2)

 - field (class org.apache.spark.streaming.dstream.ForEachDStream, name: org$apache$spark$streaming$dstream$ForEachDStream$$foreachFunc, type: interface scala.Function2)

 - object (class org.apache.spark.streaming.dstream.ForEachDStream, org.apache.spark.streaming.dstream.ForEachDStream@5ccbdc20)

 - element of array (index: 0)

 - array (class [Ljava.lang.Object;, size: 16)

 - field (class scala.collection.mutable.ArrayBuffer, name: array, type: class [Ljava.lang.Object;)

 - object (class scala.collection.mutable.ArrayBuffer, ArrayBuffer(org.apache.spark.streaming.dstream.ForEachDStream@5ccbdc20))

 - field (class org.apache.spark.streaming.DStreamGraph, name: outputStreams, type: class scala.collection.mutable.ArrayBuffer)

 - custom writeObject data (class org.apache.spark.streaming.DStreamGraph)

 - object (class org.apache.spark.streaming.DStreamGraph, org.apache.spark.streaming.DStreamGraph@776ae7da)

 - field (class org.apache.spark.streaming.Checkpoint, name: graph, type: class org.apache.spark.streaming.DStreamGraph)

 - root object (class org.apache.spark.streaming.Checkpoint, org.apache.spark.streaming.Checkpoint@5eade065)

 at java.io.ObjectOutputStream.writeObject0(Unknown Source)




