Can you register Put with Kryo?
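
Something like this (a minimal, untested sketch; a custom KryoRegistrator via spark.kryo.registrator works too):

    import org.apache.hadoop.hbase.client.Put
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.spark.SparkConf

    // Register the classes carried in the RDD so Kryo serializes them
    // with known IDs instead of writing full class names.
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[Put], classOf[ImmutableBytesWritable]))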

Thanks

> On May 29, 2016, at 4:58 PM, Nirav Patel <npa...@xactlycorp.com> wrote:
> 
> I pasted a code snippet for that method.
> 
> here's full def:
> 
>   def writeRddToHBase2(hbaseRdd: RDD[(ImmutableBytesWritable, Put)], tableName: String) {
> 
>     hbaseRdd.values.foreachPartition { itr =>
>         val hConf = HBaseConfiguration.create()
>         hConf.setInt("hbase.client.write.buffer", 16097152)
>         val table = new HTable(hConf, tableName)
>         //table.setWriteBufferSize(8388608)
>         itr.grouped(100).foreach(table.put(_))   // << Exception happens at this point
>         table.close()
>     }
>   }
> 
> 
> 
> I am using HBase 0.98.12 (MapR distribution).
> 
> 
> 
> Thanks
> 
> Nirav
> 
> 
>> On Sun, May 29, 2016 at 4:46 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> bq.  at com.mycorpt.myprojjobs.spark.jobs.hbase.HbaseUtils$$anonfun$writeRddToHBase2$1.apply(HbaseUtils.scala:80)
>> 
>> Can you reveal the related code from HbaseUtils.scala?
>> 
>> Which HBase version are you using?
>> 
>> Thanks
>> 
>>> On Sun, May 29, 2016 at 4:26 PM, Nirav Patel <npa...@xactlycorp.com> wrote:
>>> Hi,
>>> 
>>> I am getting the following Kryo deserialization error when trying to
>>> bulk-load a cached RDD into HBase. It works if I don't cache the RDD. I
>>> cache it with MEMORY_ONLY_SER.
>>> 
>>> Here's the code snippet:
>>> 
>>> hbaseRdd.values.foreachPartition { itr =>
>>>     val hConf = HBaseConfiguration.create()
>>>     hConf.setInt("hbase.client.write.buffer", 16097152)
>>>     val table = new HTable(hConf, tableName)
>>>     itr.grouped(100).foreach(table.put(_))
>>>     table.close()
>>> }
>>> 
>>> hbaseRdd is of type RDD[(ImmutableBytesWritable, Put)].
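>>> 
>>> For context, here's a simplified sketch of how the RDD gets built and
>>> cached upstream (the real construction code is more involved; the row
>>> key and column names below are just placeholders):
>>> 
>>>     import org.apache.hadoop.hbase.client.Put
>>>     import org.apache.hadoop.hbase.io.ImmutableBytesWritable
>>>     import org.apache.hadoop.hbase.util.Bytes
>>>     import org.apache.spark.storage.StorageLevel
>>>     import org.apache.spark.{SparkConf, SparkContext}
>>> 
>>>     val sc = new SparkContext(new SparkConf().setAppName("hbase-bulk-put"))
>>> 
>>>     // Stand-in for the real RDD construction:
>>>     val hbaseRdd = sc.parallelize(1 to 1000).map { i =>
>>>       val rowKey = Bytes.toBytes(s"row-$i")
>>>       val put = new Put(rowKey)
>>>       put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i))
>>>       (new ImmutableBytesWritable(rowKey), put)
>>>     }
>>> 
>>>     // Dropping this persist makes the job succeed; with it, Kryo fails
>>>     // when the cached partitions are read back.
>>>     hbaseRdd.persist(StorageLevel.MEMORY_ONLY_SER)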
>>> 
>>> 
>>> Here's the exception I am getting. I read on a Kryo JIRA that this may be
>>> an issue with incorrect use of the serialization library. So could this be
>>> an issue with the twitter-chill library or Spark core itself?
>>> 
>>> Job aborted due to stage failure: Task 16 in stage 9.0 failed 10 times, most recent failure: Lost task 16.9 in stage 9.0 (TID 28614, hdn10.mycorptcorporation.local): com.esotericsoftware.kryo.KryoException: java.lang.IndexOutOfBoundsException: Index: 100, Size: 6
>>> Serialization trace:
>>> familyMap (org.apache.hadoop.hbase.client.Put)
>>>     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:626)
>>>     at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
>>>     at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>>     at com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:42)
>>>     at com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:33)
>>>     at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>>     at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:192)
>>>     at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:181)
>>>     at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>>>     at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>>     at scala.collection.Iterator$GroupedIterator.fill(Iterator.scala:966)
>>>     at scala.collection.Iterator$GroupedIterator.hasNext(Iterator.scala:972)
>>>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>>     at com.mycorpt.myprojjobs.spark.jobs.hbase.HbaseUtils$$anonfun$writeRddToHBase2$1.apply(HbaseUtils.scala:80)
>>>     at com.mycorpt.myprojjobs.spark.jobs.hbase.HbaseUtils$$anonfun$writeRddToHBase2$1.apply(HbaseUtils.scala:75)
>>>     at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:902)
>>>     at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:902)
>>>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
>>>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
>>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>>     at org.apache.spark.scheduler.Task.run(Task.scala:88)
>>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:744)
>>> Caused by: java.lang.IndexOutOfBoundsException: Index: 100, Size: 6
>>>     at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>>>     at java.util.ArrayList.get(ArrayList.java:411)
>>>     at com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
>>>     at com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:773)
>>>     at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:727)
>>>     at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:134)
>>>     at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
>>>     at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
>>>     at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
>>>     ... 26 more
