[ https://issues.apache.org/jira/browse/SPARK-11016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951772#comment-14951772 ]
Sean Owen commented on SPARK-11016:
-----------------------------------
Yes, I get that RoaringBitmap has its own serialization mechanism that Kryo has
to be taught to use. I think the answer to my question is: yes, Spark still
uses RoaringBitmap, so it has to make sure Kryo knows how to serialize it,
including registering serializers. So yes, you're doing the right thing.
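For reference, a minimal sketch of what that registration looks like on the job side. The registrator name is made up, and the RoaringBitmapSerializer it plugs in is only sketched further down in this comment; the SparkConf keys and KryoRegistrator trait are the real Spark hooks:
{code}
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator
import org.roaringbitmap.RoaringBitmap

// Hypothetical registrator: plugs a custom Serializer[RoaringBitmap]
// (sketched at the end of this comment) into Kryo.
class RoaringRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[RoaringBitmap], new RoaringBitmapSerializer)
  }
}

// Wiring it into a job through SparkConf:
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", classOf[RoaringRegistrator].getName)
{code}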
These classes implement Externalizable but not Serializable; Kryo's
JavaSerializer could otherwise be registered to delegate to the correct,
custom Java serialization these classes define, but that path relies on
Serializable. I wonder if we could build a KryoJavaExternalizableSerializer
that does something similar automatically? That would be tidy. Then Spark
would just need to register the RoaringBitmap classes to use it.
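A rough sketch of what such a serializer could look like, against the Kryo 2.x Serializer API; the class name comes from the suggestion above and is not an existing Kryo class, and it assumes the target class has the public no-arg constructor Externalizable requires:
{code}
import java.io.{Externalizable, ObjectInputStream, ObjectOutputStream}

import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}

// Generic bridge: delegates to the class's own writeExternal/readExternal
// instead of going through default Java serialization.
class KryoJavaExternalizableSerializer[T <: Externalizable] extends Serializer[T] {

  override def write(kryo: Kryo, output: Output, obj: T): Unit = {
    // Kryo's Output is an OutputStream, so it can back an ObjectOutputStream,
    // which is the ObjectOutput that writeExternal expects.
    val out = new ObjectOutputStream(output)
    obj.writeExternal(out)
    out.flush()
  }

  override def read(kryo: Kryo, input: Input, clazz: Class[T]): T = {
    // Externalizable requires a public no-arg constructor.
    val obj = clazz.newInstance()
    obj.readExternal(new ObjectInputStream(input))
    obj
  }
}
{code}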
Otherwise, if it has to be bridged by hand, that also seems possible:
DataOutputStream is a DataOutput that can wrap an OutputStream, and Kryo's
Output gives you an OutputStream.
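A sketch of that hand-written bridge, assuming RoaringBitmap's own serialize(DataOutput)/deserialize(DataInput) pair; the serializer class name is made up:
{code}
import java.io.{DataInputStream, DataOutputStream}

import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.roaringbitmap.RoaringBitmap

// Hand-written bridge: DataOutputStream/DataInputStream wrap Kryo's
// Output/Input (which are an OutputStream/InputStream), so the bitmap's
// own serialize/deserialize methods can be used unchanged.
class RoaringBitmapSerializer extends Serializer[RoaringBitmap] {

  override def write(kryo: Kryo, output: Output, bitmap: RoaringBitmap): Unit = {
    val dos = new DataOutputStream(output)
    bitmap.serialize(dos)
    dos.flush()
  }

  override def read(kryo: Kryo, input: Input, clazz: Class[RoaringBitmap]): RoaringBitmap = {
    val bitmap = new RoaringBitmap()
    bitmap.deserialize(new DataInputStream(input))
    bitmap
  }
}
{code}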
> Spark fails when running with a task that requires a more recent version of RoaringBitmaps
> ------------------------------------------------------------------------------------------
>
> Key: SPARK-11016
> URL: https://issues.apache.org/jira/browse/SPARK-11016
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.4.0
> Reporter: Charles Allen
>
> The following error appears during Kryo initialization whenever a job requires a more recent version (> 0.5.0) of RoaringBitmap, because org/roaringbitmap/RoaringArray$Element was removed in 0.5.0:
> {code}
> A needed class was not found. This could be due to an error in your runpath.
> Missing class: org/roaringbitmap/RoaringArray$Element
> java.lang.NoClassDefFoundError: org/roaringbitmap/RoaringArray$Element
> at org.apache.spark.serializer.KryoSerializer$.<init>(KryoSerializer.scala:338)
> at org.apache.spark.serializer.KryoSerializer$.<clinit>(KryoSerializer.scala)
> at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:93)
> at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:237)
> at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:222)
> at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:138)
> at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:201)
> at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102)
> at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85)
> at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
> at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
> at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1318)
> at org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply(SparkContext.scala:1006)
> at org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply(SparkContext.scala:1003)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
> at org.apache.spark.SparkContext.withScope(SparkContext.scala:700)
> at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:1003)
> at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:818)
> at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:816)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
> at org.apache.spark.SparkContext.withScope(SparkContext.scala:700)
> at org.apache.spark.SparkContext.textFile(SparkContext.scala:816)
> {code}
> See https://issues.apache.org/jira/browse/SPARK-5949 for related info