[ 
https://issues.apache.org/jira/browse/SPARK-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949518#comment-14949518
 ] 

Charles Allen commented on SPARK-5949:
--------------------------------------

This breaks when a more recent version of RoaringBitmap is on the classpath, 
because org.roaringbitmap.RoaringArray$Element is no longer present in it. The 
following stack trace appears:

{code}
A needed class was not found. This could be due to an error in your runpath. Missing class: org/roaringbitmap/RoaringArray$Element
java.lang.NoClassDefFoundError: org/roaringbitmap/RoaringArray$Element
        at org.apache.spark.serializer.KryoSerializer$.<init>(KryoSerializer.scala:338)
        at org.apache.spark.serializer.KryoSerializer$.<clinit>(KryoSerializer.scala)
        at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:93)
        at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:237)
        at org.apache.spark.serializer.KryoSerializerInstance.<init>(KryoSerializer.scala:222)
        at org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:138)
        at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:201)
        at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102)
        at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85)
        at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
        at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
        at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1318)
        at org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply(SparkContext.scala:1006)
        at org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply(SparkContext.scala:1003)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
        at org.apache.spark.SparkContext.withScope(SparkContext.scala:700)
        at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:1003)
        at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:818)
        at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:816)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
        at org.apache.spark.SparkContext.withScope(SparkContext.scala:700)
        at org.apache.spark.SparkContext.textFile(SparkContext.scala:816)
        at io.druid.indexer.spark.SparkDruidIndexer$$anonfun$2.apply(SparkDruidIndexer.scala:84)
        at io.druid.indexer.spark.SparkDruidIndexer$$anonfun$2.apply(SparkDruidIndexer.scala:84)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.AbstractTraversable.map(Traversable.scala:105)
        at io.druid.indexer.spark.SparkDruidIndexer$.loadData(SparkDruidIndexer.scala:84)
        at io.druid.indexer.spark.TestSparkDruidIndexer$$anonfun$1.apply$mcV$sp(TestSparkDruidIndexer.scala:131)
        at io.druid.indexer.spark.TestSparkDruidIndexer$$anonfun$1.apply(TestSparkDruidIndexer.scala:40)
        at io.druid.indexer.spark.TestSparkDruidIndexer$$anonfun$1.apply(TestSparkDruidIndexer.scala:40)
        at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
        at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
        at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
        at org.scalatest.Transformer.apply(Transformer.scala:22)
        at org.scalatest.Transformer.apply(Transformer.scala:20)
        at org.scalatest.FlatSpecLike$$anon$1.apply(FlatSpecLike.scala:1647)
        at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
        at org.scalatest.FlatSpec.withFixture(FlatSpec.scala:1683)
        at org.scalatest.FlatSpecLike$class.invokeWithFixture$1(FlatSpecLike.scala:1644)
        at org.scalatest.FlatSpecLike$$anonfun$runTest$1.apply(FlatSpecLike.scala:1656)
        at org.scalatest.FlatSpecLike$$anonfun$runTest$1.apply(FlatSpecLike.scala:1656)
        at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
        at org.scalatest.FlatSpecLike$class.runTest(FlatSpecLike.scala:1656)
        at org.scalatest.FlatSpec.runTest(FlatSpec.scala:1683)
        at org.scalatest.FlatSpecLike$$anonfun$runTests$1.apply(FlatSpecLike.scala:1714)
        at org.scalatest.FlatSpecLike$$anonfun$runTests$1.apply(FlatSpecLike.scala:1714)
        at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
        at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
        at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:390)
        at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:427)
        at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
        at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
        at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
        at org.scalatest.FlatSpecLike$class.runTests(FlatSpecLike.scala:1714)
        at org.scalatest.FlatSpec.runTests(FlatSpec.scala:1683)
        at org.scalatest.Suite$class.run(Suite.scala:1424)
        at org.scalatest.FlatSpec.org$scalatest$FlatSpecLike$$super$run(FlatSpec.scala:1683)
        at org.scalatest.FlatSpecLike$$anonfun$run$1.apply(FlatSpecLike.scala:1760)
        at org.scalatest.FlatSpecLike$$anonfun$run$1.apply(FlatSpecLike.scala:1760)
        at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
        at org.scalatest.FlatSpecLike$class.run(FlatSpecLike.scala:1760)
        at org.scalatest.FlatSpec.run(FlatSpec.scala:1683)
        at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:55)
        at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$3.apply(Runner.scala:2563)
        at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$3.apply(Runner.scala:2557)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:2557)
        at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1044)
        at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1043)
        at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:2722)
        at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1043)
        at org.scalatest.tools.Runner$.run(Runner.scala:883)
        at org.scalatest.tools.Runner.run(Runner.scala)
        at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:138)
        at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:28)
Caused by: java.lang.ClassNotFoundException: org.roaringbitmap.RoaringArray$Element
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 84 more
{code}
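
The trace above fails because a class is referenced that the RoaringBitmap version on the classpath does not contain. As a rough illustration only (not necessarily how Spark does or should handle it), a registration list could be resolved by name so that missing classes are skipped instead of aborting initialization. The class name `OptionalClassLookup` and the method `resolveAvailable` are hypothetical, and the actual Kryo registration step is left out because it would need the Kryo dependency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark or RoaringBitmap.
public class OptionalClassLookup {

    /** Resolves each name to a Class, skipping names that cannot be loaded. */
    public static List<Class<?>> resolveAvailable(List<String> classNames) {
        List<Class<?>> found = new ArrayList<>();
        ClassLoader loader = OptionalClassLookup.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: locate the class without running its static initializers.
                found.add(Class.forName(name, false, loader));
            } catch (ClassNotFoundException | LinkageError e) {
                // The class is absent in this library version, so it is skipped.
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // RoaringArray$Element is only resolved if the RoaringBitmap version
        // on the classpath still ships it; otherwise it is silently dropped.
        List<Class<?>> classes = resolveAvailable(Arrays.asList(
                "java.lang.String",
                "org.roaringbitmap.RoaringArray$Element"));
        System.out.println(classes);
    }
}
```

Each class returned by such a lookup could then be passed to whatever registration call the serializer exposes.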

> Driver program has to register roaring bitmap classes used by spark with Kryo 
> when number of partitions is greater than 2000
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-5949
>                 URL: https://issues.apache.org/jira/browse/SPARK-5949
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Peter Torok
>            Assignee: Imran Rashid
>              Labels: kryo, partitioning, serialization
>             Fix For: 1.4.0
>
>
> When more than 2000 partitions are used with Kryo, the driver program has to 
> register the following classes:
> - org.apache.spark.scheduler.HighlyCompressedMapStatus
> - org.roaringbitmap.RoaringBitmap
> - org.roaringbitmap.RoaringArray
> - org.roaringbitmap.ArrayContainer
> - org.roaringbitmap.RoaringArray$Element
> - org.roaringbitmap.RoaringArray$Element[]
> - short[]
> Our project doesn't have a dependency on RoaringBitmap, and 
> HighlyCompressedMapStatus is intended for internal Spark usage. Spark should 
> take care of this registration when Kryo is used.


