[ https://issues.apache.org/jira/browse/SPARK-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949819#comment-14949819 ]
Daniel Lemire commented on SPARK-5949:
--------------------------------------
@drcallen Thanks for pinging me.
We will be glad to help address any issue you have.
The Element class was removed entirely with the release of Roaring version
0.5.0 as a way to save memory (each Element instance used 24 bytes).
https://github.com/lemire/RoaringBitmap/issues/31
This change did not affect the public API. (Element was "protected".) For
obvious reasons, I recommend against introducing dependencies on
non-public classes or functions.
The recommended way to persist a RoaringBitmap (or a MutableRoaringBitmap) is
in this manner:
// r is my bitmap
// If needed, to improve compression (new with version 0.5.x):
r.runOptimize();
// out is any java.io.DataOutput, e.g. a DataOutputStream
r.serialize(out);
Then one can deserialize it, or just map it (using the ImmutableRoaringBitmap
class). We are committed to preserving the data format produced by "serialize"
across versions.
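
For illustration, here is a minimal sketch of the round trip described above, assuming Roaring 0.5.x or later on the classpath. The class name RoaringPersistSketch and its helper methods are hypothetical names chosen for this example; the byte array stands in for whatever storage you actually use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

import org.roaringbitmap.RoaringBitmap;
import org.roaringbitmap.buffer.ImmutableRoaringBitmap;

// Hypothetical helper class, not part of the Roaring library.
public class RoaringPersistSketch {

    // Writes the bitmap with serialize(DataOutput), the format kept stable across versions.
    static byte[] toBytes(RoaringBitmap r) throws IOException {
        r.runOptimize(); // optional: may improve compression (0.5.x and later)
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(r.serializedSizeInBytes());
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            r.serialize(out);
        }
        return bytes.toByteArray();
    }

    // Reads the bytes back into a fresh heap-allocated bitmap.
    static RoaringBitmap fromBytes(byte[] data) throws IOException {
        RoaringBitmap r = new RoaringBitmap();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            r.deserialize(in);
        }
        return r;
    }

    // Maps the serialized bytes without copying them into a new bitmap.
    static ImmutableRoaringBitmap map(byte[] data) {
        return new ImmutableRoaringBitmap(ByteBuffer.wrap(data));
    }

    public static void main(String[] args) throws IOException {
        RoaringBitmap original = RoaringBitmap.bitmapOf(1, 2, 3, 1000);
        byte[] data = toBytes(original);

        RoaringBitmap copy = fromBytes(data);
        ImmutableRoaringBitmap mapped = map(data);

        System.out.println("round trip equal: " + copy.equals(original));
        System.out.println("mapped contains 1000: " + mapped.contains(1000));
    }
}
```

Nothing in this sketch touches RoaringArray or any other internal class, so it should not be affected by changes such as the removal of Element.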
> Driver program has to register roaring bitmap classes used by spark with Kryo
> when number of partitions is greater than 2000
> ----------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-5949
> URL: https://issues.apache.org/jira/browse/SPARK-5949
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.2.0
> Reporter: Peter Torok
> Assignee: Imran Rashid
> Labels: kryo, partitioning, serialization
> Fix For: 1.4.0
>
>
> When more than 2000 partitions are used with Kryo, the following
> classes need to be registered by the driver program:
> - org.apache.spark.scheduler.HighlyCompressedMapStatus
> - org.roaringbitmap.RoaringBitmap
> - org.roaringbitmap.RoaringArray
> - org.roaringbitmap.ArrayContainer
> - org.roaringbitmap.RoaringArray$Element
> - org.roaringbitmap.RoaringArray$Element[]
> - short[]
> Our project doesn't have a dependency on RoaringBitmap, and
> HighlyCompressedMapStatus is intended for internal Spark usage. Spark should
> take care of this registration when Kryo is used.