[ https://issues.apache.org/jira/browse/KYLIN-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825684#comment-15825684 ]

Shaofeng SHI commented on KYLIN-2387:
-------------------------------------

With a build from latest master branch, I got the following error on CDH 5.8:
{code}
Exception in thread "main" java.lang.NoSuchMethodError: org.roaringbitmap.buffer.ImmutableRoaringBitmap.bitmapOf([I)Lorg/roaringbitmap/buffer/ImmutableRoaringBitmap;
        at org.apache.kylin.measure.bitmap.ImmutableBitmapCounter.<init>(ImmutableBitmapCounter.java:38)
        at org.apache.kylin.measure.bitmap.BitmapSerializer.<clinit>(BitmapSerializer.java:28)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:278)
{code}

It worked before, so I think the error is related to this change. I also 
searched the machine and found that CDH 5.8 still ships an old RoaringBitmap:

{code}
find / -name RoaringBitmap*.jar
/opt/cloudera/parcels/CDH-5.8.3-1.cdh5.8.3.p0.2/jars/RoaringBitmap-0.5.11.jar
/opt/cloudera/parcels/CDH-5.8.3-1.cdh5.8.3.p0.2/lib/oozie/oozie-sharelib-mr1/lib/spark/RoaringBitmap-0.5.11.jar
/opt/cloudera/parcels/CDH-5.8.3-1.cdh5.8.3.p0.2/lib/oozie/oozie-sharelib-yarn/lib/spark/RoaringBitmap-0.5.11.jar
{code}


[~dayue] can we use a method that also exists in the old version (0.5.11)? Thanks.
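
To confirm which RoaringBitmap jar actually wins on the runtime classpath, a small reflection probe can help. This is only a sketch: the class name ClasspathProbe and its helper methods are made up for illustration and are not part of Kylin. It uses only standard JDK reflection calls and has to be run with the same classpath as the failing job.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.security.CodeSource;

// Hypothetical helper, not part of Kylin or RoaringBitmap.
public class ClasspathProbe {

    /** Returns the jar or directory a class was loaded from, if the JVM exposes it. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : src.getLocation().toString();
    }

    /** Returns true if the named class is loadable and has a public static method with this signature. */
    public static boolean hasStaticMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class<?> cls = Class.forName(className);
            Method m = cls.getMethod(methodName, paramTypes);
            return Modifier.isStatic(m.getModifiers());
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.roaringbitmap.buffer.ImmutableRoaringBitmap";
        // bitmapOf(int...) is the method named in the NoSuchMethodError.
        System.out.println("bitmapOf(int...) present: "
                + hasStaticMethod(name, "bitmapOf", int[].class));
        try {
            System.out.println("loaded from: " + locationOf(Class.forName(name)));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

If the probe reports that the method is missing and the class comes from one of the CDH parcel jars, the old 0.5.11 jar is shadowing the version Kylin was built against.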

> A new BitmapCounter with better performance
> -------------------------------------------
>
>                 Key: KYLIN-2387
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2387
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Metadata, Query Engine, Storage - HBase
>    Affects Versions: v2.0.0
>            Reporter: Dayue Gao
>            Assignee: Dayue Gao
>
> We found that the old BitmapCounter does not perform well on very large 
> bitmaps. The inefficiency comes from:
> * Poor serialize implementation: instead of serializing the bitmap directly to 
> a ByteBuffer, it uses a ByteArrayOutputStream as temporary storage, which causes 
> superfluous memory allocations
> * Poor peekLength implementation: the whole bitmap is deserialized in order 
> to retrieve its serialized size
> * Extra deserialize cost: even if only cardinality info is needed to answer 
> a query, the whole bitmap is deserialized into a MutableRoaringBitmap
> A new BitmapCounter is designed to solve these problems:
> * It comes in two flavors, mutable and immutable, which are based on 
> MutableRoaringBitmap and ImmutableRoaringBitmap respectively
> * ImmutableBitmapCounter has a lower deserialize cost, as it just maps to a 
> copied buffer. So we always deserialize to ImmutableBitmapCounter first, 
> and convert it to MutableBitmapCounter only when necessary
> * peekLength is implemented using 
> ImmutableRoaringBitmap.serializedSizeInBytes, which is very fast since only 
> the header of the roaring format is examined
> * It serializes directly to a ByteBuffer; no intermediate buffer is 
> allocated
> * The wire format is the same as before 
> ([RoaringFormatSpec|https://github.com/RoaringBitmap/RoaringFormatSpec/]). 
> Therefore no cube rebuild is needed
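
The serialization and peekLength points in the description can be illustrated with a toy example. The sketch below does not use the Roaring format or the Kylin classes: it assumes a made-up length-prefixed layout (an int count followed by that many ints), and the class ToySerializer and its methods are hypothetical. It only shows why writing straight into a ByteBuffer avoids the intermediate copies, and how a length can be peeked from the header without decoding the payload.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Toy illustration only: the layout is [int count][count ints], not the Roaring format.
public class ToySerializer {

    /** Old style: write to a temporary stream, copy it to a byte[], then copy again into the buffer. */
    public static void serializeViaStream(int[] values, ByteBuffer out) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream dos = new DataOutputStream(bos);
            dos.writeInt(values.length);
            for (int v : values) {
                dos.writeInt(v);
            }
            dos.flush();
            out.put(bos.toByteArray()); // extra allocation plus extra copy
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** New style: write straight into the target buffer, with no intermediate storage. */
    public static void serializeDirect(int[] values, ByteBuffer out) {
        out.putInt(values.length);
        for (int v : values) {
            out.putInt(v);
        }
    }

    /** Reads only the header to compute the serialized size; the buffer position is left unchanged. */
    public static int peekLength(ByteBuffer in) {
        int count = in.getInt(in.position()); // absolute read, does not advance the position
        return Integer.BYTES + count * Integer.BYTES;
    }

    public static void main(String[] args) {
        int[] values = {1, 5, 42};
        ByteBuffer viaStream = ByteBuffer.allocate(64);
        ByteBuffer direct = ByteBuffer.allocate(64);
        serializeViaStream(values, viaStream);
        serializeDirect(values, direct);
        viaStream.flip();
        direct.flip();
        System.out.println("same bytes: " + viaStream.equals(direct));
        System.out.println("peeked length: " + peekLength(direct));
    }
}
```

Both paths produce identical bytes here because DataOutputStream and a freshly allocated ByteBuffer both write big-endian, which mirrors the point that the wire format stays the same while the allocation pattern changes.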



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
