Thanks, Daniel, for the recommendation.
Let's give it a try :)

Li Yang <[email protected]> wrote on Fri, Sep 18, 2015 at 3:28 PM:

> Many thanks Daniel for the information!
>
> We will definitely give RoaringBitmap
> <https://github.com/lemire/RoaringBitmap/> a try. If it boosts Spark and
> Druid.io, it can help Kylin's inverted index as well.  :-)
>
> On Thu, Sep 17, 2015 at 11:16 PM, Daniel Lemire <[email protected]> wrote:
>
> > Good day,
> >
> > I see that Kylin is using ConciseSet for bitmap indexing. In our work, we
> > found that Roaring bitmaps are often much better than ConciseSet (e.g.,
> > see the experimental section in http://arxiv.org/pdf/1402.6407.pdf ). The
> > compression is often better and the speed difference can be substantial.
> > This is even more so with version 0.5 of Roaring.
> >
> > We have a high quality Java implementation that is used by Apache Spark
> > and Druid.io. The Druid people found that switching to Roaring bitmaps
> > could improve real-world performance by 30% or more.
> >
> > https://github.com/lemire/RoaringBitmap/
> >
> > When desired, the library supports memory file mapping, so that
> > out-of-JVM heap memory is used instead. This can greatly reduce IO
> > issues. The library is available under the Apache license and is
> > patent-free.
> >
> > We have an extensive real-data benchmark framework which you can run for
> > yourself to compare Roaring with competitive alternatives such as
> > ConciseSet :
> >
> > https://github.com/lemire/RoaringBitmap/tree/master/jmh
> >
> > Running such a benchmark can be as simple as launching a script.
> >
> > What we did for Druid is to make the bitmap format "pluggable" so you can
> > switch from one format to the other using a configuration flag. This is
> > implemented through simple wrappers, e.g., see
> >
> > https://github.com/metamx/bytebuffer-collections/tree/master/src/main/java/com/metamx/collections/bitmap
> >
> > So it can be really easy to make it possible to switch the format while
> > preserving backward compatibility if needed...
> >
> > If you are interested in trying out Roaring, please let us know... We do
> > not think it is difficult work to integrate Roaring in Kylin (maybe a day
> > or so of programming) and it could potentially improve performance while
> > reducing memory storage.
> >
> > Note: Roaring bitmaps were also adopted in Apache Lucene, though they have
> > their own implementation, see
> > https://www.elastic.co/blog/frame-of-reference-and-roaring-bitmaps
> >
> > --
> >  Daniel Lemire for the Roaring team
> > https://github.com/lemire/
> >
> >
>
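
For readers following the thread, the "pluggable" bitmap format that Daniel describes for Druid could be sketched roughly as below. This is only an illustration under assumptions: the names PluggableBitmap, BitSetBitmap and BitmapFactory are hypothetical and are not taken from Kylin, Druid, or the RoaringBitmap library, and the configuration value "bitset" is made up. The only backend shown wraps java.util.BitSet so that the example runs without extra dependencies; a Roaring-backed or ConciseSet-backed wrapper would presumably implement the same interface.

```java
import java.util.BitSet;

/** Hypothetical minimal abstraction that the index code would program against. */
interface PluggableBitmap {
    void add(int value);
    boolean contains(int value);
    int cardinality();
    /** Returns a new bitmap holding the intersection; operands are left unchanged. */
    PluggableBitmap and(PluggableBitmap other);
}

/** Stand-in backend built on java.util.BitSet (an assumption, used only to stay dependency-free). */
final class BitSetBitmap implements PluggableBitmap {
    private final BitSet bits;

    BitSetBitmap() { this(new BitSet()); }
    private BitSetBitmap(BitSet bits) { this.bits = bits; }

    public void add(int value) { bits.set(value); }
    public boolean contains(int value) { return bits.get(value); }
    public int cardinality() { return bits.cardinality(); }

    public PluggableBitmap and(PluggableBitmap other) {
        if (!(other instanceof BitSetBitmap)) {
            // Mixing formats is out of scope for this sketch.
            throw new IllegalArgumentException("mixed bitmap formats are not supported here");
        }
        BitSet result = (BitSet) bits.clone();
        result.and(((BitSetBitmap) other).bits);
        return new BitSetBitmap(result);
    }
}

/** Hypothetical factory: the format name would come from a configuration flag. */
final class BitmapFactory {
    static PluggableBitmap create(String format) {
        switch (format) {
            case "bitset":
                return new BitSetBitmap();
            // A "roaring" or "concise" case would return the corresponding wrapper.
            default:
                throw new IllegalArgumentException("unknown bitmap format: " + format);
        }
    }
}

final class PluggableBitmapDemo {
    public static void main(String[] args) {
        PluggableBitmap a = BitmapFactory.create("bitset");
        PluggableBitmap b = BitmapFactory.create("bitset");
        for (int v : new int[] {1, 5, 9}) a.add(v);
        for (int v : new int[] {5, 9, 12}) b.add(v);
        System.out.println(a.and(b).cardinality()); // prints 2
    }
}
```

Because callers only see the interface, switching formats would come down to changing the string passed to the factory, which is the backward-compatibility property mentioned in the quoted message.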
