Patch merged into 1.x branch.
https://github.com/apache/incubator-kylin/commit/f00c838e6117933e725c3e69f0f30a908541b8a8

There shall be major refactoring around inverted-index and realtime OLAP in
Q4 or early 2016. Will embrace more from Roaring bitmaps.

Many thanks Daniel!


On Tue, Sep 29, 2015 at 11:45 PM, Luke Han <[email protected]> wrote:

> Hi Daniel,
>     The patch looks very great, will ask Yang to help review and merge if
> there's no issue.
>
>     Thank you very much for your contribution.
>
>
>
>
> Best Regards!
> ---------------------
>
> Luke Han
>
> On Mon, Sep 28, 2015 at 11:12 PM, Daniel Lemire <[email protected]> wrote:
>
> > Good day Luke,
> >
> > A patch has been added to the JIRA :
> >
> > https://issues.apache.org/jira/browse/KYLIN-1034
> >
> > I have also issued a PR on GitHub:
> >
> > https://github.com/apache/incubator-kylin/pull/12
> >
> > The patch is straight-forward, and simply replaces Concise by Roaring
> > throughout.
> >
> > The relevant unit tests appear to pass.
> >
> > Further review, testing and benchmarking is encouraged. The purpose of
> > this patch
> > is to get the process started.
> >
> > To keep things simple, I did not do *any* redesign. Still... here are my
> > thoughts...
> >
> > Design-wise : It does look to me like the bitmaps are serialized to
> > streams of bytes.
> > From there, *immutable* bitmaps are reloaded on demand, then possibly
> > copied and modified.
> > The Roaring library has a class ideally suited for this purpose, called
> > ImmutableRoaringBitmap...
> > From any ByteBuffer, you can map directly a bitmap :
> >
> >
> https://github.com/lemire/RoaringBitmap/blob/master/examples/ImmutableRoaringBitmapExample.java
> > Compared to deserializing a bitmap from a stream of bytes, this approach
> > avoids copying
> > and parsing the data: constructing an ImmutableRoaringBitmap is very fast
> > and uses very
> > little memory. Because they are formally immutable, you only need one
> > instance in your entire
> > application, irrespective of the number of cores. The data is accessed
> > only when the
> > ImmutableRoaringBitmap is actually queried, and what is accessed is the
> > original stream of
> > bytes (no unnecessary copy is made). So it uses less memory.
> >
> > Making us of ImmutableRoaringBitmap and mapped bitmaps in kylin would not
> > be difficult,
> > programming-wise, but this would make the patch more difficult to review.
> >
> > (I'll recopy some of my comments on JIRA.)
> >
> >
> > As usual, the copyright of this patch and be assigned to whoever...
> should
> > you choose
> > to use it. This patch or the Roaring library itself are *not* covered by
> > patents. And
> > so forth.
> >
> >
> >
> > On Sun, Sep 27, 2015 at 2:03 PM, Daniel Lemire <[email protected]> wrote:
> >
> >> Thanks for clarifying.
> >>
> >> Let me see what we can do on this front.
> >>
> >> On Sat, Sep 26, 2015 at 7:16 PM, Luke Han <[email protected]> wrote:
> >>
> >>> Thanks Daniel, I think that's most efficient way to have Roaring
> >>> work in existing code, patch is really be appreciated :)
> >>>
> >>> It's great discussion in KYLIN-1034.
> >>>
> >>> Thanks.
> >>>
> >>>
> >>> Best Regards!
> >>> ---------------------
> >>>
> >>> Luke Han
> >>>
> >>> On Sat, Sep 26, 2015 at 9:59 PM, Daniel Lemire <[email protected]>
> wrote:
> >>>
> >>>> Good day Luke,
> >>>>
> >>>> May I ask you a favor to bring it into our source code? We could work
> >>>>> together to make it work for
> >>>>> our current cases and then run some benchmark with real case.
> >>>>>
> >>>>
> >>>> We can rather easily substitute Roaring for Concise in the source
> code.
> >>>> Then submit a patch.
> >>>> Is that what would move this along most efficiently?
> >>>>
> >>>>
> >>>> Meanwhile, we have been flushing out some of the issues on JIRA :
> >>>>
> >>>> https://issues.apache.org/jira/browse/KYLIN-1034
> >>>>
> >>>> Some of these issues (e.g., memory-file mapping) might be of general
> >>>> interest.
> >>>>
> >>>> - Daniel
> >>>>
> >>>
> >>>
> >>
> >
>

Reply via email to