I will be happy to collaborate. There are other people from the Roaring
team who are available
to help too.

On Tue, Oct 6, 2015 at 7:23 AM, Li Yang <[email protected]> wrote:

> Patch merged into 1.x branch.
> https://github.com/apache/incubator-kylin/commit/f00c838e6117933e725c3e69f0f30a908541b8a8
>
> There shall be major refactoring around inverted-index and realtime OLAP
> in Q4 or early 2016. Will embrace more from Roaring bitmaps.
>
> Many thanks Daniel!
>
>
> On Tue, Sep 29, 2015 at 11:45 PM, Luke Han <[email protected]> wrote:
>
>> Hi Daniel,
>>     The patch looks very great, will ask Yang to help review and merge if
>> there's no issue.
>>
>>     Thank you very much for your contribution.
>>
>>
>>
>>
>> Best Regards!
>> ---------------------
>>
>> Luke Han
>>
>> On Mon, Sep 28, 2015 at 11:12 PM, Daniel Lemire <[email protected]> wrote:
>>
>> > Good day Luke,
>> >
>> > A patch has been added to the JIRA :
>> >
>> > https://issues.apache.org/jira/browse/KYLIN-1034
>> >
>> > I have also issued a PR on GitHub:
>> >
>> > https://github.com/apache/incubator-kylin/pull/12
>> >
>> > The patch is straight-forward, and simply replaces Concise by Roaring
>> > throughout.
>> >
>> > The relevant unit tests appear to pass.
>> >
>> > Further review, testing and benchmarking is encouraged. The purpose of
>> > this patch
>> > is to get the process started.
>> >
>> > To keep things simple, I did not do *any* redesign. Still... here are my
>> > thoughts...
>> >
>> > Design-wise : It does look to me like the bitmaps are serialized to
>> > streams of bytes.
>> > From there, *immutable* bitmaps are reloaded on demand, then possibly
>> > copied and modified.
>> > The Roaring library has a class ideally suited for this purpose, called
>> > ImmutableRoaringBitmap...
>> > From any ByteBuffer, you can map directly a bitmap :
>> >
>> >
>> https://github.com/lemire/RoaringBitmap/blob/master/examples/ImmutableRoaringBitmapExample.java
>> > Compared to deserializing a bitmap from a stream of bytes, this approach
>> > avoids copying
>> > and parsing the data: constructing an ImmutableRoaringBitmap is very
>> fast
>> > and uses very
>> > little memory. Because they are formally immutable, you only need one
>> > instance in your entire
>> > application, irrespective of the number of cores. The data is accessed
>> > only when the
>> > ImmutableRoaringBitmap is actually queried, and what is accessed is the
>> > original stream of
>> > bytes (no unnecessary copy is made). So it uses less memory.
>> >
>> > Making us of ImmutableRoaringBitmap and mapped bitmaps in kylin would
>> not
>> > be difficult,
>> > programming-wise, but this would make the patch more difficult to
>> review.
>> >
>> > (I'll recopy some of my comments on JIRA.)
>> >
>> >
>> > As usual, the copyright of this patch and be assigned to whoever...
>> should
>> > you choose
>> > to use it. This patch or the Roaring library itself are *not* covered by
>> > patents. And
>> > so forth.
>> >
>> >
>> >
>> > On Sun, Sep 27, 2015 at 2:03 PM, Daniel Lemire <[email protected]>
>> wrote:
>> >
>> >> Thanks for clarifying.
>> >>
>> >> Let me see what we can do on this front.
>> >>
>> >> On Sat, Sep 26, 2015 at 7:16 PM, Luke Han <[email protected]> wrote:
>> >>
>> >>> Thanks Daniel, I think that's most efficient way to have Roaring
>> >>> work in existing code, patch is really be appreciated :)
>> >>>
>> >>> It's great discussion in KYLIN-1034.
>> >>>
>> >>> Thanks.
>> >>>
>> >>>
>> >>> Best Regards!
>> >>> ---------------------
>> >>>
>> >>> Luke Han
>> >>>
>> >>> On Sat, Sep 26, 2015 at 9:59 PM, Daniel Lemire <[email protected]>
>> wrote:
>> >>>
>> >>>> Good day Luke,
>> >>>>
>> >>>> May I ask you a favor to bring it into our source code? We could work
>> >>>>> together to make it work for
>> >>>>> our current cases and then run some benchmark with real case.
>> >>>>>
>> >>>>
>> >>>> We can rather easily substitute Roaring for Concise in the source
>> code.
>> >>>> Then submit a patch.
>> >>>> Is that what would move this along most efficiently?
>> >>>>
>> >>>>
>> >>>> Meanwhile, we have been flushing out some of the issues on JIRA :
>> >>>>
>> >>>> https://issues.apache.org/jira/browse/KYLIN-1034
>> >>>>
>> >>>> Some of these issues (e.g., memory-file mapping) might be of general
>> >>>> interest.
>> >>>>
>> >>>> - Daniel
>> >>>>
>> >>>
>> >>>
>> >>
>> >
>>
>
>

Reply via email to