GitHub issue for tracking: https://github.com/KylinOLAP/Kylin/issues/263

2014-12-25 13:13 GMT+08:00 Luke Han <[email protected]>:

> Cool, that's what we need to enhance Kylin's storage and build process.
>
> Will create Issues/JIRA to track this.
>
> Thank you very much.
>
> Luke
>
>
> 2014-12-24 14:14 GMT+08:00 Li Yang <[email protected]>:
>
>> This is a very interesting idea. Actually, many less general solutions
>> (from talking to various people we met) took exactly this approach.
>>
>> This feature will benefit users who have their Hadoop cluster hosted in
>> a cloud service. Fewer cuboids mean fewer CPU cycles, and that's less to pay.
>>
>> Yang
>>
>> On Wed, Dec 24, 2014 at 1:47 PM, hongbin ma <[email protected]> wrote:
>>
>> > Logically, a cube contains cuboids representing all combinations of
>> > dimensions. A naive cube building strategy that materializes all
>> > cuboids will quickly run into the curse of dimensionality. Currently
>> > Kylin uses a strategy called "aggregation groups" to reduce the number
>> > of cuboids that need to be materialized.
>> >
>> > However, if the query pattern is simple and fixed, the "aggregation
>> > group" strategy is still not efficient enough. For example, suppose
>> > there are five dimensions, namely A, B, C, D and E. The data modeler is
>> > sure that only the combinations (A,B,C), (D,E) and (A,E) will be
>> > queried, so he'll use the aggregation group tool to optimize his cube
>> > definition. However, whatever aggregation groups he chooses, lots of
>> > useless combinations will still be materialized.
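
A rough sketch of the arithmetic in this example, in Python. The names
are made up for illustration and none of this is Kylin code. It only
counts how many non-empty dimension combinations a fully materialized
five-dimension cube has, compared with the three that are queried:

```python
from itertools import combinations

dimensions = ["A", "B", "C", "D", "E"]


def all_cuboids(dims):
    """Return every non-empty combination of dimensions (the full cube)."""
    return [
        frozenset(combo)
        for size in range(1, len(dims) + 1)
        for combo in combinations(dims, size)
    ]


# The only combinations the data modeler expects to be queried.
queried = {frozenset("ABC"), frozenset("DE"), frozenset("AE")}

full = all_cuboids(dimensions)
unused = [cuboid for cuboid in full if cuboid not in queried]

print(len(full), "cuboids in the full cube")
print(len(queried), "cuboids actually queried")
print(len(unused), "cuboids never queried directly")
```

Some of the unqueried cuboids may still be needed as intermediate build
steps, so the last number is an upper bound on the waste, not an exact
figure.
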
>> >
>> > With a new strategy called "cuboid whitelist", data modelers can guide
>> > Kylin to materialize only the cuboids they are interested in. Given the
>> > whitelist, Kylin will materialize the minimal set of cuboids needed to
>> > cover each cuboid in the whitelist. To support this, the following
>> > functionality should be added:
>> >
>> > 1. Front-end/UI for specifying whitelist members and persisting them to
>> > the cube description.
>> > 2. Enhanced job engine scheduler that calculates a minimal spanning
>> > build tree based on the whitelist.
>> > 3. (OPTIONAL) Enhanced job engine that supports a dynamic whitelist and
>> > triggers new builds for newly added whitelist members.
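
One possible reading of item 2, sketched in Python. This is a simple
greedy heuristic under stated assumptions, not the proposed scheduler
and not guaranteed to be minimal: it assumes the base cuboid (all
dimensions) is always available, and that a cuboid can be aggregated
from any already-built cuboid containing all of its dimensions. The
function name and the use of dimension count as a cost proxy are
hypothetical.

```python
def build_tree(dimensions, whitelist):
    """Map each whitelisted cuboid to the cuboid it would be built from.

    Assumption: the parent is the smallest whitelisted strict superset,
    falling back to the base cuboid when no such superset exists.
    """
    base = frozenset(dimensions)
    targets = {frozenset(cuboid) for cuboid in whitelist} - {base}
    parents = {}
    for cuboid in targets:
        # Whitelisted cuboids that contain every dimension of this one.
        candidates = [other for other in targets if cuboid < other]
        # Fewer dimensions is used here as a stand-in for fewer rows to scan.
        parents[cuboid] = min(
            candidates, key=lambda c: (len(c), sorted(c)), default=base
        )
    return parents


def label(cuboid):
    """Render a cuboid as a readable string such as 'ABC'."""
    return "".join(sorted(cuboid))


# Whitelist from the example in the mail: no member contains another,
# so every cuboid is built straight from the base cuboid.
tree = build_tree("ABCDE", ["ABC", "DE", "AE"])

# Hypothetical extra member (A), added only to show parent reuse.
tree_with_a = build_tree("ABCDE", ["ABC", "DE", "AE", "A"])

for child, parent in sorted(tree_with_a.items(), key=lambda kv: label(kv[0])):
    print(label(child), "is built from", label(parent))
```

A real scheduler would probably weigh estimated row counts rather than
dimension counts, and might add non-whitelisted intermediate cuboids
when that lowers the total build cost.
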
>> >
>> >
>> >
>> > Hongbin Ma
>> >
>>
>
>
