GitHub issue for tracking: https://github.com/KylinOLAP/Kylin/issues/263
2014-12-25 13:13 GMT+08:00 Luke Han <[email protected]>:

> Cool, that's what we need to enhance Kylin's storage and build process.
>
> Will create Issues/JIRA to track this.
>
> Thank you very much.
>
> Luke
>
>
> 2014-12-24 14:14 GMT+08:00 Li Yang <[email protected]>:
>
>> This is a very interesting idea. Actually, many less general solutions
>> (from talks with various people we met) took exactly this approach.
>>
>> This feature will benefit users who have their Hadoop cluster hosted in
>> a cloud service. Fewer cuboids mean fewer CPU cycles, and that's less
>> to pay.
>>
>> Yang
>>
>> On Wed, Dec 24, 2014 at 1:47 PM, hongbin ma <[email protected]> wrote:
>>
>> > Logically, a cube contains cuboids representing all combinations of
>> > dimensions. Clearly, a naive cube building strategy that materializes
>> > all cuboids will easily run into curse-of-dimensionality problems.
>> > Currently Kylin leverages a strategy called "aggregation groups" to
>> > reduce the number of cuboids that need to be materialized.
>> >
>> > However, if the query pattern is simple and fixed, the "aggregation
>> > group" strategy is still not efficient enough. For example, suppose
>> > there are five dimensions, namely A, B, C, D and E. The data modeler
>> > is sure that only the combinations (A,B,C), (D,E) and (A,E) will be
>> > queried, so he’ll use the aggregation group tool to optimize his cube
>> > definition. However, whatever aggregation group he chooses, lots of
>> > useless combinations would be materialized.
>> >
>> > With a new strategy called "cuboid whitelist", data modelers can guide
>> > Kylin to materialize only the cuboids they are interested in. Depending
>> > on the whitelist, Kylin will materialize the minimal set of cuboids
>> > needed to cover each cuboid in the whitelist. To support this, the
>> > following functionalities should be added:
>> >
>> > 1. Front-end/UI for specifying whitelist members and persisting them
>> > to the cube description.
>> > 2. Enhanced job engine scheduler that will calculate a minimal spanning
>> > build tree based on the whitelist.
>> > 3. (OPTIONAL) Enhanced job engine to support a dynamic whitelist and
>> > trigger new builds for newly added whitelist members.
>> >
>> > Hongbin Ma
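
To make the whitelist proposal above concrete, here is a minimal sketch in Python. It is not Kylin code: the names all_cuboids and build_tree are hypothetical, and it assumes that the base cuboid (all dimensions) is always built and that every whitelisted cuboid is aggregated from its smallest already-selected superset. Dimension count is used as a rough stand-in for build cost, so this is a greedy heuristic rather than a true minimal spanning build tree.

```python
from itertools import combinations

# Dimensions from the example in the thread.
DIMENSIONS = ("A", "B", "C", "D", "E")


def all_cuboids(dims):
    """Every combination of dimensions, including the empty one."""
    return [
        frozenset(combo)
        for size in range(len(dims) + 1)
        for combo in combinations(dims, size)
    ]


def build_tree(dims, whitelist):
    """Map each cuboid to build onto the parent it is aggregated from.

    Assumption (not from Kylin): the base cuboid is always built, and a
    whitelisted cuboid is derived from the smallest selected strict superset.
    """
    base = frozenset(dims)
    targets = {frozenset(member) for member in whitelist} | {base}
    parents = {base: None}  # the base cuboid is built from the source data
    # Visit larger cuboids first so that possible parents are already placed.
    for cuboid in sorted(targets - {base}, key=lambda c: (-len(c), sorted(c))):
        candidates = [p for p in parents if cuboid < p]  # strict supersets
        parents[cuboid] = min(candidates, key=lambda p: (len(p), sorted(p)))
    return parents


if __name__ == "__main__":
    whitelist = [("A", "B", "C"), ("D", "E"), ("A", "E")]
    tree = build_tree(DIMENSIONS, whitelist)
    print("cuboids in the full cube:", len(all_cuboids(DIMENSIONS)))
    print("cuboids built with the whitelist:", len(tree))
    for cuboid, parent in tree.items():
        source = "source data" if parent is None else "".join(sorted(parent))
        print("".join(sorted(cuboid)), "<-", source)
```

With the three combinations from the example, none of the whitelisted cuboids contains another, so each one is aggregated directly from the base cuboid. A real scheduler would more likely weigh parents by estimated row counts instead of dimension counts.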
