Just as database schema design is critical to database performance, metadata design is critical to cube build and query performance.
Kylin offers many optimization options in its metadata: partial cubes, dictionary encoding, hierarchy dimensions, derived dimensions, and so on. You should first check the cardinality of every dimension in Kylin, then design the metadata based on the dimension characteristics and the query pattern.

Thanks,
Jiang Xu

------------------ Original Message ------------------
From: Luke Han <[email protected]>
Date: 2015-03-02 19:58
To: [email protected]
Subject: Re: cube building VS cognos

There's one magic but very important concept called "partial cube", as Ted also mentioned. If you build a full cube, it will calculate all possible combinations of dimensions, which certainly shows up as you add more dimensions, especially when a column has high cardinality.

Kylin already supports partial cubes: please go to "Advanced Settings" in the cube designer and tune the "Aggregation Group". And for a derived dimension, you only need to put one column into the aggregation group, which greatly reduces the final cube size. With the other optimization rules applied as well, the final cube size can be well controlled. For example, we have many production cases with more than 10B rows and 10+ dimensions; most cube sizes are around several hundred GB, about 20~50% of the source Hive table size (we also have some cubes with a larger expansion rate serving extreme cases).

Also, Kylin's main focus is accelerating query performance with pre-calculated results (the cube); size is just one dimension of the comparison. Could you also please run some queries on both Cognos and Kylin? We would like to see the query performance results.

Thank you very much.
Luke

Best Regards!
---------------------
Luke Han

2015-03-02 13:47 GMT+08:00 Ted Dunning <[email protected]>:

> This sounds like Cognos is actually just building a few, possibly just
> one, of the more detailed cubes, expecting that queries will roll these
> up to get the effect of many of the less detailed cubes.
> That is, it may not be building all of the cubes requested.
>
> Kylin, on the other hand, seems to be building all requested cubes with
> no optimization imposed.
>
> Note that these are my impressions based on seeing how new open-source
> software often behaves relative to older, more established alternatives,
> not anything based on concrete information. I look forward to being
> contradicted by facts.
>
> On Mon, Mar 2, 2015 at 3:59 AM, <[email protected]> wrote:
>
> > Hi,
> >
> > I've done some tests comparing Kylin with Cognos, which we use for OLAP.
> > Below is one case:
> >
> > Fact table size: 500M
> > Dimensions: 19 (16 derived, 3 normal)
> > Measures: 56
> >
> > Cognos builds this cube in about one hour, producing a 1.2 GB cube file.
> > In Kylin, the cube size is already up to 64 GB with only 10 dimensions
> > and 20 measures. If I understand correctly, the cube size will at least
> > double for each additional dimension. For this test case, the cube size
> > is estimated in TB, which greatly exceeds our expectations.
> >
> > Is there any test report we can refer to that compares Kylin with
> > traditional OLAP tools like Cognos? If so, please let me know; it would
> > be very helpful.
> >
> > Looking forward to your reply.
> >
> > Thanks
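To make the sizing discussion in this thread concrete, here is a back-of-the-envelope sketch. This is a simplified model, not Kylin's actual cuboid planner (real aggregation groups also support mandatory, hierarchy, and joint dimensions, which prune further, and row counts are capped by source cardinality); the group sizes 7/6/6 below are purely illustrative. It shows why a full cube explodes combinatorially and why the "size doubles per added dimension" extrapolation lands in TB territory:

```python
# Back-of-the-envelope model of cuboid counts and cube size growth.
# NOTE: simplified illustration only, not Kylin's actual planner;
# mandatory/hierarchy/joint dimensions prune the count further.

def full_cube_cuboids(n_dims: int) -> int:
    """A full cube materializes every subset of dimensions: 2^n cuboids."""
    return 2 ** n_dims

def partial_cube_cuboids(group_sizes: list[int]) -> int:
    """Rough upper bound when dimensions are split into independent
    aggregation groups: only combinations within a group are built,
    so the total is a sum of 2^(group size) instead of 2^(all dims)."""
    return sum(2 ** g for g in group_sizes)

def naive_size_gb(observed_gb: float, extra_dims: int) -> float:
    """The 'size at least doubles per added dimension' extrapolation
    from the thread, applied to an observed partial build."""
    return observed_gb * 2 ** extra_dims

# 19 dimensions as one full cube vs three aggregation groups of 7, 6, 6:
print(full_cube_cuboids(19))             # 524288 cuboids
print(partial_cube_cuboids([7, 6, 6]))   # 256 cuboids

# 64 GB observed at 10 dimensions, naively extrapolated to all 19:
print(naive_size_gb(64.0, 9))            # 32768.0 GB, i.e. ~32 TB
```

This is why tuning the Aggregation Group matters: grouping dimensions that are queried together collapses the cuboid count from exponential in the total dimension count to exponential only in the largest group.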
