Let me post the link again. :-) http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin
But anyway, one lesson learned here is that Kylin could provide a more partial default for new users, or do a size estimate and warn before a user builds a huge cube.

Cheers
Yang

On Tue, Mar 3, 2015 at 8:24 AM, 蒋旭 <[email protected]> wrote:

> Just as database schema design is critical to database performance,
> metadata design is critical to cube building and query performance.
>
> Kylin has lots of optimization options in metadata: partial cube,
> dictionary encoding, hierarchy dimension, derived dimension, etc.
>
> You'd better get the cardinality of all dimensions in Kylin. Then design
> the metadata based on dimension characteristics and query patterns.
>
> Thanks
> Jiang Xu
>
> ------------------ Original Message ------------------
> From: Luke Han <[email protected]>
> Sent: 2015-03-02 19:58
> To: [email protected] <[email protected]>
> Subject: Re: cube building VS cognos
>
> There's one magic but very important concept called "partial cube", as Ted
> also mentioned.
> If you build a full cube, it will calculate all possible combinations of
> dimensions, which will certainly explode when you have more dimensions, even
> when just one column has high cardinality.
>
> Kylin supports partial cube already; please update "Advance settings" in the
> cube designer to tune "Aggregation Group".
>
> And, for derived dimensions, you actually just need to put one column into
> the aggregation group, which will greatly reduce the final cube size. With
> more optimization rules applied, the final cube size will be well
> controlled.
> For example, we have many production cases with 10+B rows and
> 10+ dimensions; most cube sizes are around several hundred GB, 20~50% of
> the source Hive table size. (We also have some cubes with a bigger expansion
> rate to serve extreme cases.)
>
> Also, Kylin's main focus is to accelerate query performance with
> pre-calculated results (the cube); size is just one dimension. Could you also
> please run some queries on both Cognos and Kylin? We would like to know the
> query performance results.
>
> Thank you very much.
> Luke
>
> Best Regards!
> ---------------------
>
> Luke Han
>
> 2015-03-02 13:47 GMT+08:00 Ted Dunning <[email protected]>:
>
> > This sounds like Cognos is actually just building a few, possibly just one,
> > of the more detailed cubes, expecting that queries will roll these up to get
> > the effect of many of the less detailed cubes. That is, it may not be building
> > all of the cubes requested.
> >
> > Kylin, on the other hand, seems to be building up all requested cubes with
> > no optimization being imposed.
> >
> > Note that these are my impressions based on seeing how new open source
> > software often behaves relative to older, more established alternatives,
> > not anything based on concrete information. I look forward to being
> > contradicted by facts.
> >
> > On Mon, Mar 2, 2015 at 3:59 AM, 王西斌 <[email protected]> wrote:
> >
> > > Hi
> > >
> > > I've done some tests for comparison with Cognos, which we use for OLAP.
> > > Below is one case:
> > >
> > > Fact table size: 500M
> > >
> > > Dimension: 19 (16 derived, 3 normal)
> > >
> > > Measure: 56
> > >
> > > Cognos builds this cube in about one hour with a 1.2G cube file.
> > > In Kylin, the cube size is already up to 64G with 10 dimensions and 20
> > > measures. If I understand right, the cube size will at least double for
> > > each additional dimension. For this test case, the cube size is estimated in
> > > TB, which is far beyond our expectations.
> > > So, is there any test report we can refer to for comparison with
> > > traditional OLAP tools like Cognos? If so, please let me know; it will be
> > > very helpful.
> > >
> > > Looking forward to your reply.
> > >
> > > Thanks
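
A back-of-the-envelope check of the numbers in this thread: the observation that cube size at least doubles with each added dimension matches a full cube over N dimensions having 2^N dimension combinations (cuboids). The Python sketch below counts non-empty combinations for a full cube and for dimensions split into independent groups. The function names, the grouping model, and the 7/6/6 split are illustrative assumptions, not Kylin's actual cuboid scheduling; the real "Aggregation Group" rules may produce different counts.

```python
from typing import Sequence


def full_cube_combinations(n_dims: int) -> int:
    """Non-empty subsets of n_dims dimensions: 2^n - 1."""
    if n_dims < 0:
        raise ValueError("n_dims must be non-negative")
    return 2 ** n_dims - 1


def grouped_combinations(group_sizes: Sequence[int]) -> int:
    """Simplified model (an assumption, not Kylin's algorithm): each group
    is cubed on its own, and combinations that mix dimensions from
    different groups are skipped."""
    return sum(full_cube_combinations(k) for k in group_sizes)


if __name__ == "__main__":
    # 10 dimensions, as in the 64G test build: 1023 combinations
    print(full_cube_combinations(10))
    # All 19 dimensions as a full cube: 524287 combinations
    print(full_cube_combinations(19))
    # Hypothetical split of the 19 dimensions into groups of 7, 6 and 6:
    # 127 + 63 + 63 = 253 combinations
    print(grouped_combinations([7, 6, 6]))
```

Under this simplified model, going from 10 to 19 dimensions as a full cube multiplies the number of combinations by roughly 512, which is consistent with the TB-scale estimate above, while grouping keeps the count in the hundreds.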
