Don't be scared by the number of rows in HBase. The rows are highly compressed in storage. You can check the actual size of the HBase tables from the HBase console to get an accurate sense of the storage used. The Kylin GUI also shows the original table size, the cube size, and the inflation rate.
Small/Medium/Large is a means of controlling HBase region size; it is not related to how many cuboids are aggregated. What matters is the number of dimensions and their types. "Derived" dimensions are not stored in the cube (they are looked up at query time), so they are the most lightweight. "Hierarchy" dimensions generate fewer cuboids than normal dimensions.

Cheers
Yang

On Wed, May 27, 2015 at 5:53 PM, Puneet Gupta <[email protected]> wrote:

> Thanks for the reply, Bin Mahone.
>
> I had seen the ppt before and, based on it, I tried creating my cube. But
> there is no way mentioned to know how many aggregates actually get created
> for a small-size cube. I am assuming that, for the same aggregation group,
> a different number of aggregates/cuboids gets created based on cube size
> (small/medium/large).
>
> Also, how can one have even more fine-grained control over the cuboids
> that are created?
>
> _______
> sent from my phone
> On May 27, 2015 2:00 PM, "hongbin ma" <[email protected]> wrote:
>
> > This link might be helpful:
> >
> > http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin?qid=145080e7-4abe-42c9-8048-f29ffec8a66c&v=default&b=&from_search=10
> >
> > On Wed, May 27, 2015 at 4:17 PM, Luke Han <[email protected]> wrote:
> >
> >> Forwarding to the mailing list for further support.
> >>
> >> Thanks.
> >>
> >> On Wednesday, May 27, 2015 at 4:11:09 PM UTC+8, Puneet Gupta wrote:
> >>>
> >>> Hi,
> >>>
> >>> Is there any log message that I can look for to determine the exact
> >>> number of cuboids (dimension combinations) that will be generated?
> >>>
> >>> I have a small fact table with 10,000 rows. The number of dimensions
> >>> is 11.
> >>> When I arrange the dimensions such that 2 are of type "column" and 9
> >>> are of type "derived", and I choose cube size = "small" in the GUI, I
> >>> get close to 830,000 rows in the HBase aggregate/cuboids table.
> >>>
> >>> When I arrange the dimensions such that 2 are of type "Hierarchy"
> >>> (Hierarchy1 has 4 levels: Year, Month, Day, Hour; Hierarchy2 has 2
> >>> levels: Protocol Category and Protocol), 2 are of type "column", and
> >>> the rest are of type "derived", and I choose cube size = "small" in
> >>> the GUI, I get close to 2,000,000 rows in the HBase aggregate/cuboids
> >>> table.
> >>>
> >>> In both cases I selected dictionary compression for row keys.
> >>>
> >>> I wanted to control how many aggregates get generated. I feel the size
> >>> of the aggregate table is too high.
> >>>
> >>> Any suggestions?
> >>
> >>
> >
> >
> > --
> > Regards,
> >
> > *Bin Mahone | 马洪宾*
> > Apache Kylin: http://kylin.io
> > Github: https://github.com/binmahone
> >
>
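
As a rough illustration of the point in the reply that dimension types, not the Small/Medium/Large setting, drive the cuboid count, here is a minimal Python sketch. It is a back-of-the-envelope model, not Kylin's actual planner: the function name `estimate_cuboids` and its parameters are hypothetical, it assumes a single aggregation group, and it assumes a hierarchy only allows prefix combinations (for example Year, Year+Month, and so on). Derived dimensions are left out because, per the reply, they are not stored in the cube; any key column they are looked up through would have to be counted as a normal dimension, which is an assumption about the model rather than something this thread states.

```python
from math import prod
from typing import Sequence


def estimate_cuboids(normal_dims: int, hierarchy_levels: Sequence[int] = ()) -> int:
    """Rough cuboid-count estimate for one aggregation group (hypothetical helper).

    normal_dims: number of plain ("column") dimensions. Each one is either
        present in a cuboid or absent, so it contributes a factor of 2.
    hierarchy_levels: number of levels in each hierarchy. Assuming only
        prefixes of a hierarchy are valid, a hierarchy with k levels
        contributes k + 1 choices (none, level 1, levels 1-2, ...).
    """
    return 2 ** normal_dims * prod(levels + 1 for levels in hierarchy_levels)


# Two plain dimensions only.
print(estimate_cuboids(2))            # 4

# Two plain dimensions plus a 4-level and a 2-level hierarchy.
print(estimate_cuboids(2, [4, 2]))    # 60

# The same eight columns if all were treated as plain dimensions.
print(estimate_cuboids(8))            # 256
```

Under these assumptions, modelling the six time and protocol columns as two hierarchies gives 60 combinations instead of 256. The estimate counts dimension combinations, not HBase rows: the row count also depends on how many distinct value combinations the data contains in each cuboid.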
