The region cut is based on an estimated size. Because of HBase compression and encoding, the estimate may not be very accurate. One recent ticket on this is https://issues.apache.org/jira/browse/KYLIN-1237
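
To put rough numbers on the gap, here is a small sketch using only the figures from the quoted message (979 regions, 5.75 TB, 10 GB cut size). It assumes 1 TB = 1024 GB, which is my assumption about how the size is reported, and it says nothing about how Kylin computes its estimate internally.

```python
# Figures quoted from the HTable summary in the original message.
region_count = 979
cut_size_gb = 10          # value of kylin.hbase.region.cut.small
actual_size_tb = 5.75

GB_PER_TB = 1024          # assumption: binary units in the reported size

# Size the table would have if every region held exactly the cut size.
nominal_size_tb = region_count * cut_size_gb / GB_PER_TB

# Average on-disk size per region implied by the reported total.
avg_region_gb = actual_size_tb * GB_PER_TB / region_count

print(f"nominal size at 10 GB per region: {nominal_size_tb:.2f} TB")
print(f"implied average region size:      {avg_region_gb:.2f} GB")
```

Under that assumption the regions average about 6 GB each instead of 10 GB, so the mismatch you see is consistent with the estimate overshooting the size actually stored in HBase.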

On Tue, Jan 5, 2016 at 12:30 AM, Zhang, Zhong <[email protected]> wrote:

> Hi All,
>
> Happy new year!
>
> Kylin provides three options for cut size. Please see the following:
>
> # The cut size for hbase region, in GB.
> # E.g, for cube whose capacity be marked as "SMALL", split region per 10GB by default
> kylin.hbase.region.cut.small=10
> kylin.hbase.region.cut.medium=20
> kylin.hbase.region.cut.large=100
>
> I choose cube size as small to build the cube and the following is one of
> the HTable I got.
> HTable: KYLIN_O03ZWB4DK9
>
> * Region Count: 979
> * Size: 5.75 TB
> * Start Time: 2011-12-31 00:00:00
> * End Time: 2014-05-01 01:00:00
>
> So the size of the HTable is 5.75TB and there are 979 regions in total?
>
> Let's do a little bit math. 979*10GB (since split region per 10GB when
> cube size is marked as small) definitely does not equal 5.75TB. Do I
> understand correctly?
>
> Best regards
> Zhong

--
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
