> Some of the new regions are just a few KBytes! Why are they so small? When does HBase decide to split? It started splitting two hours after the table was created.

When HBase does a split, it doesn't actually split at the disk/file level. It's just a metadata operation that creates new regions containing reference files that still point to the old HFiles. That is why you find KB-sized regions.

> I thought major compaction just happens once a day and compacts many files per region. The data is always the same here; I don't inject new data.

IIRC sometimes minor compactions get promoted to major compactions based on some criteria, but I'll leave it for others to answer!

On Tue, Apr 15, 2014 at 3:15 PM, Guillermo Ortiz <[email protected]> wrote:

> I have a table in HBase that is around 96 GB.
>
> I generated 4 regions of 30 GB. After some time, the table started to split because the
> max size for a region is 1 GB (I just realized that; I'm going to change it
> or create more pre-splits).
>
> There are two things that I don't understand. How is it creating the splits?
> Right now I have 130 regions and growing. The problem is the size of the
> new regions:
>
> 1.7 M   /hbase/filters/4ddbc34a2242e44c03121ae4608788a2
> 1.6 G   /hbase/filters/548bdcec79cfe9a99fa57cb18f801be2
> 3.1 G   /hbase/filters/58b50df089bd9d4d1f079f53238e060d
> 2.5 M   /hbase/filters/5a0d6d5b3b8faf67889ac5f5c2947c4f
> 1.9 G   /hbase/filters/5b0a35b5735a473b7e804c4b045ce374
> 883.4 M /hbase/filters/5b49c68e305b90d87b3c64a0eee60b8c
> 1.7 M   /hbase/filters/5d43fd7ea9808ab7d2f2134e80fbfae7
> 632.4 M /hbase/filters/5f04c7cd450d144f88fb4c7cff0796a2
>
> Some of the new regions are just a few KBytes! Why are they so
> small? When does HBase decide to split? It started splitting two
> hours after the table was created.
>
> Once I create the table and insert the data, I don't insert new data or modify
> it.
>
> Another interesting point is why there are major compactions:
>
> 2014-04-15 11:33:47,400 INFO org.apache.hadoop.hbase.regionserver.Store:
> Renaming compacted file at
> hdfs://m01.cluster:8020/hbase/filters/ef994715505054299ede8c48c600cea4/.tmp/df90c260cb4e4256a153dd178244f04c
> to
> hdfs://m01.cluster:8020/hbase/filters/ef994715505054299ede8c48c600cea4/d/df90c260cb4e4256a153dd178244f04c
> 2014-04-15 11:33:47,407 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded ROWCOL
> (CompoundBloomFilter) metadata for df90c260cb4e4256a153dd178244f04c
> 2014-04-15 11:33:47,416 INFO org.apache.hadoop.hbase.regionserver.Store:*
> Completed major compaction of 1 file*(s) in d of
> filters,51,1397554175140.ef994715505054299ede8c48c600cea4. into
> df90c260cb4e4256a153dd178244f04c, size=789.1 M; total size for store is
> 789.1 M
> 2014-04-15 11:33:47,416 INFO
> org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest:
> completed compaction:
> regionName=filters,51,1397554175140.ef994715505054299ede8c48c600cea4.,
> storeName=d, fileCount=1, fileSize=1.5 G, priority=6, time=414761474510060;
> duration=7sec
>
> I thought major compaction just happens once a day and compacts many files
> per region. The data is always the same here; I don't inject new data.
>
> I'm working with 0.94.6 CDH44. I'm going to change the size of the regions,
> but I would like to understand why things happen.
>
> Thank you.
>

--
Bharath Vissapragada
<http://www.cloudera.com>
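
To make the reference-file point in the reply above concrete, here is a toy Python model. It is not HBase code: the class names, the 64-byte reference size, and the assumption that a split divides each file exactly in half are all made up for illustration. It only sketches why a freshly split daughter region can look tiny on disk until a compaction rewrites its data.

```python
from dataclasses import dataclass
from typing import List, Union

REFERENCE_SIZE = 64  # bytes; hypothetical size of a small reference file


@dataclass
class HFile:
    size: int  # bytes of real data on disk


@dataclass
class Reference:
    parent_file: HFile  # the parent's HFile this reference points to
    half: str           # "bottom" or "top" half of the parent's key range

    @property
    def size(self) -> int:
        # On disk, a reference is only a small pointer, not the data itself.
        return REFERENCE_SIZE

    def referenced_bytes(self) -> int:
        # Assumption: the split point divides the parent's data evenly.
        return self.parent_file.size // 2


@dataclass
class Region:
    name: str
    files: List[Union[HFile, Reference]]

    def disk_size(self) -> int:
        return sum(f.size for f in self.files)


def split(parent: Region):
    """Create two daughters that hold only references to the parent's files."""
    bottom = Region(parent.name + "-a", [Reference(f, "bottom") for f in parent.files])
    top = Region(parent.name + "-b", [Reference(f, "top") for f in parent.files])
    return bottom, top


def compact(region: Region) -> None:
    """Rewrite everything the region can see into one real file of its own."""
    total = sum(
        f.referenced_bytes() if isinstance(f, Reference) else f.size
        for f in region.files
    )
    region.files = [HFile(total)]


parent = Region("parent", [HFile(1_600_000_000)])
daughter_a, daughter_b = split(parent)
print(daughter_a.disk_size())  # tiny: only the reference file
compact(daughter_a)
print(daughter_a.disk_size())  # now about half of the parent's data
```

In this sketch a daughter's directory stays small until `compact` runs, which is one way to read the mix of MB-sized and GB-sized region directories in the listing above.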
