[ 
https://issues.apache.org/jira/browse/HBASE-14838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014035#comment-15014035
 ] 

Josh Elser commented on HBASE-14838:
------------------------------------

bq. If you have 5-6 empty regions and no data in there, do you want normalizer 
to merge them together?

I think that's the ultimate question to answer. Given how the code read, it 
just looked like this was something that wasn't considered. Specifically, in 
working with Romil, we were trying to investigate why the normalizer wasn't 
doing anything with these regions as we expected it to (in a contrived 
environment).

bq. If you have 5-6 empty regions and no data in there, do you want normalizer 
to merge them together?

In this contrived case, we had only written a small amount of data to one of 
the regions (<1MB). I've yet to investigate why the greater-than-zero amount of 
data in one region was ultimately treated as no data (confirmed via a remote 
debugger attached to the master). Because of this, the average size was 
reported as zero (even for a table with a small amount of data in it). Purely 
background information at this point -- I need to look into this again.

bq. As Mikhail Antonov says, if there's no data, we have no way to guess at a 
reasonable distribution of split points.

At first glance, my reaction was that the "reasonable distribution of split 
points" for no data in a table is having no split points. Same goes for small 
amounts of data. I hadn't considered the side effect of the normalizer undoing 
a pre-split table (dev goes to get lunch before starting ingest) which would be 
a confusing story to tell. Perhaps some comments on the class would be 
sufficient to record that "hey, this won't do anything to empty tables". What 
do you guys think?

> SimpleRegionNormalizer does not merge empty region of a table
> -------------------------------------------------------------
>
>                 Key: HBASE-14838
>                 URL: https://issues.apache.org/jira/browse/HBASE-14838
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Romil Choksi
>
> SimpleRegionNormalizer does not merge empty regions of a table.
> Steps to repro:
> - Create an empty table with a few (say 5-6) regions and no data in any of 
> them
> - Scan the hbase:meta table to verify the regions for the table, or check the 
> HMaster UI
> - Enable the normalizer switch and enable normalization for this table
> - Run the normalizer with the 'normalize' command from the hbase shell
> - Verify the regions for the table by scanning the hbase:meta table or 
> checking the HMaster web UI
> The empty regions are not merged on running the region normalizer. This seems 
> to be an edge case with completely empty regions, since the Normalizer checks 
> for: smallestRegion (in this case 0 size) + smallestNeighborOfSmallestRegion 
> (in this case 0 size) < avg region size (in this case 0 size), which can 
> never hold when every size is 0.
> Thanks to [~elserj] for verifying this from the source code side.
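
For reference, the edge case in the description can be reproduced with a small 
standalone sketch. This is not the SimpleRegionNormalizer source: the class and 
method names below are hypothetical, and it assumes the merge check is a strict 
"sum of the two smallest neighbors is less than the average" comparison. Under 
that assumption, a table whose regions all report size 0 never satisfies the 
check, while a table with some data elsewhere does.

```java
import java.util.Arrays;

// Hypothetical stand-in for the merge decision discussed in this issue.
// It is NOT the HBase implementation, only a sketch of the comparison.
class NormalizerMergeCheckSketch {

    // Average region size; returns 0.0 for a table with no regions.
    public static double average(long[] regionSizes) {
        return Arrays.stream(regionSizes).average().orElse(0.0);
    }

    // Assumed rule: merge only when the two smallest neighboring regions
    // together are strictly smaller than the average region size.
    public static boolean shouldMerge(long smallest, long smallestNeighbor,
                                      double avgRegionSize) {
        return smallest + smallestNeighbor < avgRegionSize;
    }

    public static void main(String[] args) {
        // Every region is empty: average is 0.0, and 0 + 0 < 0.0 is false.
        long[] allEmpty = {0, 0, 0, 0, 0};
        System.out.println(shouldMerge(0, 0, average(allEmpty))); // false

        // Two empty regions next to populated ones: average is 6.0,
        // and 0 + 0 < 6.0 is true, so the empty pair would be merged.
        long[] someData = {0, 0, 10, 10, 10};
        System.out.println(shouldMerge(0, 0, average(someData))); // true
    }
}
```

Whether the all-empty case should merge is the open question in this thread; 
the sketch only shows why a strict comparison leaves such a table untouched.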



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
