[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593598#comment-14593598
 ] 

Nick Dimiduk commented on HBASE-13103:
--------------------------------------

That's a max of 250 total regions per region server, not per table. This is a 
rough guideline and will vary with individual cluster configuration. Yes, this 
is definitely related to the 1M regions ticket.

bq. 1) should check that total number of regions doesn't approach the limits of 
AM

Yeah, there should be some upper bound on the total number of regions, which I 
assume would be something like {{$MAX_REGIONS_PER_SERVER * $NUM_SERVERS}}, 
where max regions per server is configurable.
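
As a rough illustration of that bound, here is a minimal sketch. The class and 
method names are hypothetical and not part of any patch on this ticket; the 
default of 250 is just the rough per-server guideline mentioned above, and a 
real implementation would presumably read it from configuration.

```java
// Hypothetical helper, for illustration only; not an existing HBase class.
final class RegionBudget {
    // Assumed default, taken from the rough 250-regions-per-server guideline.
    static final int DEFAULT_MAX_REGIONS_PER_SERVER = 250;

    private RegionBudget() {}

    // Upper bound on total regions: MAX_REGIONS_PER_SERVER * NUM_SERVERS.
    static long maxTotalRegions(int maxRegionsPerServer, int numServers) {
        if (maxRegionsPerServer < 0 || numServers < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // Widen to long before multiplying to avoid int overflow.
        return (long) maxRegionsPerServer * numServers;
    }

    // True if adding the given number of regions would push the cluster past the bound.
    static boolean wouldExceedBudget(long currentRegions, long additionalRegions,
                                     int maxRegionsPerServer, int numServers) {
        return currentRegions + additionalRegions
                > maxTotalRegions(maxRegionsPerServer, numServers);
    }
}
```

A normalizer could call something like {{wouldExceedBudget}} before issuing 
splits and skip the split plan when it returns true.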

bq. 2) we don't break table into ridiculously small regions (less than N hdfs 
blocks?)

Generally yes, but there is the counterexample I mentioned above, where I'm 
new to HBase and my "big table" is only a single region on a single host. 
We want beginners to have a good experience too. More, smaller regions 
spread over an overpowered cluster should result in everything being cached and 
a better intro experience.
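
To make the "less than N hdfs blocks" check concrete, here is a minimal sketch 
of one way to cap the region count for a table. All names and parameters are 
hypothetical; N (minBlocksPerRegion) is assumed to be configurable, so a small 
value would still let a beginner's small table spread over several servers.

```java
// Hypothetical helper, for illustration only; not an existing HBase class.
final class RegionSizeFloor {
    private RegionSizeFloor() {}

    // Largest region count for a table such that each region still holds
    // at least minBlocksPerRegion HDFS blocks. Always allows one region.
    static long maxRegionsForTable(long tableSizeBytes, long hdfsBlockSizeBytes,
                                   int minBlocksPerRegion) {
        if (tableSizeBytes < 0 || hdfsBlockSizeBytes <= 0 || minBlocksPerRegion <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        long minRegionBytes = Math.multiplyExact(hdfsBlockSizeBytes, (long) minBlocksPerRegion);
        return Math.max(1L, tableSizeBytes / minRegionBytes);
    }
}
```

With a 128 MB block size and N = 2, a 1 GB table could be split into at most 4 
regions under this rule, while a 100 MB table would stay at a single region.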

bq. do you think what's discussed here about ideal size should go there, or in 
subsequent ticket?

I'm fine with improvements on the normalizer algorithms going in with 
subsequent patches. I think your harness here is enough to let people get 
started -- for instance, Nasron from the user list thread titled "Stochastic 
Balancer by tables".

> [ergonomics] add region size balancing as a feature of master
> -------------------------------------------------------------
>
>                 Key: HBASE-13103
>                 URL: https://issues.apache.org/jira/browse/HBASE-13103
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, Usability
>            Reporter: Nick Dimiduk
>            Assignee: Mikhail Antonov
>             Fix For: 2.0.0, 1.2.0
>
>         Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
>
>
> Often enough, folks misjudge split points or otherwise end up with a 
> suboptimal number of regions. We should have an automated, reliable way to 
> "reshape" or "balance" a table's region boundaries. This would be for tables 
> that contain existing data. This might look like:
> {noformat}
> Admin#reshapeTable(TableName, int numSplits);
> {noformat}
> or from the shell:
> {noformat}
> > reshape TABLE, numSplits
> {noformat}
> Better still would be to have a maintenance process, similar to the existing 
> Balancer that runs AssignmentManager on an interval, to run the above 
> "reshape" operation on an interval. That way, the cluster will automatically 
> self-correct toward a desirable state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
