[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486884#comment-14486884
 ] 

Mikhail Antonov commented on HBASE-13103:
-----------------------------------------

bq. Yeah, meant the "reshaping" after I identified that something is odd/bad 
about a table. But maybe it's better to just automate, otherwise nobody would 
use it, as you say.

I could have a switch in the hbase config for master, with values like "auto" 
(chore + admin rpc calls accepted), "manual" (no chore, admin calls accepted), 
and "disabled" (no chore, no rpc calls allowed). Or just "auto" and "manual". 
I'm also thinking that exposing more params to adjust the aggressiveness of 
reshaping might help people adopt it. It's probably better to have a policy 
that improves cluster state a little bit, which many people are willing to 
turn on and forget about, than a policy that could theoretically improve 
cluster state a lot but that most production users would be afraid to turn on.
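
To make the switch concrete, here is a minimal sketch of how the three modes 
could be modeled. Everything in it is an assumption for illustration: the 
NormalizerMode name, its accessors, and the parse helper are hypothetical and 
are not existing HBase classes or config keys.

```java
import java.util.Locale;

/** Hypothetical operating modes for the reshaping feature; not an HBase type. */
enum NormalizerMode {
    AUTO(true, true),        // chore runs, admin rpc calls accepted
    MANUAL(false, true),     // no chore, admin rpc calls accepted
    DISABLED(false, false);  // no chore, no rpc calls allowed

    private final boolean choreEnabled;
    private final boolean rpcAllowed;

    NormalizerMode(boolean choreEnabled, boolean rpcAllowed) {
        this.choreEnabled = choreEnabled;
        this.rpcAllowed = rpcAllowed;
    }

    boolean isChoreEnabled() {
        return choreEnabled;
    }

    boolean isRpcAllowed() {
        return rpcAllowed;
    }

    /**
     * Parses a config value case-insensitively. A null or blank value falls
     * back to defaultMode; an unknown value throws IllegalArgumentException.
     */
    static NormalizerMode parse(String value, NormalizerMode defaultMode) {
        if (value == null || value.trim().isEmpty()) {
            return defaultMode;
        }
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

With this shape the master would only need to ask the mode two questions: 
whether to schedule the chore, and whether to accept an admin call.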

As you said (and many users would likely agree!), you'd be hesitant to turn 
it on unless you know that it makes nearly perfect decisions. What if we try 
to formalize these rules, like - 

 - only normalize tables which opted in (like in table descriptor)
 - don't touch regions which served writes in the last N minutes, or served 
more than X reads in the last hour
 - don't normalize if the balancer is running, or any splits/merges are in 
progress
 - don't normalize if the RS hosting the regions we want to split/merge is 
under high load (need to define it)

Maybe you could list some more? Thanks for highlighting that point. Without 
proper, configurable safeguards, many people probably won't enable it.
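
The rules listed above could be expressed as a single guard that has to pass 
before any split/merge is planned. This is only a sketch under stated 
assumptions: RegionFacts, ClusterFacts and NormalizationGuard are hypothetical 
names, not HBase classes, and "high load" is reduced to a boolean because it 
still needs a definition.

```java
import java.time.Duration;

/** Hypothetical per-region facts the rules need; not an HBase type. */
final class RegionFacts {
    final boolean tableOptedIn;       // e.g. a flag in the table descriptor
    final Duration sinceLastWrite;    // time since the region last served a write
    final long readsLastHour;         // reads served during the last hour
    final boolean hostUnderHighLoad;  // "high load" is still to be defined

    RegionFacts(boolean tableOptedIn, Duration sinceLastWrite,
                long readsLastHour, boolean hostUnderHighLoad) {
        this.tableOptedIn = tableOptedIn;
        this.sinceLastWrite = sinceLastWrite;
        this.readsLastHour = readsLastHour;
        this.hostUnderHighLoad = hostUnderHighLoad;
    }
}

/** Hypothetical cluster-wide facts; not an HBase type. */
final class ClusterFacts {
    final boolean balancerRunning;
    final boolean splitOrMergeInProgress;

    ClusterFacts(boolean balancerRunning, boolean splitOrMergeInProgress) {
        this.balancerRunning = balancerRunning;
        this.splitOrMergeInProgress = splitOrMergeInProgress;
    }
}

/** Applies the proposed safeguards; N and X are the configurable thresholds. */
final class NormalizationGuard {
    private final Duration writeQuietPeriod; // the "N minutes" threshold
    private final long maxReadsLastHour;     // the "X reads" threshold

    NormalizationGuard(Duration writeQuietPeriod, long maxReadsLastHour) {
        this.writeQuietPeriod = writeQuietPeriod;
        this.maxReadsLastHour = maxReadsLastHour;
    }

    /** Returns true only if every safeguard allows touching the region. */
    boolean mayNormalize(ClusterFacts cluster, RegionFacts region) {
        if (!region.tableOptedIn) {
            return false; // table did not opt in
        }
        if (region.sinceLastWrite.compareTo(writeQuietPeriod) < 0) {
            return false; // served writes within the last N minutes
        }
        if (region.readsLastHour > maxReadsLastHour) {
            return false; // served more than X reads in the last hour
        }
        if (cluster.balancerRunning || cluster.splitOrMergeInProgress) {
            return false; // cluster is already being rearranged
        }
        return !region.hostUnderHighLoad; // skip regions on a busy RS
    }
}
```

Keeping each rule as a separate early return would make it straightforward to 
add more safeguards, or to log which rule blocked a given region.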



> [ergonomics] add region size balancing as a feature of master
> -------------------------------------------------------------
>
>                 Key: HBASE-13103
>                 URL: https://issues.apache.org/jira/browse/HBASE-13103
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: Usability
>            Reporter: Nick Dimiduk
>            Assignee: Mikhail Antonov
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: HBASE-13103-v0.patch
>
>
> Often enough, folks misjudge split points or otherwise end up with a 
> suboptimal number of regions. We should have an automated, reliable way to 
> "reshape" or "balance" a table's region boundaries. This would be for tables 
> that contain existing data. This might look like:
> {noformat}
> Admin#reshapeTable(TableName, int numSplits);
> {noformat}
> or from the shell:
> {noformat}
> > reshape TABLE, numSplits
> {noformat}
> Better still would be to have a maintenance process, similar to the existing 
> Balancer that runs AssignmentManager on an interval, to run the above 
> "reshape" operation on an interval. That way, the cluster will automatically 
> self-correct toward a desirable state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
