[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108227#comment-15108227 ]

Mikhail Antonov commented on HBASE-13103:
-----------------------------------------

[~stack] - missed that comment, sorry. I've assigned myself a jira to create a refguide section on that.

Normalization is switched on/off in the same way as the balancer or other znode-based trackers.

I'm actually torn on whether we should have it on or off by default. Having it on by default sounds a bit aggressive; having it off might delay adoption. What do you think? I'm inclined to have it on by default with an appropriate release note. If it misbehaves for someone, it's one shell command to disable it completely, and any feedback on such a case would help to improve the "self-healing" heuristics.

> [ergonomics] add region size balancing as a feature of master
> -------------------------------------------------------------
>
>                 Key: HBASE-13103
>                 URL: https://issues.apache.org/jira/browse/HBASE-13103
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, Usability
>            Reporter: Nick Dimiduk
>            Assignee: Mikhail Antonov
>             Fix For: 2.0.0, 1.2.0, 1.3.0
>
>         Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch,
> HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch
>
>
> Often enough, folks misjudge split points or otherwise end up with a
> suboptimal number of regions. We should have an automated, reliable way to
> "reshape" or "balance" a table's region boundaries. This would be for tables
> that contain existing data. This might look like:
> {noformat}
> Admin#reshapeTable(TableName, int numSplits);
> {noformat}
> or from the shell:
> {noformat}
> > reshape TABLE, numSplits
> {noformat}
> Better still would be to have a maintenance process, similar to the existing
> Balancer that runs AssignmentManager on an interval, to run the above
> "reshape" operation on an interval. That way, the cluster will automatically
> self-correct toward a desirable state.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
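To make the "self-correct toward a desirable state" idea in the issue description concrete, here is a minimal sketch of one possible size-based heuristic. It is an illustration only, not the code from the attached patches: the class name NormalizerSketch, the method plan, and the thresholds (split a region larger than twice the table's average size, merge two neighbours whose combined size is below the average) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the normalizer shipped with HBase.
final class NormalizerSketch {

    /**
     * Walks the regions of one table in key order and suggests actions.
     * Assumed rule: split a region larger than 2x the average size,
     * merge a region with its right neighbour if together they are
     * smaller than the average size.
     */
    static List<String> plan(long[] regionSizesMb) {
        List<String> actions = new ArrayList<>();
        if (regionSizesMb.length == 0) {
            return actions;
        }
        long total = 0;
        for (long size : regionSizesMb) {
            total += size;
        }
        double average = (double) total / regionSizesMb.length;

        int i = 0;
        while (i < regionSizesMb.length) {
            long size = regionSizesMb[i];
            if (size > 2 * average) {
                actions.add("split " + i);
                i++;
            } else if (i + 1 < regionSizesMb.length
                    && size + regionSizesMb[i + 1] < average) {
                actions.add("merge " + i + "," + (i + 1));
                i += 2; // both regions are consumed by the merge
            } else {
                i++;
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        // Average is 30 MB: regions 0 and 1 are tiny, region 2 is oversized.
        System.out.println(plan(new long[] {10, 5, 100, 20, 15}));
        // prints [merge 0,1, split 2]
    }
}
```

Run on an interval, a pass like this would nudge a table toward evenly sized regions without the operator having to pick numSplits up front.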
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103526#comment-15103526 ]

stack commented on HBASE-13103:
-------------------------------

Yeah, and add a note on what the 'normalize' process is sir. Thanks boss.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103516#comment-15103516 ]

stack commented on HBASE-13103:
-------------------------------

[~mantonov] The release note is not correct (or have circumstances changed since you wrote it?). It says this feature is off, but it is on by default in 1.2, right sir?
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103549#comment-15103549 ]

stack commented on HBASE-13103:
-------------------------------

nvm [~mantonov] Let's just work on a bit of doc for this new feature instead. Release note was 'correct' as of writing.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730821#comment-14730821 ]

Lars George commented on HBASE-13103:
-------------------------------------

Added linked issue HBASE-14367 for the shell work. It is an easy one but needs a little insight into how to do this best. [~mantonov], you want to take a stab?
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14731177#comment-14731177 ]

Mikhail Antonov commented on HBASE-13103:
-----------------------------------------

Sure, let me assign it to myself. Thanks Lars!
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725773#comment-14725773 ]

Sean Busbey commented on HBASE-13103:
-------------------------------------

if it's in branch-1.2 it's good to go.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725794#comment-14725794 ]

Sean Busbey commented on HBASE-13103:
-------------------------------------

ah. sorry, missed that. Do we have a jira yet? is the ETA a matter of hours or days? I'm probably not going to make RC0 today. I would like to make it tomorrow or Thursday.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725775#comment-14725775 ]

Nick Dimiduk commented on HBASE-13103:
--------------------------------------

bq. if it's in branch-1.2 it's good to go.

They're talking about a new shell feature to better expose what's committed here. So the current feature is on branch-1.2, but the new shell code isn't there yet.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725094#comment-14725094 ]

Lars George commented on HBASE-13103:
-------------------------------------

Nope:
{noformat}
hbase(main):028:0> alter 'testtable', {NORMALIZATION_ENABLED => 'true'}
NameError: uninitialized constant NORMALIZATION_ENABLED
{noformat}
And even if it did work, it requires knowledge of the internal key name (it says so in the Javadoc for the key in HTD).
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725156#comment-14725156 ]

Mikhail Antonov commented on HBASE-13103:
-----------------------------------------

That would be something to go in 1.3 and 2.*, or how do you see it? Does the next 1.2 minor (patch?) release sound good?
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725107#comment-14725107 ]

Lars George commented on HBASE-13103:
-------------------------------------

You may be able to force it like so:
{noformat}
hbase(main):035:0> alter 'normtable', {CONFIGURATION => {'NORMALIZATION_ENABLED' => 'true'}}
Updating all regions with the new schema...
1/1 regions updated.
Done.
{noformat}
but that is error-prone as you could easily misspell the arbitrary key string. I vote for proper shell support.
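The misspelling risk can be shown with a small stand-alone sketch. TableSettingsSketch below is a hypothetical stand-in for a table descriptor, not the real HTableDescriptor: it only assumes that settings end up in a string-keyed map, so a free-form key is accepted whether or not anything reads it, while a dedicated setter cannot be mistyped without a compile error.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a table descriptor; names are assumptions.
final class TableSettingsSketch {
    static final String NORMALIZATION_ENABLED = "NORMALIZATION_ENABLED";

    private final Map<String, String> values = new HashMap<>();

    // Free-form path: any key string is stored, even a misspelled one.
    void setValue(String key, String value) {
        values.put(key, value);
    }

    // Typed path: the key name lives in exactly one place.
    void setNormalizationEnabled(boolean enabled) {
        values.put(NORMALIZATION_ENABLED, Boolean.toString(enabled));
    }

    // A missing key reads as false, so a typo fails silently.
    boolean isNormalizationEnabled() {
        return Boolean.parseBoolean(values.get(NORMALIZATION_ENABLED));
    }

    public static void main(String[] args) {
        TableSettingsSketch table = new TableSettingsSketch();

        table.setValue("NORMALISATION_ENABLED", "true"); // typo: S instead of Z
        System.out.println(table.isNormalizationEnabled()); // prints false

        table.setNormalizationEnabled(true);
        System.out.println(table.isNormalizationEnabled()); // prints true
    }
}
```

The same reasoning applies to the shell: a named attribute lets the shell reject an unknown name up front, where a quoted string inside CONFIGURATION is stored as given.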
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725155#comment-14725155 ]

Mikhail Antonov commented on HBASE-13103:
-----------------------------------------

Totally agreed - and thanks for bringing it up! Would you open a jira for that, or shall I open one? Adding this support shouldn't be a lot of work.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725679#comment-14725679 ]

Nick Dimiduk commented on HBASE-13103:
--------------------------------------

Getting this in for 1.2.0 would be great, if there's time. ping [~busbey].
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723616#comment-14723616 ]

Lars George commented on HBASE-13103:
-------------------------------------

Is there follow-up work or a JIRA tracking adding this to the shell? Is the only way to enable this per table using the Java API?
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723619#comment-14723619 ]

Lars George commented on HBASE-13103:
-------------------------------------

Sorry, above was for [~mantonov] I guess. :) Please advise.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723836#comment-14723836 ]

Mikhail Antonov commented on HBASE-13103:
-----------------------------------------

The per-table normalization enable/disable flag is set in HTableDescriptor, like the compaction flag for example, so you should be able to do it from the shell?

alter 'table1', {NORMALIZATION_ENABLED => 'true'}
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14599574#comment-14599574 ]

Sean Busbey commented on HBASE-13103:
-------------------------------------

Is the quota on # of regions or size of regions?
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14599639#comment-14599639 ]

Sean Busbey commented on HBASE-13103:
-------------------------------------

sure then, let's continue on a new jira.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14599540#comment-14599540 ]

Ted Yu commented on HBASE-13103:
--------------------------------

I have been thinking about the implications of this feature when namespace quota is turned on.

Consider this scenario: the sum of regions of the tables in a particular namespace is close to the quota of this namespace. After some normalization activities, the sum of regions of the tables moves even closer to the quota. When a user wants to create a (pre-split) table in the same namespace, he / she may find out that there is not enough quota left for the new table.

I have a simple patch which disables normalization when the underlying namespace is under quota control. If people think the above idea is plausible, I can create a JIRA so that we continue the discussion there.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
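The quota scenario in the comment above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the patch Ted Yu mentions and not HBase API: the class `NamespaceQuotaGuard`, its methods, and the `NO_QUOTA` sentinel are all hypothetical names. It assumes, as a later comment in this thread confirms, that the quota counts regions, and that a split nets one extra region.

```java
/**
 * Hypothetical sketch, not HBase code: decides whether a normalizer-driven
 * split still fits under a namespace's region quota.
 */
final class NamespaceQuotaGuard {
  /** Assumed sentinel meaning "no region quota configured for this namespace". */
  static final int NO_QUOTA = -1;

  private NamespaceQuotaGuard() {}

  /** A split replaces one region with two, so it needs one free slot under the quota. */
  static boolean splitAllowed(int regionsInNamespace, int maxRegions) {
    if (maxRegions == NO_QUOTA) {
      return true;
    }
    return regionsInNamespace + 1 <= maxRegions;
  }

  /** Region slots still available, e.g. for a new pre-split table in the namespace. */
  static int remainingRegions(int regionsInNamespace, int maxRegions) {
    if (maxRegions == NO_QUOTA) {
      return Integer.MAX_VALUE;
    }
    return Math.max(0, maxRegions - regionsInNamespace);
  }
}
```

With a quota of 10 regions and 9 already in use, `splitAllowed(9, 10)` is true, but after that split `remainingRegions(10, 10)` is 0, so a new pre-split table can no longer be created. The simpler policy described in the comment, skipping normalization entirely whenever a quota is configured, avoids this at the cost of leaving quota-controlled namespaces unnormalized.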
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599586#comment-14599586 ] Ted Yu commented on HBASE-13103: Quota is based on number of regions.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599985#comment-14599985 ] Mikhail Antonov commented on HBASE-13103: Thanks [~te...@apache.org], I opened and linked HBASE-13964 for that.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597915#comment-14597915 ] Nick Dimiduk commented on HBASE-13103: Woo! Nice work [~mantonov]. For reference, please update the FixVersion such that every branch committed to is represented. Right now, master branch is JIRA version 2.0.0; branch-1 is 1.3.0, branch-1.2 is 1.2.0. This JIRA should be marked fixVersions=2.0.0, 1.3.0, 1.2.0.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598150#comment-14598150 ] Nick Dimiduk commented on HBASE-13103: Yeah, I thought as much; no problem. We should really update https://hbase.apache.org/book.html#_guide_for_hbase_committers !
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598114#comment-14598114 ] Mikhail Antonov commented on HBASE-13103: [~ndimiduk] done, I see - thanks! I thought I should only set the earliest branch where it was committed.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597260#comment-14597260 ] Hudson commented on HBASE-13103: FAILURE: Integrated in HBase-TRUNK #6591 (See [https://builds.apache.org/job/HBase-TRUNK/6591/]) HBASE-13103 [ergonomics] add region size balancing as a feature of master (antonov: rev fd37ccb63c545850c08c132b2f6470354a6629f9) * hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * hbase-common/src/main/resources/hbase-default.xml * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java * hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java
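The commit notification above lists SplitNormalizationPlan, MergeNormalizationPlan and EmptyNormalizationPlan, which suggests that a normalizer run yields one of three kinds of plan. This thread does not describe the heuristic that chooses between them, so the following is only a rough sketch under an assumed rule: split the largest region if it is more than twice the average size, otherwise merge the smallest adjacent pair if their combined size is below the average. Every name here (`NormalizationSketch`, `Action`, `Plan`, `plan`) is hypothetical and not taken from the patch.

```java
/**
 * Hypothetical sketch of a region-size normalizer decision.
 * The thresholds are assumptions for illustration, not quoted from HBase.
 */
final class NormalizationSketch {
  enum Action { SPLIT, MERGE, NONE }

  /** Result of one normalizer pass over a table. */
  static final class Plan {
    final Action action;
    /** Region to split, or first region of the adjacent pair to merge; -1 for NONE. */
    final int regionIndex;

    Plan(Action action, int regionIndex) {
      this.action = action;
      this.regionIndex = regionIndex;
    }
  }

  private NormalizationSketch() {}

  /** Sizes are given in key order, so neighbours in the array are adjacent regions. */
  static Plan plan(long[] regionSizesMb) {
    int n = regionSizesMb.length;
    if (n < 3) {
      // Assumption of this sketch: too few regions to normalize meaningfully.
      return new Plan(Action.NONE, -1);
    }
    long total = 0;
    int largest = 0;
    for (int i = 0; i < n; i++) {
      total += regionSizesMb[i];
      if (regionSizesMb[i] > regionSizesMb[largest]) {
        largest = i;
      }
    }
    double average = (double) total / n;
    if (regionSizesMb[largest] > 2 * average) {
      return new Plan(Action.SPLIT, largest);
    }
    int bestPair = -1;
    long bestSum = Long.MAX_VALUE;
    for (int i = 0; i + 1 < n; i++) {
      long sum = regionSizesMb[i] + regionSizesMb[i + 1];
      if (sum < bestSum) {
        bestSum = sum;
        bestPair = i;
      }
    }
    if (bestSum < average) {
      return new Plan(Action.MERGE, bestPair);
    }
    return new Plan(Action.NONE, -1);
  }
}
```

Returning at most one plan per pass fits the issue description's idea of a maintenance process that runs on an interval: each run nudges the table toward a uniform region size rather than reshaping it in one step.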
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597230#comment-14597230 ] Hadoop QA commented on HBASE-13103: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12741223/HBASE-13103-branch-1.v3.patch against branch-1 branch at commit 6a537eb8545c7dd6c01c0d911ad12e789eeab3ae. ATTACHMENT ID: 12741223 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14521//console This message is automatically generated.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597337#comment-14597337 ] Hudson commented on HBASE-13103: SUCCESS: Integrated in HBase-1.3 #10 (See [https://builds.apache.org/job/HBase-1.3/10/]) HBASE-13103 [ergonomics] add region size balancing as a feature of master (antonov: rev 84675ef6159692b0a8da219df5abcf111fe46845) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java * hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java * hbase-common/src/main/resources/hbase-default.xml * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java * hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597364#comment-14597364 ] Hudson commented on HBASE-13103: SUCCESS: Integrated in HBase-1.2-IT #17 (See [https://builds.apache.org/job/HBase-1.2-IT/17/]) HBASE-13103 [ergonomics] add region size balancing as a feature of master (antonov: rev 5d1603f7591d22c212c2869d4cc820790a0a2f11) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java * hbase-common/src/main/resources/hbase-default.xml * hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java * hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597385#comment-14597385 ] Hudson commented on HBASE-13103: SUCCESS: Integrated in HBase-1.3-IT #2 (See [https://builds.apache.org/job/HBase-1.3-IT/2/]) HBASE-13103 [ergonomics] add region size balancing as a feature of master (antonov: rev 84675ef6159692b0a8da219df5abcf111fe46845) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java * hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java * hbase-common/src/main/resources/hbase-default.xml * hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597378#comment-14597378 ] Hudson commented on HBASE-13103: FAILURE: Integrated in HBase-1.2 #24 (See [https://builds.apache.org/job/HBase-1.2/24/]) HBASE-13103 [ergonomics] add region size balancing as a feature of master (antonov: rev 5d1603f7591d22c212c2869d4cc820790a0a2f11) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java * hbase-common/src/main/resources/hbase-default.xml * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java * hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java * hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596185#comment-14596185 ] Sean Busbey commented on HBASE-13103: I'd like to see this in 1.2, but feature freeze is nigh. I'll leave this targeting until I actually cut the RC this afternoon/evening. Feel free to bump out to 1.3 if you don't think things will be ready.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596251#comment-14596251 ] Nick Dimiduk commented on HBASE-13103: Looks great [~mantonov], +1 ship it. Only thing is you're using the Java language {{assert}} in a couple of places in the tests; use JUnit's {{assertTrue}} instead, but that can be fixed on commit.
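The review note above, preferring JUnit's assertTrue over the Java language assert, comes down to one difference: the JVM skips assert statements unless assertions are enabled (for example with -ea), whereas a JUnit assertion is an ordinary method call that is always evaluated. The sketch below shows that difference without depending on JUnit. The class `AssertStyles` and its local `assertTrue` helper are hypothetical stand-ins for org.junit.Assert.assertTrue, written only for this illustration.

```java
/**
 * Hypothetical demo of why a test should not rely on the language-level assert.
 * The assertTrue helper mimics JUnit's behaviour; it is not JUnit itself.
 */
final class AssertStyles {
  private AssertStyles() {}

  /** Stand-in for org.junit.Assert.assertTrue: always evaluated, throws on failure. */
  static void assertTrue(String message, boolean condition) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  /** True only when the JVM runs with assertions enabled; otherwise the assert is skipped. */
  static boolean languageAssertFires() {
    try {
      assert false : "language-level assert";
      return false; // reached when assertions are disabled
    } catch (AssertionError e) {
      return true;
    }
  }

  /** Always true: the method-call style check cannot be switched off by JVM flags. */
  static boolean junitStyleAssertFires() {
    try {
      assertTrue("junit-style assert", false);
      return false;
    } catch (AssertionError e) {
      return true;
    }
  }
}
```

Whether `languageAssertFires()` returns true depends on how the JVM was launched, which is the reason a test built on assert can pass silently in one environment and fail in another; `junitStyleAssertFires()` gives the same answer everywhere.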
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595466#comment-14595466 ] Mikhail Antonov commented on HBASE-13103: Test failures don't seem to be related (TestRegionRebalancing is generally flaky and fails for me on and off, and the visibility labels tests pass on my local machine). The checkstyle issues (missing final on 2 classes) I'll fix in the next version of the patch.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596309#comment-14596309 ] Mikhail Antonov commented on HBASE-13103: Thanks [~ndimiduk] (will fix these remaining nits on commit), [~busbey] - thanks! I'll commit this in an hour or two unless there are objections.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597046#comment-14597046 ] Ted Yu commented on HBASE-13103: Latest patch should be good to go. There is room for improvement which can be addressed in follow-on JIRAs.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597089#comment-14597089 ]
Mikhail Antonov commented on HBASE-13103:
Thanks [~te...@apache.org], agree there's room to further improve. I'm going to commit v3 to master, branch-1 and branch-1.2 then.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596982#comment-14596982 ]
Hadoop QA commented on HBASE-13103:
{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12741160/HBASE-13103-v3.patch against master branch at commit d51a184051d968dc3bdc00b1c9257c0a9e5ff8a6.
ATTACHMENT ID: 12741160
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14511//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14511//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14511//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14511//console
This message is automatically generated.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597180#comment-14597180 ]
Mikhail Antonov commented on HBASE-13103:
Committed to master; will commit a version for branch-1 and branch-1.2 shortly.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594427#comment-14594427 ]
Hadoop QA commented on HBASE-13103:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12740774/HBASE-13103-v2.patch against master branch at commit db08013ebeeaa85802d9795cc72b4c29c5338a47.
ATTACHMENT ID: 12740774
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1908 checkstyle errors (more than the master's current 1906 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.TestRegionRebalancing
{color:red}-1 core zombie tests{color}. There are 5 zombie test(s):
at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testVisibilityLabelsThatDoesNotPassTheCriteria(TestVisibilityLabels.java:231)
at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testVisibilityLabelsInGetThatDoesNotMatchAnyDefinedLabels(TestVisibilityLabels.java:400)
at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes.testVisibilityLabelsWithDeleteColumnsWithNoMatchVisExpWithMultipleVersionsNoTimestamp(TestVisibilityLabelsWithDeletes.java:376)
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14477//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14477//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14477//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14477//console
This message is automatically generated.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch, HBASE-13103-v2.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594050#comment-14594050 ]
Mikhail Antonov commented on HBASE-13103:
bq. Yeah, there should be some upper bound on the total number of regions, which I assume would be something like $MAX_REGIONS_PER_SERVER * $NUM_SERVERS, where max regions per server is configurable.
I thought the limit was the other way around, i.e. the total number of regions is more or less fixed (as the assignment manager won't handle that properly); hence, increasing the number of region servers would inevitably decrease the number of regions per RS - larger and larger regions?
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594149#comment-14594149 ]
Nick Dimiduk commented on HBASE-13103:
Yes, there's a global upper bound as well; there is much more discussion over on HBASE-11165. I'm talking about a local upper bound on an individual region server's ability to maintain active regions online for reads and writes. Usually this is constrained by memory pressure due to active memstores of open regions. I dunno though. Maybe [~lhofhansl] and [~phobos182] want to meter this upper bound differently?
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593220#comment-14593220 ]
Mikhail Antonov commented on HBASE-13103:
Yeah. We used to mention here that a region has some ideal size and we should try to get each region to this size, and I think we mentioned that the ideal size might be a fixed fraction of max size or something like that. Maybe it needs to be more configurable.
I guess you assume here that every large table is supposed to be spread across all RSs, and not just some subset (group?) of them? Also, to make sure I understand right, when you say 250 regions per RS, do you mean 250 regions of each table, or across all tables?
Also, this number of regions per RS: I suppose we can derive it dynamically, like (max number of regions total in cluster, as limited by AM performance, see issue about scaling to 1M regions) / # of RS? Total max number of regions could be set in config, like 100k or 300k?
I'm thinking about roughly the same logic for the lower and upper ends. For the lower end, another implicit threshold would be the max size of each region. For the upper limit, I think there should be 2 more guards: 1) we should check that the total number of regions doesn't approach the limits of AM, and 2) we don't break a table into ridiculously small regions (less than N hdfs blocks?).
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
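The two ideas in the comment above (deriving a per-RS region budget from a cluster-wide cap, and guarding splits) can be sketched roughly as follows. This is only an illustration, not code from any HBASE-13103 patch: the class `SplitGuards`, its method names, and all parameters are hypothetical, and "doesn't approach the limits of AM" is simplified here to "does not exceed a configured cap".

```java
// Hypothetical sketch; none of these names come from HBase itself.
final class SplitGuards {
    private SplitGuards() {}

    /** Per-server region budget derived from a cluster-wide cap (e.g. 100k or 300k). */
    static long regionsPerServer(long maxTotalRegions, int numServers) {
        if (numServers <= 0) {
            throw new IllegalArgumentException("numServers must be positive");
        }
        return maxTotalRegions / numServers;
    }

    /**
     * Returns true if splitting one region in two passes both guards:
     * 1) the cluster stays within the configured total-region cap, and
     * 2) each daughter region is at least minBlocksPerRegion HDFS blocks large.
     */
    static boolean maySplit(long regionSizeBytes, long currentTotalRegions,
                            long maxTotalRegions, long hdfsBlockSizeBytes,
                            int minBlocksPerRegion) {
        // A split replaces one region with two, so the total grows by one.
        boolean withinClusterCap = currentTotalRegions + 1 <= maxTotalRegions;
        // Assumes the split is roughly even, so each daughter gets half the data.
        boolean daughtersLargeEnough =
            regionSizeBytes / 2 >= hdfsBlockSizeBytes * minBlocksPerRegion;
        return withinClusterCap && daughtersLargeEnough;
    }
}
```

With a cap of 100k regions and 400 region servers, `regionsPerServer` gives 250, which happens to match the rough per-server guideline mentioned elsewhere in this thread.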
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593235#comment-14593235 ]
Mikhail Antonov commented on HBASE-13103:
Thanks for the review; I will fix the remaining items and update the patch. (Do you think what's discussed here about ideal size should go there, or in a subsequent ticket?)
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593598#comment-14593598 ]
Nick Dimiduk commented on HBASE-13103:
Max of 250 total regions on a region server, not per table. This is a rough guideline, and will vary based on individual cluster configuration. Yes, this is definitely related to the 1M regions ticket.
bq. 1) should check that total number of regions doesn't approach the limits of AM
Yeah, there should be some upper bound on the total number of regions, which I assume would be something like {{$MAX_REGIONS_PER_SERVER * $NUM_SERVERS}}, where max regions per server is configurable.
bq. 2) we don't break table into ridiculously small regions (less than N hdfs blocks?)
Generally yes, but there is the counterexample I mentioned above, where I'm new to HBase and my big table is only a single region on a single host. We want the beginners to have a good experience too. More, smaller regions spread over an overpowered cluster should result in everything being cached and a better intro experience.
bq. do you think what's discussed here about ideal size should go there, or in subsequent ticket?
I'm fine with improvements on the normalizer algorithms going in with subsequent patches. I think your harness here is enough to let people get started -- for instance, Nasron from the user list thread titled Stochastic Balancer by tables.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592689#comment-14592689 ]
Nick Dimiduk commented on HBASE-13103:
Left some comments on RB. Further thinking about the use-case, this chore is aiming for an ideal state of even cluster utilization. We seem to think of this in terms of (1) evenly distributed load and (2) region servers are not hosting more regions than they can hold -- regions are sized just right. We assume schema design results in a natural application load over keys, so (1) can be approximated by uniform region size and count. Uniform count/server is handled by the Balancer, which leaves the Normalizer to worry about overall count and size. Too few overall and you have unused hosts (I just stood up a 10 node cluster but only one host is doing work!), too many and you end up with 1k regions/server.
At the lower end, we probably want to split relatively empty tables toward a goal of {{# of regions = 2x number of region servers}}. Or maybe 3x or 5x? At the upper end, we want to push toward a target of ~250 regions per region server and those regions being of uniform size if possible.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
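As a rough illustration of the lower-end and upper-end targets described in the comment above, the decision could look something like the sketch below. It is an assumption-laden toy, not the normalizer that was committed: the class `RegionCountTargets`, the `Action` enum, and every parameter are hypothetical, and region sizes are ignored entirely.

```java
// Hypothetical sketch; names and thresholds are illustrative only.
final class RegionCountTargets {
    enum Action { SPLIT, MERGE, NONE }

    private RegionCountTargets() {}

    /**
     * Suggests a direction based on region counts alone.
     *
     * @param tableRegions        regions currently in the table under consideration
     * @param clusterRegions      regions across all tables in the cluster
     * @param numServers          number of live region servers
     * @param lowMultiplier       lower-end goal per table, e.g. 2 (or 3, or 5)
     * @param maxRegionsPerServer upper-end guideline, e.g. 250
     */
    static Action suggest(int tableRegions, int clusterRegions, int numServers,
                          int lowMultiplier, int maxRegionsPerServer) {
        if (numServers <= 0) {
            throw new IllegalArgumentException("numServers must be positive");
        }
        long upper = (long) maxRegionsPerServer * numServers;
        long lower = (long) lowMultiplier * numServers;
        if (clusterRegions > upper) {
            return Action.MERGE;   // servers host more regions than the guideline
        }
        if (tableRegions < lower) {
            return Action.SPLIT;   // table cannot spread over all servers yet
        }
        return Action.NONE;
    }
}
```

For the 10-node cluster in the comment, a single-region table with a multiplier of 2 falls below the lower goal of 20 regions, so the sketch suggests a split.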
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592702#comment-14592702 ]
Nick Dimiduk commented on HBASE-13103:
Oh, and probably we'll want coprocessor hooks pre- and post balancer invocation, but that can be a follow-on ticket.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585051#comment-14585051 ]
Hadoop QA commented on HBASE-13103:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12739481/HBASE-13103-v1.patch against master branch at commit 682b8ab8a542a903e5807053282693e3a96bad2d.
ATTACHMENT ID: 12739481
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1908 checkstyle errors (more than the master's current 1907 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.master.regionnormalizer.TestSimpleRegionNormalizerOnCluster
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14403//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14403//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14403//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14403//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14403//console
This message is automatically generated.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585016#comment-14585016 ]
Mikhail Antonov commented on HBASE-13103:
bq. it'll be good to get this out for folks to start playing with it in 1.1.0.
That's what I'm thinking too. Apparently nobody is going to turn it on in a production environment (yet); I'm wondering what would be the most conservative yet usable strategy that folks may want to play with in some sandbox clusters.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554925#comment-14554925 ]
Nick Dimiduk commented on HBASE-13103:
Ping [~toffer], this is the ticket I mentioned. In another conversation I had recently, it occurred to me that this would be really handy for folks running elastic deployments, environments like EC2, YARN/Slider or Mesos where clusters are intentionally growing and shrinking capacity as business requirements change (cc [~clehene], [~stmcpherson], [~ste...@apache.org])
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516547#comment-14516547 ]
Mikhail Antonov commented on HBASE-13103:
Yeah :( that still needs some more work from my side to incorporate the feedback, and probably several more rounds of review to get something ready for folks to try.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Improvement Components: Balancer, Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13103-v0.patch
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497774#comment-14497774 ]
Mikhail Antonov commented on HBASE-13103:

All right, let me go over the suggestions here and I'll try to roll out the next version of the patch next week or so.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498445#comment-14498445 ]
Nick Dimiduk commented on HBASE-13103:

Thanks [~mantonov], it'll be good to get this out for folks to start playing with it in 1.1.0.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486831#comment-14486831 ]
Mikhail Antonov commented on HBASE-13103:

We could also have a variation of reshaping which doesn't take any action itself, but writes the recommended merge/split commands to a file and makes it available somewhere?
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486884#comment-14486884 ]
Mikhail Antonov commented on HBASE-13103:

bq. Yeah, meant the reshaping after I identified that something is odd/bad about a table. But maybe it's better to just automate, otherwise nobody would use it, as you say.

I could have a switch in the HBase config for the master: auto (chore runs, admin RPC calls accepted), manual (no chore, admin calls accepted), disabled (no chore, no RPC calls allowed). Or just auto and manual.

I'm also thinking that exposing more params to adjust the aggressiveness of reshaping might help people adopt it. It's probably better to have a policy which improves cluster state a little bit and which many people are willing to turn on and forget about, rather than a policy which could theoretically improve cluster state a lot but which most production users would be afraid to turn on. As you said (and many users would likely agree!), you'd be hesitant to turn it on unless you know that it makes nearly perfect decisions.

What if we try to formalize these rules, like:
- only normalize tables which opted in (e.g. in the table descriptor)
- don't touch regions which served writes in the last N minutes, or served more than X reads in the last hour
- don't normalize if the balancer is in progress, or any splits/merges are in progress
- don't normalize if the RS hosting the regions we want to split/merge is under high load (need to define it)

Maybe you could list some more? Thanks for highlighting that point. Without proper, configurable safeguards, many people probably won't have it enabled.
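For illustration, the safeguard rules proposed in the comment above could be expressed as a single eligibility check. This is only a sketch under stated assumptions: `RegionFacts`, `ClusterFacts`, `mayNormalize` and the two threshold parameters are hypothetical names invented here, not classes or methods from HBase or from the attached patch, and "high load" is reduced to a boolean because the comment leaves it undefined.

```java
public class NormalizationSafeguards {

    /** Hypothetical per-region facts the rules need; not an HBase class. */
    static final class RegionFacts {
        final boolean tableOptedIn;
        final long minutesSinceLastWrite;
        final long readsInLastHour;
        final boolean hostUnderHighLoad; // "high load" is left undefined in the thread

        RegionFacts(boolean tableOptedIn, long minutesSinceLastWrite,
                    long readsInLastHour, boolean hostUnderHighLoad) {
            this.tableOptedIn = tableOptedIn;
            this.minutesSinceLastWrite = minutesSinceLastWrite;
            this.readsInLastHour = readsInLastHour;
            this.hostUnderHighLoad = hostUnderHighLoad;
        }
    }

    /** Hypothetical cluster-wide facts; not an HBase class. */
    static final class ClusterFacts {
        final boolean balancerRunning;
        final boolean splitOrMergeInProgress;

        ClusterFacts(boolean balancerRunning, boolean splitOrMergeInProgress) {
            this.balancerRunning = balancerRunning;
            this.splitOrMergeInProgress = splitOrMergeInProgress;
        }
    }

    /** Returns true only if every proposed safeguard allows normalization. */
    static boolean mayNormalize(RegionFacts r, ClusterFacts c,
                                long quietMinutes, long maxReadsPerHour) {
        if (!r.tableOptedIn) return false;                        // opt-in only
        if (r.minutesSinceLastWrite < quietMinutes) return false; // written to in the last N minutes
        if (r.readsInLastHour > maxReadsPerHour) return false;    // more than X reads in the last hour
        if (c.balancerRunning || c.splitOrMergeInProgress) return false;
        if (r.hostUnderHighLoad) return false;
        return true;
    }

    public static void main(String[] args) {
        ClusterFacts idle = new ClusterFacts(false, false);
        RegionFacts quiet = new RegionFacts(true, 120, 50, false);
        RegionFacts justWritten = new RegionFacts(true, 2, 50, false);
        System.out.println(mayNormalize(quiet, idle, 30, 1000));       // true
        System.out.println(mayNormalize(justWritten, idle, 30, 1000)); // false
    }
}
```

Keeping each rule as its own early return makes it easy to add further safeguards later, or to make individual ones configurable.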
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486812#comment-14486812 ]
Lars Hofhansl commented on HBASE-13103:

Just reading through the comments here.

Unless the reshaper perfectly takes all factors into account, I'd be very hesitant to run it on our clusters. By perfect I mean that it knows about load, disk IO, etc. Since that's hard (or impossible), I think I'd prefer to trigger this manually, as suggested in the description. But maybe I am overly cautious.

Split decisions can be made locally and are rarely bad (unless really excessive). Merge decisions (a) need global knowledge, since not all regions may be on the same server, and (b) can possibly lead to worse performance (hot regions merged together, etc).

What we can automate is merging empty regions away.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486827#comment-14486827 ]
Mikhail Antonov commented on HBASE-13103:

[~larsh]

bq. I think I'd prefer to trigger this manually

You mean you'd prefer to do splits and merges manually, or you'd prefer to kick off the reshaping manually (via an admin command, rather than letting it run as a chore)?

I'm thinking about what could be done to make it safer and more conservative, while still relieving the cluster admin of at least some housekeeping tasks. If this isn't safe, most people probably just won't turn it on.

- since, as you said, split decisions are generally safer than merge decisions, we could have a policy which is much more conservative in merging than it is in splitting
- regarding the load: you think what's there in ServerLoad and RegionLoad won't suffice? Would it help if we grab some OS-level info in ServerLoad (or a similar class)?
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486841#comment-14486841 ]
Lars Hofhansl commented on HBASE-13103:

Yeah, meant the reshaping after I identified that something is odd/bad about a table. But maybe it's better to just automate, otherwise nobody would use it, as you say.

Splits already happen automatically with nice, simple, local-only logic. Do we need more logic for those? (But we could get rid of IncreasingToUpperBoundRegionSplitPolicy and combine it all in one class, which would be nice.)

bq. could have policy which is much more conservative in merging, than it is in splitting

I think that'd be nice. With IncreasingToUpperBoundRegionSplitPolicy it's possible that we get a 2x size difference between regions for a bit. It's hard to say whether a region will be written to in the future, so it's hard to avoid an early merge. Maybe we can track the age of a region? And then favor older regions for merges unless they're hot...

bq. ServerLoad and RegionLoad won't suffice you think?

You're right, that's probably all the information we need. And if not, we'd add it.
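As a rough illustration of the "favor older regions for merges unless they're hot" idea from the comment above, the sketch below orders merge candidates by age and skips hot regions. Everything in it is an assumption made for the example: the `Region` holder, its `ageHours` and `hot` fields, and `mergeCandidates` are hypothetical and do not correspond to HBase classes or to the patch, and the thread does not say how "hot" would be measured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MergeCandidateOrder {

    /** Hypothetical region summary; not an HBase class. */
    static final class Region {
        final String name;
        final long ageHours; // assumed: time since the region was created
        final boolean hot;   // assumed: set by some load heuristic

        Region(String name, long ageHours, boolean hot) {
            this.name = name;
            this.ageHours = ageHours;
            this.hot = hot;
        }
    }

    /** Names of non-hot regions, oldest first, as merge candidates. */
    static List<String> mergeCandidates(List<Region> regions) {
        List<Region> eligible = new ArrayList<>();
        for (Region r : regions) {
            if (!r.hot) {
                eligible.add(r); // hot regions are never merge candidates
            }
        }
        // Oldest regions first: they are less likely to still be growing.
        eligible.sort(Comparator.comparingLong((Region r) -> r.ageHours).reversed());
        List<String> names = new ArrayList<>();
        for (Region r : eligible) {
            names.add(r.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
            new Region("a", 5, false),
            new Region("b", 100, true),
            new Region("c", 50, false));
        System.out.println(mergeCandidates(regions)); // [c, a]
    }
}
```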
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486829#comment-14486829 ]
Mikhail Antonov commented on HBASE-13103:

I meant [~lhofhansl], sorry, mistyped.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484842#comment-14484842 ]
Mikhail Antonov commented on HBASE-13103:

Thanks for the review, guys!

bq. I'd like this to eventually be a globally enabled feature, with opt-out via table configuration. For its initial commit, it should probably be opt-in instead. Having a global kill switch is probably a good idea too.

How about adding a boolean flag in the table descriptor (false by default) to bring a table to the normalizer's attention? The global switch would still be in place to turn everything off if so desired (probably, just like with the balancer, there should be an admin RPC call to turn it on/off?)

bq. Yes, priorities will become a useful feature. I think what you have here is a nice, committable first pass though.

Thanks, yeah, that would probably be a subsequent patch (as well as running in parallel on all opted-in tables).

bq. Why not touch tables with fewer than 3 regions?

Two considerations I had in mind: tables with fewer than 3 regions are probably too small to be hot spots in terms of cluster throughput (that may not be true?), and they would also require more thought-out rules. Let's say you have a table with 2 regions, 10 and 70 GB. The current logic would say: well, 70 is less than 40 * 2, so no normalization for you :) What you may really want is 2 regions of 40, or 5 regions of 16. Need to have an ideal region size?
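The two-region example in the comment above can be worked through in a few lines. This is a sketch of the rule as the comment describes it (a region is a split candidate only if it exceeds twice the table's average region size); the class and method names are hypothetical, and the actual patch may compute this differently.

```java
import java.util.Arrays;
import java.util.List;

public class SimpleSplitRule {

    /** Average region size in GB for one table. */
    static double averageSizeGb(List<Double> sizesGb) {
        double total = 0;
        for (double size : sizesGb) {
            total += size;
        }
        return total / sizesGb.size();
    }

    /** True if the region is more than twice the table's average size. */
    static boolean shouldSplit(double regionSizeGb, List<Double> allSizesGb) {
        return regionSizeGb > 2 * averageSizeGb(allSizesGb);
    }

    public static void main(String[] args) {
        List<Double> sizes = Arrays.asList(10.0, 70.0);
        System.out.println(averageSizeGb(sizes));     // 40.0
        System.out.println(shouldSplit(70.0, sizes)); // false, since 70 < 2 * 40
    }
}
```

With only two regions, the larger one can never exceed twice the average (it would have to be larger than the table itself), which is one way to see why an average-based rule needs a separate notion of ideal region size for very small tables.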
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484844#comment-14484844 ]
Mikhail Antonov commented on HBASE-13103:

(sorry, the formatting slid off)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484867#comment-14484867 ]
Mikhail Antonov commented on HBASE-13103:

[~phobos182] thanks for the feedback! Very useful. I have a lot of questions I'd like to ask, if you don't mind, to better understand the real needs.

bq. Given the time difference between when the commands were run, this could end up with different region boundaries between the clusters, which is not desired. So I second the idea of generating a reshaping plan so it can be applied in the same manner on the slave cluster.

- How strictly consistent are your master and slave clusters? How much can they diverge? Is the second cluster mostly for long-running analytics, which only dumps output into some other table?
- So you don't have automatic splits now, as I understand, only pre-split tables? Otherwise, how are you ensuring that the region boundaries are exactly the same? What's the average region size?
- Do you want region boundaries to be exactly the same, or approximately the same?

The current patch has the notion of a reshaping plan, which includes params like the split point (currently not computed though :) ). It'd be feasible to send these plans to the normalizer on the other side (or rather, to expose a normalize() call in the master RPC services which accepts a serialized reshaping plan), but the region names wouldn't be the same anyway.

bq. Probably should think about performing a major compaction operation before the normalize policy runs.

Yeah, that makes sense. Though I think most people run major compactions infrequently, so making this a prerequisite would change that operational practice? How often do you run major compactions?
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485581#comment-14485581 ]
Nick Dimiduk commented on HBASE-13103:

[~mantonov]:

bq. probably just like with balancer, there should be admin rpc call to turn balancer on/off?

Yes, that would be good. Exposure through the shell would be desirable as well, along with a "get status" command.

bq. Need to have ideal region size?

That's a good point. Probably the ideal size is some percentage (70%?) of the max region size, with a "close enough" allowance (i.e., this normalizer's target region size is 70 +/- 5% of {{hbase.hregion.max.filesize}}).

Thanks for coming around [~phobos182]!

bq. Since this operation is pretty impactful on performance...

I see this not as a single operation you run to normalize a table all at once, but rather as something that happens in the background all the time, a kind of active anti-entropy happening behind the scenes to nudge a table into an ideal state. You think even a single split/merge operation is too heavyweight to be done without premeditation?
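To make the "70 +/- 5%" suggestion above concrete, here is a small sketch that checks whether a region falls inside the target window. It assumes one particular reading of the comment, namely that the window spans 65% to 75% of {{hbase.hregion.max.filesize}}; the class, constants and method names are hypothetical, and the 10 GiB maximum in `main` is just an example value, not a claim about any cluster's configuration.

```java
public class TargetRegionSize {

    static final long TARGET_PERCENT = 70;   // suggested ideal size, as % of max region size
    static final long TOLERANCE_PERCENT = 5; // "close enough" allowance, in percentage points

    /** Lower edge of the target window, in bytes. */
    static long lowerBound(long maxFileSizeBytes) {
        return maxFileSizeBytes * (TARGET_PERCENT - TOLERANCE_PERCENT) / 100;
    }

    /** Upper edge of the target window, in bytes. */
    static long upperBound(long maxFileSizeBytes) {
        return maxFileSizeBytes * (TARGET_PERCENT + TOLERANCE_PERCENT) / 100;
    }

    /** True if the region is already close enough to the target size. */
    static boolean isCloseEnough(long regionBytes, long maxFileSizeBytes) {
        return regionBytes >= lowerBound(maxFileSizeBytes)
            && regionBytes <= upperBound(maxFileSizeBytes);
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        long maxFileSize = 10 * gib; // example value for hbase.hregion.max.filesize
        System.out.println(isCloseEnough(7 * gib, maxFileSize)); // true
        System.out.println(isCloseEnough(5 * gib, maxFileSize)); // false
        System.out.println(isCloseEnough(9 * gib, maxFileSize)); // false
    }
}
```

Integer percentages are used instead of floating point so that the window edges are exact.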
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483558#comment-14483558 ]
Nick Dimiduk commented on HBASE-13103:

Nice work [~mantonov]. I left some comments over on RB.

bq. being able to choose which table to normalize

I'd like this to eventually be a globally enabled feature, with opt-out via table configuration. For its initial commit, it should probably be opt-in instead. Having a global kill switch is probably a good idea too.

bq. need to define normalization rules more strictly (including priority of operations? if table has both types of outlier in the ranks of its regions - too small and too big regions, then what action is more urgent)

Yes, priorities will become a useful feature. I think what you have here is a nice, committable first pass though.

bq. run normalization across several tables in parallel - is that something we should/shouldn't do

Probably that's something we can and should do. It can be a future patch though.

bq. detecting currently running merges and splits. Current simple rules are just that we don't touch system tables and tables with less than 3 regions.

Why not touch tables with fewer than 3 regions?

These are all good questions for our operator friends. [~eclark], [~toffer], [~lhofhansl] any opinions here? Think you fellas may be interested in this feature.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484275#comment-14484275 ]
Jeremy Carroll commented on HBASE-13103:

A few comments from my side of things. As background, in our architecture we run master / slave HBase clusters with replication set up between them.

- Since this operation is pretty impactful on performance, most likely we would do this on the slave cluster first, switch the roles between master / slave, then run the command on the master. Given the time difference between when the commands were run, this could end up with different region boundaries between the clusters, which is not desired. So I second the idea of generating a reshaping plan so it can be applied in the same manner on the slave cluster.
- Probably should think about performing a major compaction operation before the normalize policy runs. We have a lot of tombstones on some of our clusters, which can inflate the region size by 60%. Splitting / merging in this condition is not ideal, since the condition is temporary. The steady state after a compaction is more realistic.

I think it's a great feature, though most of our clusters are balanced for QPS distribution, as CPU is one of our primary capacity planning metrics. Any tool which makes it easier to recover from pre-splitting mistakes is welcome.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482210#comment-14482210 ]
Mikhail Antonov commented on HBASE-13103:

[~ndimiduk] any thoughts on the patch? :)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393411#comment-14393411 ]
Mikhail Antonov commented on HBASE-13103:

https://reviews.apache.org/r/32790/ - up on RB
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390744#comment-14390744 ]
Mikhail Antonov commented on HBASE-13103:

Since this is a draft, many obviously needed things are missing, namely:
- being able to choose which table to normalize
- need to define normalization rules more strictly (including priority of operations? if a table has both types of outlier among its regions, too small and too big, then which action is more urgent?)
- running normalization across several tables in parallel: is that something we should or shouldn't do?
- detecting currently running merges and splits

The current simple rules are just that we don't touch system tables and tables with fewer than 3 regions.
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14391934#comment-14391934 ] Mikhail Antonov commented on HBASE-13103: - TestRegionServerObserver failure is unrelated, javadoc checkstyle will be fixed in v1 patch [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Attachments: HBASE-13103-v0.patch Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390977#comment-14390977 ]
Hadoop QA commented on HBASE-13103:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708703/HBASE-13103-v0.patch against master branch at commit 874aa9eb85077a4e5ab42d06820692ed379775ca.
ATTACHMENT ID: 12708703
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 8 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 10 warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 1928 checkstyle errors (more than the master's current 1924 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings (more than the master's current 0 warnings).
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
{color:red}-1 core zombie tests{color}. There are 1 zombie test(s): at org.apache.hadoop.hbase.coprocessor.TestRegionServerObserver.testCoprocessorHooksInRegionsMerge(TestRegionServerObserver.java:100)
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13520//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13520//artifact/patchprocess/patchReleaseAuditWarnings.txt
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13520//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13520//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13520//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13520//console
This message is automatically generated.
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Attachments: HBASE-13103-v0.patch Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14370963#comment-14370963 ] Mikhail Antonov commented on HBASE-13103: - Is it fair to say this one supersedes the one I linked? [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372509#comment-14372509 ] Nick Dimiduk commented on HBASE-13103: -- Ha! Seems of identical intent. Good find. Might as well mark this one as a dupe and carry on. [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Assignee: Mikhail Antonov Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364754#comment-14364754 ]
Mikhail Antonov commented on HBASE-13103:
Some notes as I'm sketching the draft:
- generally it looks like the basic balancing architecture would just work: a chore on master, interface + pluggable initialization, main invocation kicked off in HMaster, period and cutoff time limits.
- runs on a per-table basis (for a first cut we could just do all or nothing, then add normalization params to table-level configuration if needed)
- the normalizer computes a list of normalization plans (each is simply either "split R1" or "merge R1 and R2"); those plans are then executed one by one. I guess we don't want more than one merge or split going on per table in most cases? Executing a plan is simply figuring out the currently assigned HRS and requesting the split over RPC.
- the whole thing is stateless: if master crashes during normalization, the plans will be recomputed on the next scheduled iteration anyway
[~ndimiduk] thoughts?
[ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
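The "list of normalization plans" idea in the notes above can be illustrated with a small, stateless sketch that recomputes plans from region sizes on every call. The thresholds are an assumption made purely for illustration (split a region larger than twice the table average, merge two adjacent regions whose combined size is below the average); the thread does not specify the actual heuristics. `NormalizerSketch`, `Plan`, and `computePlans` are hypothetical names, not HBase APIs.

```java
import java.util.ArrayList;
import java.util.List;

final class NormalizerSketch {
    // A plan is either "split one region" or "merge two adjacent regions".
    static final class Plan {
        final String kind;  // "SPLIT" or "MERGE"
        final int first;    // index of the region to split, or the left region of a merge
        final int second;   // index of the right region of a merge, -1 for a split

        Plan(String kind, int first, int second) {
            this.kind = kind;
            this.first = first;
            this.second = second;
        }
    }

    // Computes plans from region sizes (in MB) listed in key order.
    // Nothing is remembered between calls, so a crashed master can simply
    // recompute on the next scheduled iteration.
    static List<Plan> computePlans(long[] sizesMb) {
        List<Plan> plans = new ArrayList<>();
        if (sizesMb.length == 0) {
            return plans;
        }
        long total = 0;
        for (long s : sizesMb) {
            total += s;
        }
        double avg = (double) total / sizesMb.length;

        int i = 0;
        while (i < sizesMb.length) {
            if (sizesMb[i] > 2 * avg) {
                // Assumed rule: a region more than twice the average is split.
                plans.add(new Plan("SPLIT", i, -1));
                i++;
            } else if (i + 1 < sizesMb.length && sizesMb[i] + sizesMb[i + 1] < avg) {
                // Assumed rule: two adjacent small regions are merged.
                plans.add(new Plan("MERGE", i, i + 1));
                i += 2; // both regions are consumed by the merge
            } else {
                i++;
            }
        }
        return plans;
    }
}
```

For region sizes `{10, 10, 100, 500, 100}` the average is 144 MB, so this sketch yields one merge of the first two regions and one split of the 500 MB region. The plans would then be executed one at a time, as the notes suggest.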
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366112#comment-14366112 ] Mikhail Antonov commented on HBASE-13103: - Yep, that's how I see it too. Practically, I think splitting excessively large regions (in environments with a very high max region size or a pre-split policy in place) would be a more frequent use case than merging, since tables tend to grow over time, not shrink. [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366193#comment-14366193 ] Nick Dimiduk commented on HBASE-13103: -- I've been thinking a lot about this feature in the context of a user who's upgrading from 0.92 or 0.94 with little 1-2g regions. Suddenly they have way more regions than necessary, so the region merge becomes very useful. [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366195#comment-14366195 ] Mikhail Antonov commented on HBASE-13103: - Huh, right. I didn't think from that perspective. Thanks for the pointer. [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366255#comment-14366255 ] Dave Latham commented on HBASE-13103: - We also have cases where some regions of data are correlated with time and also have a TTL. Over time those regions end up with little data in them and other regions end up with more data. It would be great for the small regions to merge. [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366012#comment-14366012 ] Nick Dimiduk commented on HBASE-13103: -- Sounds about right! I'd say the plan can come up with a single action to execute per table, and then sleep again -- merging has an impact on availability. Perhaps it wakes up more frequently than the BalancerChore, maybe every minute instead of every 5 by default. It should also have some back-pressure mechanism, so that it doesn't start a new merge if an existing merge is gummed up. I don't think we can do that without HBASE-12439. Maybe it cannot be stateless if that's the case, or perhaps it can check whether any merge operations are in flight for the target table and, if so, go back to sleep (or move on to the next table). [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
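The "single action per table, skip tables with operations in flight" behaviour suggested above might look roughly like the following sketch of one chore wake-up. Everything here is hypothetical: `NormalizerChoreSketch` and `runOnce` are illustrative names, plans are represented as plain strings, and the set of tables with in-flight merges or splits is simply passed in, since how to detect them is exactly the open question in the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class NormalizerChoreSketch {
    // One wake-up of the chore. For each table, submit at most the first
    // pending plan, and skip the table entirely if a merge or split is
    // already in flight (a crude form of back-pressure).
    static List<String> runOnce(Map<String, List<String>> plansByTable,
                                Set<String> tablesWithInFlightOps) {
        List<String> submitted = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : plansByTable.entrySet()) {
            String table = entry.getKey();
            List<String> plans = entry.getValue();
            if (tablesWithInFlightOps.contains(table)) {
                continue; // move on to the next table
            }
            if (plans.isEmpty()) {
                continue; // nothing to do for this table
            }
            // Only one action per table per wake-up; the rest are recomputed next time.
            submitted.add(table + ": " + plans.get(0));
        }
        return submitted;
    }
}
```

Because the remaining plans are discarded rather than queued, the chore keeps no state between wake-ups apart from whatever is needed to know which operations are still running.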
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14357769#comment-14357769 ] Mikhail Antonov commented on HBASE-13103: - Doing some prototyping here, may have something to show next week.. [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355288#comment-14355288 ] Nick Dimiduk commented on HBASE-13103: -- [~mantonov] yeah, RegionEqualizer sounds like the right idea. Maybe normalizer? But yes, a background chore invokes something that analyzes current state and makes efforts to move toward an ideal state. UI improvements across the board are always welcome and very much encouraged. We don't get enough love in that department. A (graphical?) representation of region sizes and distribution would be awesome. [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master
[ https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14354522#comment-14354522 ] Mikhail Antonov commented on HBASE-13103: - [~ndimiduk] if this chore (RegionEqualizer? :) or better name?) is pluggable the guarding logic would be easy to experiment with. If I could help with this one, I'd be glad to. As a complementary idea - we can have web UI showing unevenly split regions/alerts etc. [ergonomics] add region size balancing as a feature of master - Key: HBASE-13103 URL: https://issues.apache.org/jira/browse/HBASE-13103 Project: HBase Issue Type: Brainstorming Components: Usability Reporter: Nick Dimiduk Often enough, folks miss-judge split points or otherwise end up with a suboptimal number of regions. We should have an automated, reliable way to reshape or balance a table's region boundaries. This would be for tables that contain existing data. This might look like: {noformat} Admin#reshapeTable(TableName, int numSplits); {noformat} or from the shell: {noformat} reshape TABLE, numSplits {noformat} Better still would be to have a maintenance process, similar to the existing Balancer that runs AssignmentManager on an interval, to run the above reshape operation on an interval. That way, the cluster will automatically self-correct toward a desirable state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)