[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2016-01-20 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108227#comment-15108227
 ] 

Mikhail Antonov commented on HBASE-13103:
-

[~stack] - missed those comments, sorry. I've assigned myself a jira to create 
refguide docs on that.

Normalization is switched on/off in the same way as the balancer or other 
znode-based trackers. I'm actually torn on whether we should have it on or off 
by default. Having it on by default sounds a bit aggressive; having it off 
might delay adoption. What do you think? I'm inclined to have it on by default 
with an appropriate release note. If it misbehaves for someone, it's one shell 
command to disable it completely, and any feedback on such cases would help to 
improve the "self-healing" heuristics.

> [ergonomics] add region size balancing as a feature of master
> -
>
> Key: HBASE-13103
> URL: https://issues.apache.org/jira/browse/HBASE-13103
> Project: HBase
> Issue Type: Improvement
> Components: Balancer, Usability
> Reporter: Nick Dimiduk
> Assignee: Mikhail Antonov
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
> HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch
>
>
> Often enough, folks misjudge split points or otherwise end up with a 
> suboptimal number of regions. We should have an automated, reliable way to 
> "reshape" or "balance" a table's region boundaries. This would be for tables 
> that contain existing data. This might look like:
> {noformat}
> Admin#reshapeTable(TableName, int numSplits);
> {noformat}
> or from the shell:
> {noformat}
> > reshape TABLE, numSplits
> {noformat}
> Better still would be to have a maintenance process, similar to the existing 
> Balancer that runs AssignmentManager on an interval, to run the above 
> "reshape" operation on an interval. That way, the cluster will automatically 
> self-correct toward a desirable state.
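
To make the proposal quoted above a little more concrete, here is a minimal sketch of what one pass of such a periodic "reshape" process might decide. It is not the algorithm from the attached patches: the class name `ReshapeSketch`, the "split above 2x the average, merge neighbours whose combined size is below the average" thresholds, and the plan strings are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

public class ReshapeSketch {

    /**
     * Hypothetical heuristic: given region sizes (in MB, in key order),
     * propose splitting regions larger than 2x the average and merging
     * adjacent regions whose combined size is below the average.
     */
    static List<String> plan(long[] regionSizesMb) {
        List<String> actions = new ArrayList<>();
        if (regionSizesMb.length < 2) {
            return actions; // nothing to compare against
        }
        long total = 0;
        for (long size : regionSizesMb) {
            total += size;
        }
        double avg = (double) total / regionSizesMb.length;

        for (int i = 0; i < regionSizesMb.length; i++) {
            if (regionSizesMb[i] > 2 * avg) {
                actions.add("SPLIT " + i);
            } else if (i + 1 < regionSizesMb.length
                    && regionSizesMb[i] + regionSizesMb[i + 1] < avg) {
                actions.add("MERGE " + i + "," + (i + 1));
                i++; // the neighbour is consumed by this merge
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        // Two tiny regions followed by one oversized region.
        System.out.println(plan(new long[] {10, 10, 500, 100, 100}));
    }
}
```

A real implementation would run a pass like this on an interval, much as the balancer does, and would have to account for regions in transition, which this sketch ignores.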



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2016-01-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103526#comment-15103526
 ] 

stack commented on HBASE-13103:
---

Yeah, and add a note on what the 'normalize' process is sir. Thanks boss.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2016-01-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103516#comment-15103516
 ] 

stack commented on HBASE-13103:
---

[~mantonov] The release note is not correct (or has the circumstance changed 
since you wrote it?). It says this feature is off, but it is on by default in 
1.2, right sir?



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2016-01-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103549#comment-15103549
 ] 

stack commented on HBASE-13103:
---

nvm [~mantonov]. Let's just work on a bit of doc for this new feature instead. 
The release note was 'correct' as of writing.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-04 Thread Lars George (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730821#comment-14730821
 ] 

Lars George commented on HBASE-13103:
-

Added linked issue HBASE-14367 for the shell work. It is an easy one but needs 
a little insight into how to do this best. [~mantonov], you want to take a stab?



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-04 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14731177#comment-14731177
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Sure, let me assign it to myself. Thanks Lars!



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725773#comment-14725773
 ] 

Sean Busbey commented on HBASE-13103:
-

if it's in branch-1.2 it's good to go.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725794#comment-14725794
 ] 

Sean Busbey commented on HBASE-13103:
-

ah. sorry, missed that. Do we have a jira yet? is the ETA a matter of hours or 
days?

I'm probably not going to make RC0 today. I would like to make it tomorrow or 
Thursday.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-01 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725775#comment-14725775
 ] 

Nick Dimiduk commented on HBASE-13103:
--

bq. if it's in branch-1.2 it's good to go.

They're talking about a new shell feature to better expose what's committed 
here. So the current feature is on branch-1.2, but the new shell code isn't 
there yet.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-01 Thread Lars George (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725094#comment-14725094
 ] 

Lars George commented on HBASE-13103:
-

Nope: 

{noformat}
hbase(main):028:0> alter 'testtable', {NORMALIZATION_ENABLED => 'true'}
NameError: uninitialized constant NORMALIZATION_ENABLED
{noformat}

And even if that worked, it would require knowing the internal key name (it is 
given in the Javadoc for the key in HTD).



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-01 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725156#comment-14725156
 ] 

Mikhail Antonov commented on HBASE-13103:
-

That would be something to go in 1.3 and 2.*, or how do you see it? Does the 
next 1.2 minor (patch?) release sound good?



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-01 Thread Lars George (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725107#comment-14725107
 ] 

Lars George commented on HBASE-13103:
-

You may be able to force it like so:

{noformat}
hbase(main):035:0> alter 'normtable', {CONFIGURATION => {'NORMALIZATION_ENABLED' => 'true'}}
Updating all regions with the new schema...
1/1 regions updated.
Done.
{noformat}

but that is error-prone as you could easily misspell the arbitrary key string. 
I vote for proper shell support.
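
To illustrate why an arbitrary key string is error-prone, here is a minimal sketch of the kind of check that proper support could perform before applying an attribute. It is not HBase code: the class `TableAttributeCheck` and its `validate` method are hypothetical, and the set of known keys contains only `NORMALIZATION_ENABLED` because that is the one key named in this thread.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class TableAttributeCheck {

    // Hypothetical whitelist; a real one would list every supported key.
    static final Set<String> KNOWN_KEYS =
            new HashSet<>(Arrays.asList("NORMALIZATION_ENABLED"));

    /** Rejects attribute keys that are not known, instead of silently storing them. */
    static void validate(Map<String, String> attributes) {
        for (String key : attributes.keySet()) {
            if (!KNOWN_KEYS.contains(key)) {
                throw new IllegalArgumentException("Unknown table attribute: " + key);
            }
        }
    }

    public static void main(String[] args) {
        validate(Collections.singletonMap("NORMALIZATION_ENABLED", "true")); // accepted
        try {
            // Misspelled key: a free-form CONFIGURATION map would accept this silently.
            validate(Collections.singletonMap("NORMALISATION_ENABLED", "true"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a free-form map the misspelled key is stored without complaint and the feature simply stays off, which is the failure mode described above.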



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-01 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725155#comment-14725155
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Totally agreed - and thanks for bringing it up! Would you open a jira for that, 
or shall I open one? Adding this support shouldn't be a lot of work.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-09-01 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725679#comment-14725679
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Getting this in for 1.2.0 would be great, if there's time. ping [~busbey].



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-08-31 Thread Lars George (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723616#comment-14723616
 ] 

Lars George commented on HBASE-13103:
-

Is there follow up work or a JIRA tracking adding this to the shell? Is the 
only way to enable this per table using the Java API?



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-08-31 Thread Lars George (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723619#comment-14723619
 ] 

Lars George commented on HBASE-13103:
-

Sorry, above was for [~mantonov] I guess. :) Please advise.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-08-31 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723836#comment-14723836
 ] 

Mikhail Antonov commented on HBASE-13103:
-

The per-table normalization enable/disable flag is set in HTableDescriptor, 
like the compaction flag for example, so you should be able to do it from the 
shell?

alter 'table1', {NORMALIZATION_ENABLED => 'true'}



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-24 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14599574#comment-14599574
 ] 

Sean Busbey commented on HBASE-13103:
-

Is the quota on # of regions or size of regions?



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-24 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14599639#comment-14599639
 ] 

Sean Busbey commented on HBASE-13103:
-

sure then, let's continue on a new jira.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599540#comment-14599540
 ] 

Ted Yu commented on HBASE-13103:


I have been thinking about the implications of this feature when a namespace 
quota is turned on.
Consider this scenario: the total number of regions across the tables in a 
particular namespace is close to that namespace's quota. After some 
normalization activity (splits, for example), the region count moves even 
closer to the quota.
When a user then wants to create a (pre-split) table in the same namespace, 
they may find that there is not enough quota left for the new table.

I have a simple patch which disables normalization when the underlying 
namespace is under quota control.

If people think the above idea is plausible, I can create a JIRA so that we 
can continue the discussion there.
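
A minimal sketch of the scenario above, in Java. QuotaGuardSketch, NO_QUOTA, and both methods are hypothetical names used for illustration and are not HBase API; the sketch also assumes the quota is an inclusive upper bound on the number of regions in a namespace.

```java
/** Hypothetical guard, not HBase API: how normalization can eat into a region quota. */
final class QuotaGuardSketch {

    /** Assumed sentinel meaning "no region quota is configured for this namespace". */
    static final int NO_QUOTA = -1;

    /** Conservative policy from the comment: skip normalization whenever a quota is set. */
    static boolean mayNormalize(int namespaceRegionQuota) {
        return namespaceRegionQuota == NO_QUOTA;
    }

    /** Can a pre-split table with newRegions regions still be created in the namespace? */
    static boolean canCreateTable(int regionsInNamespace, int newRegions,
                                  int namespaceRegionQuota) {
        if (namespaceRegionQuota == NO_QUOTA) {
            return true;
        }
        return regionsInNamespace + newRegions <= namespaceRegionQuota;
    }

    public static void main(String[] args) {
        int quota = 100;
        int regions = 94;

        // Before normalization, a 5-region pre-split table still fits (94 + 5 = 99).
        System.out.println(canCreateTable(regions, 5, quota));

        // Two splits performed by normalization add two regions (96 + 5 = 101).
        regions += 2;
        System.out.println(canCreateTable(regions, 5, quota));

        // With the conservative policy, those splits would not have happened.
        System.out.println(mayNormalize(quota));
        System.out.println(mayNormalize(NO_QUOTA));
    }
}
```

A finer-grained policy could allow merges (which free quota) while blocking only splits, but that goes beyond what the comment proposes.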

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599586#comment-14599586
 ] 

Ted Yu commented on HBASE-13103:


The quota is based on the number of regions.

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-24 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599985#comment-14599985
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Thanks [~te...@apache.org], I opened and linked HBASE-13964 for that.

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-23 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597915#comment-14597915
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Woo! Nice work [~mantonov].

For reference, please update the FixVersion so that every branch committed to 
is represented. Right now, the master branch is JIRA version 2.0.0, branch-1 
is 1.3.0, and branch-1.2 is 1.2.0. This JIRA should be marked 
fixVersions=2.0.0, 1.3.0, 1.2.0.

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 1.2.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-23 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598150#comment-14598150
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Yeah, I thought as much; no problem. We should really update 
https://hbase.apache.org/book.html#_guide_for_hbase_committers !

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-23 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598114#comment-14598114
 ] 

Mikhail Antonov commented on HBASE-13103:
-

[~ndimiduk] done, I see - thanks! I thought I should only set the earliest 
branch where it was committed. 

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597260#comment-14597260
 ] 

Hudson commented on HBASE-13103:


FAILURE: Integrated in HBase-TRUNK #6591 (See 
[https://builds.apache.org/job/HBase-TRUNK/6591/])
HBASE-13103 [ergonomics] add region size balancing as a feature of master 
(antonov: rev fd37ccb63c545850c08c132b2f6470354a6629f9)
* hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java


 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 1.2.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597230#comment-14597230
 ] 

Hadoop QA commented on HBASE-13103:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12741223/HBASE-13103-branch-1.v3.patch
  against branch-1 branch at commit 6a537eb8545c7dd6c01c0d911ad12e789eeab3ae.
  ATTACHMENT ID: 12741223

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14521//console

This message is automatically generated.

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597337#comment-14597337
 ] 

Hudson commented on HBASE-13103:


SUCCESS: Integrated in HBase-1.3 #10 (See 
[https://builds.apache.org/job/HBase-1.3/10/])
HBASE-13103 [ergonomics] add region size balancing as a feature of master 
(antonov: rev 84675ef6159692b0a8da219df5abcf111fe46845)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java


 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 1.2.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597364#comment-14597364
 ] 

Hudson commented on HBASE-13103:


SUCCESS: Integrated in HBase-1.2-IT #17 (See 
[https://builds.apache.org/job/HBase-1.2-IT/17/])
HBASE-13103 [ergonomics] add region size balancing as a feature of master 
(antonov: rev 5d1603f7591d22c212c2869d4cc820790a0a2f11)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
* hbase-common/src/main/resources/hbase-default.xml
* hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java


 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 1.2.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597385#comment-14597385
 ] 

Hudson commented on HBASE-13103:


SUCCESS: Integrated in HBase-1.3-IT #2 (See 
[https://builds.apache.org/job/HBase-1.3-IT/2/])
HBASE-13103 [ergonomics] add region size balancing as a feature of master 
(antonov: rev 84675ef6159692b0a8da219df5abcf111fe46845)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java
* hbase-common/src/main/resources/hbase-default.xml
* hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java


 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 1.2.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597378#comment-14597378
 ] 

Hudson commented on HBASE-13103:


FAILURE: Integrated in HBase-1.2 #24 (See 
[https://builds.apache.org/job/HBase-1.2/24/])
HBASE-13103 [ergonomics] add region size balancing as a feature of master 
(antonov: rev 5d1603f7591d22c212c2869d4cc820790a0a2f11)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/NormalizationPlan.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizerChore.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitNormalizationPlan.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizerOnCluster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/EmptyNormalizationPlan.java


 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 1.2.0

 Attachments: HBASE-13103-branch-1.v3.patch, HBASE-13103-v0.patch, 
 HBASE-13103-v1.patch, HBASE-13103-v2.patch, HBASE-13103-v3.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-22 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596185#comment-14596185
 ] 

Sean Busbey commented on HBASE-13103:
-

I'd like to see this in 1.2, but feature freeze is nigh. I'll leave the 
targeting as is until I actually cut the RC this afternoon/evening. Feel free 
to bump it out to 1.3 if you don't think things will be ready.

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch, 
 HBASE-13103-v2.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-22 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596251#comment-14596251
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Looks great [~mantonov], +1 ship it. Only thing is you're using the Java 
language {{assert}} in a couple of places in the tests; use JUnit's 
{{assertTrue}} instead, but that can be fixed on commit.
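
The difference between the two styles can be seen in a small dependency-free sketch. The local assertTrue helper below is a stand-in that mirrors the shape of JUnit's assertTrue(String, boolean); the class and the other method names are hypothetical.

```java
/** Sketch: why a test should not rely on the Java `assert` keyword. */
final class AssertSketch {

    /** Stand-in for JUnit's assertTrue so the example needs no dependencies. */
    static void assertTrue(String message, boolean condition) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    /** The explicit check always runs, so a false condition is always caught. */
    static boolean explicitCheckFires() {
        int regionCount = -1; // deliberately wrong value
        try {
            assertTrue("region count should be positive", regionCount > 0);
            return false;
        } catch (AssertionError e) {
            return true;
        }
    }

    /** The keyword is only evaluated when the JVM runs with assertions enabled (-ea). */
    static boolean keywordAssertFires() {
        int regionCount = -1; // deliberately wrong value
        try {
            assert regionCount > 0 : "region count should be positive";
            return false;
        } catch (AssertionError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("explicit check caught the failure: " + explicitCheckFires());
        System.out.println("assert keyword caught the failure: " + keywordAssertFires());
    }
}
```

The second line of output depends on how the JVM was started, which is exactly why a test that relies on the keyword can pass silently.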

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch, 
 HBASE-13103-v2.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-22 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595466#comment-14595466
 ] 

Mikhail Antonov commented on HBASE-13103:
-

The test failures don't seem to be related (TestRegionRebalancing is generally 
flaky and fails for me on and off, and the visibility labels tests pass 
locally). As for the checkstyle warnings (missing final on 2 classes), I'll 
fix them in the next version of the patch.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-22 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596309#comment-14596309
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Thanks [~ndimiduk] (I'll fix these remaining nits on commit), and thanks 
[~busbey]! I'll commit this in an hour or two unless there are objections.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-22 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597046#comment-14597046
 ] 

Ted Yu commented on HBASE-13103:


Latest patch should be good to go.
There is room for improvement which can be addressed in follow-on JIRAs.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-22 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597089#comment-14597089
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Thanks [~te...@apache.org], agree there's room to further improve. I'm going to 
commit v3 to master, branch-1 and branch-1.2 then.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596982#comment-14596982
 ] 

Hadoop QA commented on HBASE-13103:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12741160/HBASE-13103-v3.patch
  against master branch at commit d51a184051d968dc3bdc00b1c9257c0a9e5ff8a6.
  ATTACHMENT ID: 12741160

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14511//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14511//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14511//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14511//console

This message is automatically generated.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-22 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597180#comment-14597180
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Committed to master; I'll commit the version for branch-1 and branch-1.2 shortly.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594427#comment-14594427
 ] 

Hadoop QA commented on HBASE-13103:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12740774/HBASE-13103-v2.patch
  against master branch at commit db08013ebeeaa85802d9795cc72b4c29c5338a47.
  ATTACHMENT ID: 12740774

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
1908 checkstyle errors (more than the master's current 1906 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestRegionRebalancing

 {color:red}-1 core zombie tests{color}.  There are 5 zombie test(s):
   at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testVisibilityLabelsThatDoesNotPassTheCriteria(TestVisibilityLabels.java:231)
   at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testVisibilityLabelsInGetThatDoesNotMatchAnyDefinedLabels(TestVisibilityLabels.java:400)
   at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes.testVisibilityLabelsWithDeleteColumnsWithNoMatchVisExpWithMultipleVersionsNoTimestamp(TestVisibilityLabelsWithDeletes.java:376)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14477//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14477//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14477//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14477//console

This message is automatically generated.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-19 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594050#comment-14594050
 ] 

Mikhail Antonov commented on HBASE-13103:
-

bq. Yeah, there should be some upper bound on the total number of regions, 
which I assume would be something like $MAX_REGIONS_PER_SERVER * $NUM_SERVERS, 
where max regions per server is configurable.

I thought the limit works the other way around: the total number of regions 
is more or less fixed (since the assignment manager can't properly handle 
more), so increasing the number of region servers would inevitably decrease 
the number of regions per RS, leading to larger and larger regions?



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-19 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594149#comment-14594149
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Yes, there's a global upper bound as well; there's much more discussion over 
on HBASE-11165.

I'm talking about a local upper bound on an individual region server's ability 
to keep active regions online for reads and writes. Usually this is 
constrained by memory pressure from the active memstores of open regions.

I dunno though. Maybe [~lhofhansl] and [~phobos182] want to meter this upper 
bound differently?



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-19 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593220#comment-14593220
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Yeah. We mentioned earlier here that a region has some ideal size and that we 
should try to get each region to this size, and I think we mentioned that the 
ideal size might be a fixed fraction of the max size or something like that. 
Maybe it needs to be more configurable.

I guess you assume here that every large table is supposed to be spread across 
all RSs, and not just some subset (group?) of them? Also, to make sure I 
understand correctly: when you say 250 regions per RS, do you mean 250 regions 
of each table, or across all tables? As for this number of regions per RS, I 
suppose we could derive it dynamically as (max total number of regions in the 
cluster, as limited by AM performance; see the issue about scaling to 1M 
regions) / # of RS? The total max number of regions could be set in config, 
like 100k or 300k?

I'm thinking about roughly the same logic for the lower and upper ends. For the 
lower end, another implicit threshold would be the max size of each region. For 
the upper limit, I think there should be 2 more guards: 1) check that the total 
number of regions doesn't approach the limits of AM, and 2) don't break a table 
into ridiculously small regions (less than N hdfs blocks?).
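
The derivation and the two guards above could be sketched roughly as follows. This is only an illustration of the idea, not code from the patch: the class {{SplitGuards}}, its method names, and all parameters are hypothetical, and it assumes a split yields two daughters of about half the parent's size.

```java
// Sketch only: SplitGuards and every name in it are hypothetical,
// written to illustrate the guards discussed in the comment above.
final class SplitGuards {
    // Per-server region cap derived from a configured cluster-wide maximum.
    static long regionsPerServer(long maxTotalRegions, int numRegionServers) {
        return maxTotalRegions / numRegionServers;
    }

    // A split is allowed only if both guards hold:
    // 1) the cluster stays within the configured total region limit, and
    // 2) each daughter region would still span at least N HDFS blocks
    //    (assuming the split produces two halves of roughly equal size).
    static boolean canSplit(long regionSizeBytes,
                            long totalRegions,
                            long maxTotalRegions,
                            int minBlocksPerRegion,
                            long hdfsBlockSizeBytes) {
        boolean underClusterLimit = totalRegions + 1 <= maxTotalRegions;
        long daughterSizeBytes = regionSizeBytes / 2;
        boolean daughtersLargeEnough =
            daughterSizeBytes >= minBlocksPerRegion * hdfsBlockSizeBytes;
        return underClusterLimit && daughtersLargeEnough;
    }
}
```

For example, with a configured total of 100k regions and 400 region servers, {{regionsPerServer}} yields 250, which happens to match the per-server figure mentioned earlier in the discussion.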
 



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-19 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593235#comment-14593235
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Thanks for the review; I'll fix the remaining items and update the patch. (Do 
you think what's discussed here about ideal size should go in this patch, or 
in a subsequent ticket?)



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-19 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593598#comment-14593598
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Max of 250 total regions on a region server, not per table. This is a rough 
guideline, and will vary based on individual cluster configuration. Yes, this 
is definitely related to the 1M regions ticket.

bq. 1) should check that total number of regions doesn't approach the limits of 
AM

Yeah, there should be some upper bound on the total number of regions, which I 
assume would be something like {{$MAX_REGIONS_PER_SERVER * $NUM_SERVERS}}, 
where max regions per server is configurable.

bq. 2) we don't break table into ridiculously small regions (less than N hdfs 
blocks?)

Generally yes, but there is the counterexample I mentioned above, where I'm 
new to HBase and my big table is only a single region on a single host. We 
want beginners to have a good experience too. More, smaller regions spread 
over an overpowered cluster should result in everything being cached and a 
better intro experience.

bq. do you think what's discussed here about ideal size should go there, or in 
subsequent ticket?

I'm fine with improvements to the normalizer algorithms going in with 
subsequent patches. I think your harness here is enough to let people get 
started -- for instance, Nasron from the user list thread titled "Stochastic 
Balancer by tables".



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592689#comment-14592689
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Left some comments on RB.

Further thinking about the use-case, this chore is aiming for an ideal state of 
even cluster utilization. We seem to think of this in terms of (1) evenly 
distributed load and (2) region servers are not hosting more regions than they 
can hold -- regions are sized just right. We assume schema design results in 
a natural application load over keys, so (1) can be approximated by uniform 
region size and count. Uniform count/server is handled by the Balancer, which 
leaves the Normalizer to worry about overall count and size. Too few overall 
and you have unused hosts ("I just stood up a 10-node cluster but only one host 
is doing work!"), too many and you end up with 1k regions/server.

At the lower end, we probably want to split relatively empty tables toward a 
goal of {{# of regions = 2x number of region servers}}. Or maybe 3x or 5x?

At the upper end, we want to push toward a target of ~250 regions per region 
server and those regions being of uniform size if possible.
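
As a rough illustration of the two ends described above, the sketch below compares a region count against a lower bound of 2x the number of region servers and an upper bound of ~250 regions per region server. It is not part of the patch: {{RegionCountTargets}} and its members are hypothetical names, the multipliers are simply the figures floated in this comment, and it simplifies by checking a single region count against both bounds.

```java
// Sketch only: RegionCountTargets is a hypothetical class; the multipliers
// are the figures suggested in the comment above, not values from the patch.
final class RegionCountTargets {
    enum Action { SPLIT, MERGE, NONE }

    static final int LOW_END_REGIONS_PER_SERVER = 2;    // "2x number of region servers"
    static final int HIGH_END_REGIONS_PER_SERVER = 250; // "~250 regions per region server"

    static long lowerBound(int numRegionServers) {
        return (long) LOW_END_REGIONS_PER_SERVER * numRegionServers;
    }

    static long upperBound(int numRegionServers) {
        return (long) HIGH_END_REGIONS_PER_SERVER * numRegionServers;
    }

    // Suggests the direction in which the region count should move.
    static Action suggest(long regionCount, int numRegionServers) {
        if (regionCount < lowerBound(numRegionServers)) {
            return Action.SPLIT;
        }
        if (regionCount > upperBound(numRegionServers)) {
            return Action.MERGE;
        }
        return Action.NONE;
    }
}
```

With 10 region servers, this sketch would suggest splitting below 20 regions and merging above 2500; making the multipliers configurable (3x, 5x) would be a natural variation.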



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592702#comment-14592702
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Oh, and probably we'll want coprocessor hooks pre- and post balancer 
invocation, but that can be a follow-on ticket.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585051#comment-14585051
 ] 

Hadoop QA commented on HBASE-13103:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12739481/HBASE-13103-v1.patch
  against master branch at commit 682b8ab8a542a903e5807053282693e3a96bad2d.
  ATTACHMENT ID: 12739481

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
1908 checkstyle errors (more than the master's current 1907 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.regionnormalizer.TestSimpleRegionNormalizerOnCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14403//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14403//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14403//artifact/patchprocess/checkstyle-aggregate.html

Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14403//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14403//console

This message is automatically generated.

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch


 Often enough, folks misjudge split points or otherwise end up with a 
 suboptimal number of regions. We should have an automated, reliable way to 
 "reshape" or "balance" a table's region boundaries. This would be for tables 
 that contain existing data. This might look like:
 {noformat}
 Admin#reshapeTable(TableName, int numSplits);
 {noformat}
 or from the shell:
 {noformat}
 > reshape TABLE, numSplits
 {noformat}
 Better still would be to have a maintenance process, similar to the existing 
 Balancer that runs AssignmentManager on an interval, to run the above 
 reshape operation on an interval. That way, the cluster will automatically 
 self-correct toward a desirable state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-06-14 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585016#comment-14585016
 ] 

Mikhail Antonov commented on HBASE-13103:
-

bq. it'll be good to get this out for folks to start playing with it in 1.1.0.
That's what I'm thinking too. Presumably nobody is going to turn it on in a 
production env (yet), so I'm wondering: what would be the most conservative yet 
usable strategy for folks to play with on sandbox clusters?

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13103-v0.patch, HBASE-13103-v1.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-05-21 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554925#comment-14554925
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Ping [~toffer], this is the ticket I mentioned.

In another conversation I had recently, it occurred to me that this would be 
really handy for folks running elastic deployments, environments like EC2, 
YARN/Slider or Mesos where clusters are intentionally growing and shrinking 
capacity as business requirements change (cc [~clehene], [~stmcpherson], 
[~ste...@apache.org])

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-28 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516547#comment-14516547
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Yeah :( that still needs some more work on my side to incorporate the 
feedback, and probably several more rounds of review, to get something ready 
for folks to try.

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-16 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497774#comment-14497774
 ] 

Mikhail Antonov commented on HBASE-13103:
-

All right, let me go through the suggestions here and I'll try to roll out the 
next version of the patch next week or so.

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-16 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498445#comment-14498445
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Thanks [~mantonov], it'll be good to get this out for folks to start playing 
with it in 1.1.0.

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Balancer, Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-09 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486831#comment-14486831
 ] 

Mikhail Antonov commented on HBASE-13103:
-

We could also have a variation of reshaping which doesn't actually take any 
action, but writes the recommended merge/split commands to a file and makes 
it available somewhere?
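
To make the idea concrete: a dry-run variation could simply render each planned action as the shell command an operator would otherwise type. This is only a minimal sketch. The `DryRunPlanSketch` and `PlannedAction` names are hypothetical and not taken from the patch; the emitted lines follow the existing `split` and `merge_region` shell commands, and the region names are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical dry-run helper; nothing here is taken from the attached patch. */
final class DryRunPlanSketch {
    /** One recommended action: a split of one region or a merge of two. */
    static final class PlannedAction {
        final String first;   // region name for a split, encoded name for a merge
        final String second;  // null for a split, second encoded name for a merge

        PlannedAction(String first, String second) {
            this.first = first;
            this.second = second;
        }
    }

    /** Renders the plan as HBase shell command lines instead of executing it. */
    static List<String> toShellCommands(List<PlannedAction> plan) {
        List<String> lines = new ArrayList<>();
        for (PlannedAction a : plan) {
            if (a.second == null) {
                lines.add("split '" + a.first + "'");
            } else {
                lines.add("merge_region '" + a.first + "', '" + a.second + "'");
            }
        }
        return lines;
    }
}
```

The resulting lines could then be written to a file for the operator to review and replay by hand.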

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-09 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486884#comment-14486884
 ] 

Mikhail Antonov commented on HBASE-13103:
-

bq. Yeah, meant the reshaping after I identified that something is odd/bad 
about a table. But maybe it's better to just automate, otherwise nobody would 
use it, as you say.

I could have a switch in the hbase config for master, like auto (chore + admin 
rpc calls accepted), manual (no chore, admin calls accepted), disabled (no 
chore, no rpc calls allowed). Or just auto and manual. I'm also thinking that 
maybe exposing more params to adjust the aggressiveness of reshaping would help 
people adopt it. It's probably better to have a policy which improves cluster 
state a little bit and which many people are willing to turn on and forget 
about, rather than a policy which could theoretically improve cluster state a 
lot but which most production users would be afraid to turn on.

As you said (and many users would likely agree!), you'd be hesitant to turn 
it on unless you know that it makes nearly perfect decisions. What if we try to 
formalize these rules, like - 

 - only normalize tables which opted in (like in table descriptor)
 - don't touch regions which served writes in last N minutes, or served more 
than X reads last hour
 - don't normalize if balancer is in progress, or any splits/merges are in 
progress
 - don't normalize if RS hosting regions we want to split/merge is under high 
load (need to define it)

Maybe you could list some more? Thanks for highlighting that point. Without 
proper, configurable safeguards, many people probably won't enable it.
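
A rough sketch of how the switch and the rules above could fit together. Everything in it is hypothetical: `NormalizerGuardsSketch`, `Mode`, `Facts` and the two threshold parameters are illustration-only names, not taken from the patch, and how the facts get collected is left out entirely.

```java
import java.util.Optional;

/** Hypothetical sketch of the proposed switch and safeguards; all names are illustrative. */
final class NormalizerGuardsSketch {
    enum Mode { AUTO, MANUAL, DISABLED }

    /** The periodic chore only runs in AUTO mode. */
    static boolean choreMayRun(Mode mode) {
        return mode == Mode.AUTO;
    }

    /** Admin-triggered runs are accepted in AUTO and MANUAL, rejected in DISABLED. */
    static boolean adminCallAllowed(Mode mode) {
        return mode != Mode.DISABLED;
    }

    /** Facts the rules look at; gathering them is out of scope for this sketch. */
    static final class Facts {
        boolean tableOptedIn;
        long minutesSinceLastWrite;
        long readsLastHour;
        boolean balancerRunning;
        boolean splitOrMergeInProgress;
        boolean hostUnderHighLoad;
    }

    /** Returns the first rule that vetoes normalization, or empty if none does. */
    static Optional<String> vetoReason(Facts f, long quietMinutes, long maxReadsPerHour) {
        if (!f.tableOptedIn) return Optional.of("table has not opted in");
        if (f.minutesSinceLastWrite < quietMinutes) return Optional.of("recent writes");
        if (f.readsLastHour > maxReadsPerHour) return Optional.of("too many recent reads");
        if (f.balancerRunning) return Optional.of("balancer in progress");
        if (f.splitOrMergeInProgress) return Optional.of("split/merge in progress");
        if (f.hostUnderHighLoad) return Optional.of("hosting region server under high load");
        return Optional.empty();
    }
}
```

Returning the veto reason rather than a bare boolean would make it easy to log why a table was skipped, which should help with the "afraid to turn it on" problem.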



 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486812#comment-14486812
 ] 

Lars Hofhansl commented on HBASE-13103:
---

Just reading through the comments here. Unless the reshaper perfectly takes 
all factors into account I'd be very hesitant to run it on our clusters. By 
"perfect" I mean that it knows about load, disk IO, etc. Since that's hard (or 
impossible) I think I'd prefer to trigger this manually, as suggested in the 
description. But maybe I am overly cautious.

Split decisions can be made locally and are rarely bad (unless really 
excessive). Merge decisions (a) need global knowledge - not all regions may be 
on the same server - and (b) can possibly lead to worse performance (hot 
regions merged together, etc).
What we can automate is merging empty regions away.


 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-09 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486827#comment-14486827
 ] 

Mikhail Antonov commented on HBASE-13103:
-

[~larsh]

bq. I think I'd prefer to trigger this manually
You mean - you'd prefer to do splits and merges manually, or you'd prefer to 
kick off the reshaping manually (via admin command, rather than letting it run 
as a chore)?

I'm thinking about what could be done to make it safer and more conservative, 
while still relieving the cluster admin of at least some housekeeping tasks. If 
this isn't safe, most people probably just won't turn it on.

 - since, as you said, split decisions are generally safer than merge decisions, 
we could have a policy which is much more conservative in merging than it is in 
splitting
 - regarding the load: what's there in ServerLoad and RegionLoad won't suffice, 
you think? Would it help if we grab some OS-level info in ServerLoad (or a 
similar class)?

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486841#comment-14486841
 ] 

Lars Hofhansl commented on HBASE-13103:
---

Yeah, meant the reshaping after I identified that something is odd/bad about 
a table.
But maybe it's better to just automate, otherwise nobody would use it, as you 
say.

Splits already happen automatically with nice, simple, local-only logic. Do we 
need more logic for those? (But we could get rid of 
IncreasingToUpperBoundRegionSplitPolicy and combine it all in one class, which 
would be nice.)

bq. could have policy which is much more conservative in merging, than it is in 
splitting
I think that'd be nice. With IncreasingToUpperBoundRegionSplitPolicy it's 
possible that we get a 2x size difference between regions for a bit. It's hard 
to say whether a region will be written to in the future, which makes it hard to 
avoid an early merge. Maybe we can track the age of a region? And then favor 
older regions for merges unless they're hot...
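
Something like the following is what I have in mind for "favor older regions unless they're hot". It's only a sketch: `MergeCandidateSketch`, its `Region` holder and the `hotThreshold` parameter are made up for illustration, and where the creation time and request rate would come from is left open.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: order merge candidates oldest-first and skip hot regions. */
final class MergeCandidateSketch {
    static final class Region {
        final String name;
        final long createdAtMillis;    // assumed to be tracked somewhere
        final long requestsPerSecond;  // assumed to come from region load metrics

        Region(String name, long createdAtMillis, long requestsPerSecond) {
            this.name = name;
            this.createdAtMillis = createdAtMillis;
            this.requestsPerSecond = requestsPerSecond;
        }
    }

    /** Drops regions at or above hotThreshold, then sorts the rest by age, oldest first. */
    static List<Region> mergeOrder(List<Region> regions, long hotThreshold) {
        List<Region> candidates = new ArrayList<>();
        for (Region r : regions) {
            if (r.requestsPerSecond < hotThreshold) {
                candidates.add(r);
            }
        }
        candidates.sort(Comparator.comparingLong((Region r) -> r.createdAtMillis));
        return candidates;
    }
}
```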

bq. ServerLoad and RegionLoad won't suffice you think?
You're right, that's probably all the information we need. And if not, we'd add 
it.


 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-09 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486829#comment-14486829
 ] 

Mikhail Antonov commented on HBASE-13103:
-

I meant [~lhofhansl], sorry, mistyped..

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-08 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484842#comment-14484842
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Thanks for review guys!

bq. I'd like this to eventually be a globally enabled feature, with opt-out via 
table configuration. For its initial commit, it should probably be opt-in 
instead. Having a global kill switch is probably a good idea too.

How about adding a boolean flag in the table descriptor (false by default) to 
bring a table to the normalizer's attention? A global switch would still be in 
place to turn everything off if so desired (probably just like with balancer, 
there should be an admin rpc call to turn balancer on/off?)

bq. Yes, priorities will become a useful feature. I think what you have here is 
a nice, committable first pass though.
Thanks, yeah - that would probably be a subsequent patch (as well as running in 
parallel on all opted-in tables)

bq. Why not touch tables with fewer than 3 regions?
2 considerations I had in mind - tables with fewer than 3 regions are probably 
too small to be hot spots in terms of cluster throughput (that may not be 
true?), and they would also require more thought-out rules. Let's say you have a 
table with 2 regions, 10 and 70 gb. The current logic would say - well, 70 is 
less than 40 * 2, so no normalization for you :) What you may really want is 2 
regions of 40, or 5 regions of 16. Do we need to have an ideal region size? 
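
To spell out that arithmetic, here is a simplified sketch of the rule as described in this comment, not the code from the patch. `SimpleRuleSketch`, `decide` and `minRegions` are names made up for the example; the only rule modelled is "split the largest region if it is more than twice the average size".

```java
/** Hypothetical, simplified version of the rule described above; not the patch code. */
final class SimpleRuleSketch {
    enum Action { NONE, SPLIT_LARGEST }

    /**
     * Skips tables with fewer than minRegions regions; otherwise proposes a split
     * only when the largest region is more than twice the average region size.
     */
    static Action decide(long[] regionSizesGb, int minRegions) {
        if (regionSizesGb.length < minRegions) {
            return Action.NONE;
        }
        long total = 0;
        long largest = 0;
        for (long size : regionSizesGb) {
            total += size;
            largest = Math.max(largest, size);
        }
        double average = (double) total / regionSizesGb.length;
        return largest > 2 * average ? Action.SPLIT_LARGEST : Action.NONE;
    }
}
```

For the 10 and 70 gb table the average is 40, and 70 is not above 80, so nothing happens even with the region-count check switched off (`minRegions` of 0). That's exactly the case where an ideal region size would give a better answer than comparing against the average.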






 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-08 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484844#comment-14484844
 ] 

Mikhail Antonov commented on HBASE-13103:
-

(sorry, formatting slided off)

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-08 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484867#comment-14484867
 ] 

Mikhail Antonov commented on HBASE-13103:
-

[~phobos182] thanks for feedback! Very useful. I guess I have a lot of 
questions I'd like to ask, if you don't mind, to better understand the real 
needs.

bq.  Given the time difference between when the commands were run, this could 
end up with different region boundaries between the clusters – which is not 
desired. So I second the idea of generates reshaping plan so it can be 
applied in the same manner on the slave cluster.

 - How strictly consistent are your master and slave clusters? How much can they 
diverge? Is the second cluster mostly for long-running analytics, which only 
dumps output into some other table?
 - So you don't have automatic splits now, as I understand, only pre-split 
tables? Otherwise how are you ensuring that the region boundaries are exactly 
the same? What's the avg region size?
 - Do you want region boundaries to be exactly the same, or approximately the 
same?

The current patch has a notion of a reshaping plan, which includes params like 
the split point (currently not computed though :) ). It'd be feasible to send 
these plans to the normalizer on the other side (or rather, to expose a 
normalize() call in the master rpc services which accepts a serialized 
reshaping plan; but the region names wouldn't be the same anyway).

bq. Probably should think about performing a major compaction operation before 
the normalize policy runs.
Yeah, that makes sense. Though I think most people run major compactions 
infrequently, so making this a prerequisite would change that operational 
practice? How often do you run major compactions?

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-08 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485581#comment-14485581
 ] 

Nick Dimiduk commented on HBASE-13103:
--

[~mantonov]:

bq. probably just like with balancer, there should be an admin rpc call to turn 
balancer on/off?

Yes, that would be good. Exposure through the shell would be desirable too, 
along with a way to get the current status.

bq. Need to have ideal region size?

That's a good point. Probably the ideal size is some percentage (70%?) of the 
max region size, with a "close enough" allowance (i.e., this normalizer's target 
region size is 70 +/- 5% of {{hbase.hregion.max.filesize}}).
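
As a worked example of that target band, here is a small sketch. It assumes "70 +/- 5%" means anywhere from 65% to 75% of the max region size, and `TargetSizeSketch` and `targetBand` are hypothetical names; the 10 GiB figure in the usage note is just a sample value for {{hbase.hregion.max.filesize}}.

```java
/** Hypothetical helper for the "70 +/- 5% of max region size" idea; names are illustrative. */
final class TargetSizeSketch {
    /**
     * Returns {low, high} in bytes, where the band spans
     * (targetPercent - tolerancePercent) to (targetPercent + tolerancePercent)
     * percent of maxFileSizeBytes. Integer math avoids floating-point rounding.
     */
    static long[] targetBand(long maxFileSizeBytes, int targetPercent, int tolerancePercent) {
        long low = maxFileSizeBytes * (targetPercent - tolerancePercent) / 100;
        long high = maxFileSizeBytes * (targetPercent + tolerancePercent) / 100;
        return new long[] { low, high };
    }
}
```

With a max of 10 GiB (10737418240 bytes), `targetBand(10737418240L, 70, 5)` gives a band of 6979321856 to 8053063680 bytes, i.e. 6.5 to 7.5 GiB; regions inside the band would be left alone.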

Thanks for coming around [~phobos182]!

bq. Since this operation is pretty impactful on performance...

I see this not as a single operation you run to normalize a table all at once, 
but rather as something that happens in the background all the time, a kind of 
active anti-entropy happening behind the scenes to nudge a table into an 
ideal state. You think even a single split/merge operation is too heavy-weight 
to be done without premeditation?

 [ergonomics] add region size balancing as a feature of master
 -

 Key: HBASE-13103
 URL: https://issues.apache.org/jira/browse/HBASE-13103
 Project: HBase
  Issue Type: Brainstorming
  Components: Usability
Reporter: Nick Dimiduk
Assignee: Mikhail Antonov
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13103-v0.patch




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-07 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483558#comment-14483558
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Nice work [~mantonov]. I left some comments over on RB.

bq. being able to choose which table to normalize

I'd like this to eventually be a globally enabled feature, with opt-out via 
table configuration. For its initial commit, it should probably be opt-in 
instead. Having a global kill switch is probably a good idea too.

bq. need to define normalization rules more strictly (including priority of 
operations? if table has both types of outlier in the ranks of its regions - 
too small and too big regions, then what action is more urgent)

Yes, priorities will become a useful feature. I think what you have here is a 
nice, committable first pass though.

bq. run normalization across several tables in parallel - is that something we 
should/shouldn't do

Probably that's something we can and should do. Can be a future patch though.

bq. detecting currently running merges and splits. Current simple rules are 
just that we don't touch system tables and tables with less than 3 regions.

Why not touch tables with fewer than 3 regions?

These are all good questions for our operator friends. [~eclark], [~toffer], 
[~lhofhansl] any opinions here? Think you fellas may be interested in this 
feature.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-07 Thread Jeremy Carroll (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484275#comment-14484275
 ] 

Jeremy Carroll commented on HBASE-13103:


A few comments from my side of things. As a setup: in our architecture, we run 
master / slave HBase clusters with replication set up between them.

- Since this operation is pretty impactful on performance, most likely we would 
do this on the slave cluster first, switch the roles between master / slave, 
and then run the command on the master. Given the time difference between when 
the commands were run, this could end up with different region boundaries 
between the clusters, which is not desired. So I second the idea of generating 
a reshaping plan so it can be applied in the same manner on the slave cluster.
- We should probably think about performing a major compaction before the 
normalize policy runs. We have a lot of tombstones on some of our clusters, 
which can inflate the region size by 60%. Splitting / merging in this 
condition is not ideal since the condition is temporary; the steady state 
after a compaction is more realistic.

I think it's a great feature. Though most of our clusters are balanced for QPS 
distribution, as CPU is one of our primary capacity planning metrics. Any tool 
which makes it easier to recover from pre-splitting mistakes is welcome.




[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-06 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482210#comment-14482210
 ] 

Mikhail Antonov commented on HBASE-13103:
-

[~ndimiduk] any thoughts on the patch? :)



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-02 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393411#comment-14393411
 ] 

Mikhail Antonov commented on HBASE-13103:
-

https://reviews.apache.org/r/32790/ - up on RB



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-01 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390744#comment-14390744
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Since this is a draft, many obviously needed things are missing, namely:

 - being able to choose which table to normalize
 - need to define normalization rules more strictly (including priority of 
operations? if table has both types of outlier in the ranks of its regions - 
too small and too big regions, then what action is more urgent)
 - run normalization across several tables in parallel - is that something we 
should/shouldn't do
 - detecting currently running merges and splits. Current simple rules are just 
that we don't touch system tables and tables with less than 3 regions.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-01 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391934#comment-14391934
 ] 

Mikhail Antonov commented on HBASE-13103:
-

TestRegionServerObserver failure is unrelated; javadoc and checkstyle will be 
fixed in v1 patch



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-04-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390977#comment-14390977
 ] 

Hadoop QA commented on HBASE-13103:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708703/HBASE-13103-v0.patch
  against master branch at commit 874aa9eb85077a4e5ab42d06820692ed379775ca.
  ATTACHMENT ID: 12708703

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
10 warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
1928 checkstyle errors (more than the master's current 1924 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 release 
audit warnings (more than the master's current 0 warnings).

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.coprocessor.TestRegionServerObserver.testCoprocessorHooksInRegionsMerge(TestRegionServerObserver.java:100)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13520//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13520//artifact/patchprocess/patchReleaseAuditWarnings.txt
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13520//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13520//artifact/patchprocess/checkstyle-aggregate.html

Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13520//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/13520//console

This message is automatically generated.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-20 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370963#comment-14370963
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Is it fair to say this one supersedes the one I linked?



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-20 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372509#comment-14372509
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Ha! Seems of identical intent. Good find. Might as well mark this one as a dupe 
and carry on.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-17 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364754#comment-14364754
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Some notes as I'm sketching the draft:

 - generally looks like the basic balancing architecture would just work - a 
chore on master, interface + pluggable initialization, main invocation gets 
kicked in HMaster, period and cutoff time limits.
 - runs on per-table basis (for first cut - could just do all or nothing, then 
add normalization params to table level configuration if needed)
 - normalizer computes a list of normalization plans (which are simply either 
split R1, or merge R1 and R2); those plans are then executed one by one. I 
guess we don't want more than one merge or split going on the table in most 
cases? Execution of a plan is simply figuring out the currently assigned HRS 
and requesting the split over rpc.
 - whole thing is stateless, if master crashed during normalization, on next 
scheduled iteration it will be recomputed anyway

[~ndimiduk] thoughts?
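
The plan computation described in the notes above could be sketched roughly as follows. This is only an illustration under stated assumptions: the thresholds (split a region larger than twice the table's average size, merge two adjacent regions whose combined size is below the average) are made up for the example, plans are represented as plain strings, and `NormalizerSketch` and `computePlans` are hypothetical names rather than anything from the attached patch. The guard for tables with fewer than 3 regions follows the rule mentioned elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class NormalizerSketch {
    /**
     * Computes a list of plans for one table. Each plan is either
     * "SPLIT i" or "MERGE i,i+1", where i is an index into the
     * key-ordered array of region sizes (in MB).
     */
    public static List<String> computePlans(long[] regionSizesMb) {
        List<String> plans = new ArrayList<>();
        // Leave tiny tables alone (fewer than 3 regions).
        if (regionSizesMb.length < 3) {
            return plans;
        }
        long total = 0;
        for (long size : regionSizesMb) {
            total += size;
        }
        double avg = (double) total / regionSizesMb.length;

        int i = 0;
        while (i < regionSizesMb.length) {
            boolean hasNext = i + 1 < regionSizesMb.length;
            if (regionSizesMb[i] > 2 * avg) {
                // Assumed rule: a region far above average gets split.
                plans.add("SPLIT " + i);
                i++;
            } else if (hasNext && regionSizesMb[i] + regionSizesMb[i + 1] < avg) {
                // Assumed rule: two small neighbours get merged.
                plans.add("MERGE " + i + "," + (i + 1));
                i += 2; // skip the neighbour already used in this merge
            } else {
                i++;
            }
        }
        return plans;
    }

    public static void main(String[] args) {
        // Stateless: every run recomputes plans from the current sizes.
        System.out.println(computePlans(new long[] {10, 10, 100, 1000, 80}));
    }
}
```

Because the computation only looks at current region sizes, it keeps the stateless property noted above: if the master crashes mid-run, the next scheduled iteration simply recomputes the plans.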



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-17 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366112#comment-14366112
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Yep, that's how I see it too.

Practically, I think, splitting of excessively large regions (in environments 
with very high max region size or pre-split policy in place) would be a more 
frequent use case than merging, since tables tend to grow over time, not 
shrink.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-17 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366193#comment-14366193
 ] 

Nick Dimiduk commented on HBASE-13103:
--

I've been thinking a lot about this feature in the context of a user who's 
upgrading from 0.92 or 0.94 with little 1-2g regions. Suddenly they have way 
more regions than necessary, so the region merge becomes very useful.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-17 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366195#comment-14366195
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Huh, right. I didn't think from that perspective. Thanks for the pointer.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-17 Thread Dave Latham (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366255#comment-14366255
 ] 

Dave Latham commented on HBASE-13103:
-

We also have cases where some regions of data are correlated with time and also 
have a TTL.  Over time those regions end up with little data in them and other 
regions end up with more data.  It would be great for the small regions to 
merge.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-17 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366012#comment-14366012
 ] 

Nick Dimiduk commented on HBASE-13103:
--

Sounds about right! I'd say the plan can come up with a single action to 
execute per table, and then sleep again -- merging has impact on availability. 
Perhaps it wakes up more frequently than the BalancerChore, maybe every minute 
instead of every 5 by default. Should also have some back-pressure mechanism, 
so it shouldn't start a new merge if existing merge is gummed up. I don't think 
we can do that without HBASE-12439. Maybe it cannot be stateless if that's the 
case, or perhaps it can check if any merge operations are in flight for the 
target table and if so, go back to sleep (or move onto the next table).
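
To make the "one action per table, with back-pressure" idea concrete, here is a minimal sketch. It assumes the chore is handed the already computed plans per table plus a set of tables that currently have a merge or split in flight; how that set would actually be obtained is left open (the comment above suggests it may depend on HBASE-12439). `NormalizerChoreSketch` and `pickActions` are hypothetical names used only for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class NormalizerChoreSketch {
    /**
     * Picks at most one plan per table for this wake-up of the chore.
     * Tables with an operation still in flight are skipped, so a stuck
     * merge is never piled on with further work.
     */
    public static Map<String, String> pickActions(
            Map<String, List<String>> plansByTable,
            Set<String> tablesWithInFlightOps) {
        Map<String, String> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : plansByTable.entrySet()) {
            String table = entry.getKey();
            List<String> plans = entry.getValue();
            if (tablesWithInFlightOps.contains(table)) {
                continue; // back-pressure: move on to the next table
            }
            if (plans.isEmpty()) {
                continue; // nothing to do for this table
            }
            chosen.put(table, plans.get(0)); // single action, then sleep
        }
        return chosen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> plans = new LinkedHashMap<>();
        plans.put("t1", Arrays.asList("MERGE 0,1", "SPLIT 3"));
        plans.put("t2", Arrays.asList("SPLIT 0"));
        Set<String> busy = Collections.singleton("t2");
        System.out.println(pickActions(plans, busy));
    }
}
```

Any remaining plans for a table are simply dropped and recomputed on the next wake-up, which keeps the chore itself free of persistent state.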



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-11 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357769#comment-14357769
 ] 

Mikhail Antonov commented on HBASE-13103:
-

Doing some prototyping here, may have something to show next week..



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-10 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355288#comment-14355288
 ] 

Nick Dimiduk commented on HBASE-13103:
--

[~mantonov] yeah, RegionEqualizer sounds like the right idea. Maybe 
normalizer? But yes, a background chore invokes something that analyzes 
current state and makes efforts to move toward an ideal state.

UI improvements across the board are always welcome and very much encouraged. 
We don't get enough love in that department. A (graphical?) representation of 
region sizes and distribution would be awesome.



[jira] [Commented] (HBASE-13103) [ergonomics] add region size balancing as a feature of master

2015-03-10 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354522#comment-14354522
 ] 

Mikhail Antonov commented on HBASE-13103:
-

[~ndimiduk] if this chore (RegionEqualizer? :) or better name?) is pluggable 
the guarding logic would be easy to experiment with. If I could help with this 
one, I'd be glad to.

As a complementary idea - we can have web UI showing unevenly split 
regions/alerts etc.
