[jira] Updated: (HBASE-3586) Improve the selection of regions to balance

2011-03-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3586:
--

Attachment: hbase-3586-table-creation.txt

 Improve the selection of regions to balance
 ---

 Key: HBASE-3586
 URL: https://issues.apache.org/jira/browse/HBASE-3586
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.90.2

 Attachments: HBASE-3586-by-region-age.patch, 
 HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, 
 hbase-3586-with-sort.txt


 Currently LoadBalancer goes through the list of regions per RS and grabs the 
 first few to balance. This is not bad in itself, but that list is often 
 naturally sorted, since an RS that boots opens its regions sequentially and 
 in sorted order (the list comes from .META.), which means that we're 
 balancing regions starting from an almost sorted list.
 We discovered this because one of our internal users created a new table 
 starting with the letter p, which has grown to 100 regions in the last few 
 hours, all served by 1 region server. Looking at the master's log, the 
 balancer has moved regions off that region server, but they are all from 
 the same table that starts with the letter a (and the regions that were 
 moved all come one after the other).
 The part of the code that should be modified is:
 {code}
 for (HRegionInfo hri: regions) {
   // Don't rebalance meta regions.
   if (hri.isMetaRegion()) continue; 
   regionsToMove.add(new RegionPlan(hri, serverInfo, null));
   numTaken++;
   if (numTaken >= numToOffload) break;
 }
 {code}

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

2011-03-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003065#comment-13003065
 ] 

Ted Yu commented on HBASE-3586:
---

Although balanceCluster() could be made more complex by considering both old 
and new regions, the new patch achieves the same effect.
When creating a table with multiple regions, I check whether there are online 
region servers that carry no regions (this can be relaxed by introducing a 
threshold that separates overloaded and underloaded servers). If there are 
such servers, balance() is called to balance the (relatively old) regions.
Since assignmentManager.assignUserRegions() uses round-robin assignment, the 
cluster would still be balanced when createTable() returns.
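
The check described above might look roughly like the sketch below. It is not taken from the attached patch: the class name, the method name, and the map from server name to region names are all hypothetical stand-ins for the master's real data structures. A threshold of 0 means "carries no regions"; a larger value gives the relaxed variant.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of HBase: decides whether balance() should
// run before the regions of a newly created table are assigned.
class IdleServerCheck {
  /**
   * Returns true if at least one online server carries no more than
   * `threshold` regions. regionsByServer maps a server name to the names
   * of the regions it currently serves (an assumed, simplified view).
   */
  static boolean shouldBalanceBeforeCreate(
      Map<String, List<String>> regionsByServer, int threshold) {
    for (List<String> regions : regionsByServer.values()) {
      if (regions.size() <= threshold) {
        return true; // found an underloaded (or empty) server
      }
    }
    return false;
  }
}
```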





[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

2011-03-05 Thread ryan rawson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003074#comment-13003074
 ] 

ryan rawson commented on HBASE-3586:


Can you try a random approach? Often random can be more predictable and not
have weak edge cases that different use patterns can tickle.
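
For illustration, a random selection could be sketched as follows. This is only a sketch under stated assumptions, not code from any patch on this issue: regions are represented as plain strings, the ".META." prefix test stands in for hri.isMetaRegion(), and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: pick regions to offload uniformly at random instead
// of taking the first few from a (nearly) sorted list.
class RandomRegionPicker {
  /** Returns up to numToOffload non-meta regions in random order. */
  static List<String> pick(List<String> regions, int numToOffload, Random rng) {
    List<String> candidates = new ArrayList<>();
    for (String region : regions) {
      // Don't rebalance meta regions (prefix test is an assumption here).
      if (!region.startsWith(".META.")) {
        candidates.add(region);
      }
    }
    Collections.shuffle(candidates, rng);
    int n = Math.max(0, Math.min(numToOffload, candidates.size()));
    return new ArrayList<>(candidates.subList(0, n));
  }
}
```

Passing a seeded Random keeps a run reproducible in tests while the selection stays independent of table name and region order.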






[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

2011-03-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003075#comment-13003075
 ] 

Ted Yu commented on HBASE-3586:
---

@Ryan: I think you're referring to my first patch. I partially agree. We may 
provide (at least two) policies - one favoring moving young regions and the 
other doing random region selection.

I think my second patch establishes the conditions for the first to function 
as expected.
I would like to hear about other use patterns which are not covered by the 
two patches.





[jira] Commented: (HBASE-3581) hbase rpc should send size of response

2011-03-05 Thread Benoit Sigoure (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003104#comment-13003104
 ] 

Benoit Sigoure commented on HBASE-3581:
---

Yeah I also like the flag suggestion better.  It's easy to implement both in 
the server and in the client.

 hbase rpc should send size of response
 --

 Key: HBASE-3581
 URL: https://issues.apache.org/jira/browse/HBASE-3581
 Project: HBase
  Issue Type: Improvement
Reporter: ryan rawson
Assignee: ryan rawson
 Fix For: 0.92.0

 Attachments: HBASE-rpc-response.txt


 The RPC reply from server to client does not include the size of the 
 payload. It is framed like so:
 i32 callId
 byte errorFlag
 byte[] data
 The data segment would contain enough info about how big the response is so 
 that it could be decoded by a writable reader.
 This makes it difficult to write buffering clients, which might read the 
 entire 'data' and then pass it to a decoder. While less memory efficient, if 
 you want to easily write block-read clients (e.g. nio), it would be necessary 
 to send the size along so that the client could snarf into a local buf.
 The new proposal is:
 i32 callId
 i32 size
 byte errorFlag
 byte[] data
 the size being sizeof(data) + sizeof(errorFlag).
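
 To make the proposed framing concrete, here is a minimal sketch of how a 
 buffering client could use the size field. It is not the HBase RPC code: the 
 class and method names are hypothetical, and big-endian byte order (the 
 ByteBuffer default) is an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the proposed frame:
//   i32 callId, i32 size, byte errorFlag, byte[] data
// where size = sizeof(errorFlag) + sizeof(data).
class ResponseFrame {
  static final class Decoded {
    final int callId;
    final byte errorFlag;
    final byte[] data;
    Decoded(int callId, byte errorFlag, byte[] data) {
      this.callId = callId;
      this.errorFlag = errorFlag;
      this.data = data;
    }
  }

  static byte[] encode(int callId, byte errorFlag, byte[] data) {
    int size = 1 + data.length; // errorFlag byte plus payload
    ByteBuffer buf = ByteBuffer.allocate(4 + 4 + size);
    buf.putInt(callId).putInt(size).put(errorFlag).put(data);
    return buf.array();
  }

  /**
   * Decodes one frame if it is completely buffered. Otherwise returns null
   * and leaves the buffer position unchanged, so the caller can read more
   * bytes and retry.
   */
  static Decoded tryDecode(ByteBuffer in) {
    if (in.remaining() < 8) return null; // header not complete yet
    in.mark();
    int callId = in.getInt();
    int size = in.getInt();
    if (size < 1) throw new IllegalArgumentException("bad size: " + size);
    if (in.remaining() < size) {
      in.reset(); // body not complete yet
      return null;
    }
    byte errorFlag = in.get();
    byte[] data = new byte[size - 1];
    in.get(data);
    return new Decoded(callId, errorFlag, data);
  }
}
```

 With the size in the header, the client can tell whether a whole response 
 has arrived without running the decoder over a partial payload.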





[jira] Commented: (HBASE-3586) Improve the selection of regions to balance

2011-03-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003110#comment-13003110
 ] 

Ted Yu commented on HBASE-3586:
---

Although random selection statistically achieves balance, I still prefer a 
deterministic approach.
Consider the following variant for my patches:
Instead of using the following loop to fill out regionsToMove:
{code}
  for (int i = sz-1; i >= 0; i--) {
    HRegionInfo hri = regions.get(i);
{code}
We alternate between the head and tail of regions (picking both young and old 
ones).

We still keep the following sort():
{code}
Collections.sort(regionsToMove, rpComparator);
{code}
We then iterate through underloadedServers repeatedly, doing the following 
action alternately:
picking one region from head of regionsToMove in passes 1, 3, 5, etc
picking one region from tail of regionsToMove in passes 2, 4, 6, etc.

E.g. suppose RS1 has regions with region Ids of 42, 54, 105 and 201
RS2 has regions with region Ids of 34, 104, 110 and 154
Suppose we need to offload some regions to RS3 and RS4 which didn't carry 
regions.

regionsToMove would contain regions (201, 42, 154, 34) before sorting and 
regions (201, 154, 42, 34) after sorting.
Then we assign
region 201 to RS3, region 154 to RS4
region 34 to RS3, region 42 to RS4
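
The example can be reproduced with the sketch below. It is a simplified illustration, not the patch itself: regions are bare region Ids, each server's list is assumed to be in ascending Id order, rpComparator is assumed to sort by region Id descending, and all names are hypothetical. Unlike a real balancer, it hands out every region rather than stopping once servers reach a target load.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the head/tail alternation described above.
class AlternatingBalancer {
  /** Takes regions alternately from the tail (young) and head (old). */
  static List<Long> pickAlternating(List<Long> regions, int numToOffload) {
    List<Long> picked = new ArrayList<>();
    int head = 0;
    int tail = regions.size() - 1;
    boolean fromTail = true;
    while (picked.size() < numToOffload && head <= tail) {
      picked.add(fromTail ? regions.get(tail--) : regions.get(head++));
      fromTail = !fromTail;
    }
    return picked;
  }

  /**
   * Sorts regionsToMove by Id descending, then cycles through the
   * underloaded servers: odd passes take from the head of the sorted
   * list, even passes take from the tail.
   */
  static Map<String, List<Long>> assign(List<Long> regionsToMove,
                                        List<String> underloaded) {
    List<Long> sorted = new ArrayList<>(regionsToMove);
    sorted.sort(Collections.reverseOrder());
    Deque<Long> queue = new ArrayDeque<>(sorted);
    Map<String, List<Long>> plan = new LinkedHashMap<>();
    for (String server : underloaded) {
      plan.put(server, new ArrayList<>());
    }
    boolean fromHead = true;
    while (!queue.isEmpty() && !underloaded.isEmpty()) {
      for (String server : underloaded) {
        if (queue.isEmpty()) break;
        plan.get(server).add(fromHead ? queue.pollFirst() : queue.pollLast());
      }
      fromHead = !fromHead;
    }
    return plan;
  }
}
```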

Please comment.
