[jira] Updated: (HBASE-3586) Improve the selection of regions to balance
[ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3586:
--------------------------

Attachment: hbase-3586-table-creation.txt

Improve the selection of regions to balance
-------------------------------------------

Key: HBASE-3586
URL: https://issues.apache.org/jira/browse/HBASE-3586
Project: HBase
Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Ted Yu
Priority: Critical
Fix For: 0.90.2
Attachments: HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, hbase-3586-table-creation.txt, hbase-3586-with-sort.txt

Currently LoadBalancer goes through the list of regions per RS and grabs the first few to balance. This is not bad, but that list is often naturally sorted, since an RS that boots opens its regions in sequential, sorted order (the list comes from .META.). This means we start balancing from an almost sorted list.

We discovered this because one of our internal users created a new table whose name starts with the letter p. It has grown to 100 regions in the last few hours, and they are all served by 1 region server. Looking at the master's log, the balancer did move regions off that region server, but they all belong to the same table, one whose name starts with the letter a (and the regions that were moved all come one after the other).

The part of the code that should be modified is:

{code}
for (HRegionInfo hri : regions) {
  // Don't rebalance meta regions.
  if (hri.isMetaRegion()) continue;
  regionsToMove.add(new RegionPlan(hri, serverInfo, null));
  numTaken++;
  if (numTaken >= numToOffload) break;
}
{code}

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HBASE-3586) Improve the selection of regions to balance
[ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003065#comment-13003065 ]

Ted Yu commented on HBASE-3586:
-------------------------------

Although balanceCluster() can be made more complex by considering both old and new regions, the new patch achieves the same effect.

When creating a table with multiple regions, I check whether there are online region servers that carry no regions (this can be relaxed by introducing a threshold that separates overloaded from underloaded servers). If there are such servers, balance() is called to balance the (relatively old) regions.

Since assignmentManager.assignUserRegions() uses round-robin assignment, the cluster would still be balanced when createTable() returns.
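The check described in this comment can be sketched roughly as follows. This is a minimal illustration, not the code from hbase-3586-table-creation.txt: the class name, the method name, and the map of region counts per server are all hypothetical. A threshold of 0 corresponds to "carries no regions"; a larger value gives the relaxed variant mentioned in the comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IdleServerCheck {
    /**
     * Returns true if at least one online server carries no more than
     * `threshold` regions. The map (server name -> region count) is a
     * hypothetical stand-in for whatever the master tracks internally.
     */
    public static boolean hasUnderloadedServer(Map<String, Integer> regionCountByServer,
                                               int threshold) {
        for (int count : regionCountByServer.values()) {
            if (count <= threshold) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("rs1", 100); // the overloaded server
        counts.put("rs2", 0);   // a server that carries no regions
        // With threshold 0, only servers with no regions count as underloaded.
        if (hasUnderloadedServer(counts, 0)) {
            System.out.println("would call balance() before assigning new regions");
        }
    }
}
```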
[jira] Commented: (HBASE-3586) Improve the selection of regions to balance
[ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003074#comment-13003074 ]

ryan rawson commented on HBASE-3586:
------------------------------------

Can you try a random approach? Often random can be more predictable and not have weak edge cases that different use patterns can tickle.
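One way to read the random approach suggested here is to shuffle the candidate regions before taking the first numToOffload of them, instead of taking them in their (nearly sorted) list order. The sketch below is only an illustration under that assumption: Region is a hypothetical stand-in for HRegionInfo, and pick() is not an existing LoadBalancer method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RandomRegionPicker {
    /** Hypothetical, minimal stand-in for HRegionInfo. */
    public static final class Region {
        private final String name;
        private final boolean meta;

        public Region(String name, boolean meta) {
            this.name = name;
            this.meta = meta;
        }

        public String getName() { return name; }
        public boolean isMetaRegion() { return meta; }
    }

    /**
     * Picks up to numToOffload non-meta regions uniformly at random.
     * The input list is left untouched.
     */
    public static List<Region> pick(List<Region> regions, int numToOffload, Random rng) {
        List<Region> candidates = new ArrayList<>();
        for (Region r : regions) {
            // Don't rebalance meta regions.
            if (!r.isMetaRegion()) {
                candidates.add(r);
            }
        }
        Collections.shuffle(candidates, rng);
        int n = Math.max(0, Math.min(numToOffload, candidates.size()));
        return new ArrayList<>(candidates.subList(0, n));
    }
}
```

Passing the Random in makes the selection reproducible in tests while still breaking the dependence on list order in production.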
[jira] Commented: (HBASE-3586) Improve the selection of regions to balance
[ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003075#comment-13003075 ]

Ted Yu commented on HBASE-3586:
-------------------------------

@Ryan: I think you're referring to my first patch. I partially agree. We may provide (at least two) policies: one favoring moving young regions and the other doing random region selection.

I think my second patch establishes the conditions for the first to function as expected. I would like to hear about other use patterns that are not covered by either patch.
[jira] Commented: (HBASE-3581) hbase rpc should send size of response
[ https://issues.apache.org/jira/browse/HBASE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003104#comment-13003104 ]

Benoit Sigoure commented on HBASE-3581:
---------------------------------------

Yeah I also like the flag suggestion better. It's easy to implement both in the server and in the client.

hbase rpc should send size of response
--------------------------------------

Key: HBASE-3581
URL: https://issues.apache.org/jira/browse/HBASE-3581
Project: HBase
Issue Type: Improvement
Reporter: ryan rawson
Assignee: ryan rawson
Fix For: 0.92.0
Attachments: HBASE-rpc-response.txt

The RPC reply from Server->Client does not include the size of the payload. It is framed like so:

i32 callId
byte errorFlag
byte[] data

The data segment would contain enough info about how big the response is so that it could be decoded by a writable reader.

This makes it difficult to write buffering clients, which might read the entire 'data' and then pass it to a decoder. While less memory efficient, if you want to easily write block-read clients (e.g. NIO), it would be necessary to send the size along so that the client could snarf the response into a local buffer.

The new proposal is:

i32 callId
i32 size
byte errorFlag
byte[] data

the size being sizeof(data) + sizeof(errorFlag).

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
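To illustrate why the size field helps a buffering client, here is a small sketch of the proposed framing using java.nio.ByteBuffer. It is not HBase's RPC code: the class and method names are hypothetical, big-endian integers are assumed (ByteBuffer's default), and encoding errorFlag as 1 for an error is an assumption. The point is that the decoder can tell from the first 8 bytes whether a whole frame has arrived, without understanding the payload.

```java
import java.nio.ByteBuffer;

public class ResponseFrameSketch {
    /** Decoded frame; a hypothetical holder type for this sketch. */
    public static final class Frame {
        public final int callId;
        public final boolean error;
        public final byte[] data;

        Frame(int callId, boolean error, byte[] data) {
            this.callId = callId;
            this.error = error;
            this.data = data;
        }
    }

    /** Writes: i32 callId, i32 size, byte errorFlag, byte[] data. */
    public static ByteBuffer encode(int callId, boolean error, byte[] data) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + 1 + data.length);
        buf.putInt(callId);
        buf.putInt(1 + data.length);       // size = sizeof(errorFlag) + sizeof(data)
        buf.put((byte) (error ? 1 : 0));   // assumption: 1 means error
        buf.put(data);
        buf.flip();
        return buf;
    }

    /**
     * Returns a frame if one is fully buffered, or null if more bytes are
     * needed. On null, the buffer position is left unchanged.
     */
    public static Frame tryDecode(ByteBuffer buf) {
        if (buf.remaining() < 8) {
            return null;                   // header not complete yet
        }
        int size = buf.getInt(buf.position() + 4); // peek without consuming
        if (size < 1) {
            throw new IllegalArgumentException("size must cover the errorFlag byte");
        }
        if (buf.remaining() < 8 + size) {
            return null;                   // body not complete yet
        }
        int callId = buf.getInt();
        buf.getInt();                      // skip size, already read above
        boolean error = buf.get() != 0;
        byte[] data = new byte[size - 1];
        buf.get(data);
        return new Frame(callId, error, data);
    }
}
```

With the original framing (no size field), tryDecode could not return null early; it would have to parse the payload itself to find where the response ends.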
[jira] Commented: (HBASE-3586) Improve the selection of regions to balance
[ https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003110#comment-13003110 ]

Ted Yu commented on HBASE-3586:
-------------------------------

Although random selection statistically achieves balance, I still prefer a deterministic approach. Consider the following variant of my patches.

Instead of using the following loop to fill out regionsToMove:

{code}
for (int i = sz-1; i >= 0; i--) {
  HRegionInfo hri = regions.get(i);
{code}

we alternate between the head and tail of regions (picking both young and old ones).

We still keep the following sort():

{code}
Collections.sort(regionsToMove, rpComparator);
{code}

We then iterate through underloadedServers repeatedly, alternating between two actions:
- picking one region from the head of regionsToMove in passes 1, 3, 5, etc.
- picking one region from the tail of regionsToMove in passes 2, 4, 6, etc.

E.g. suppose
RS1 has regions with region Ids of 42, 54, 105 and 201
RS2 has regions with region Ids of 34, 104, 110 and 154

Suppose we need to offload some regions to RS3 and RS4, which didn't carry regions. regionsToMove would contain regions (201, 42, 154, 34) before sorting and regions (201, 154, 42, 34) after sorting.

Then we assign
region 201 to RS3, region 154 to RS4
region 34 to RS3, region 42 to RS4

Please comment.
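The variant described in this comment can be sketched as below and checked against the worked example. This is an illustration only, not the attached patch: regions are reduced to their ids, the class and method names are hypothetical, and rpComparator is assumed to order regions by descending id (which matches the sorted list in the example; treating a higher id as a younger region is likewise an assumption).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AlternatingBalanceSketch {
    /** Picks numToOffload regions from one server, alternating tail, head, tail, ... */
    public static List<Long> pickAlternating(List<Long> regions, int numToOffload) {
        List<Long> picked = new ArrayList<>();
        int head = 0;
        int tail = regions.size() - 1;
        boolean fromTail = true;
        while (picked.size() < numToOffload && head <= tail) {
            picked.add(fromTail ? regions.get(tail--) : regions.get(head++));
            fromTail = !fromTail;
        }
        return picked;
    }

    /**
     * Sorts regionsToMove by descending id, then walks the underloaded servers
     * repeatedly: odd passes hand out regions from the head of the sorted list,
     * even passes from the tail.
     */
    public static Map<String, List<Long>> assign(List<Long> regionsToMove,
                                                 List<String> underloadedServers) {
        Map<String, List<Long>> plan = new LinkedHashMap<>();
        for (String server : underloadedServers) {
            plan.put(server, new ArrayList<>());
        }
        if (underloadedServers.isEmpty()) {
            return plan; // nothing to assign to
        }
        List<Long> sorted = new ArrayList<>(regionsToMove);
        sorted.sort(Comparator.reverseOrder());
        Deque<Long> queue = new ArrayDeque<>(sorted);
        boolean fromHead = true;
        while (!queue.isEmpty()) {
            for (String server : underloadedServers) {
                if (queue.isEmpty()) {
                    break;
                }
                plan.get(server).add(fromHead ? queue.pollFirst() : queue.pollLast());
            }
            fromHead = !fromHead;
        }
        return plan;
    }
}
```

With RS1 = (42, 54, 105, 201) and RS2 = (34, 104, 110, 154), offloading two regions from each gives regionsToMove = (201, 42, 154, 34), and assign() over RS3 and RS4 yields RS3 = (201, 34) and RS4 = (154, 42), the same assignment as in the example above.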