[jira] [Commented] (HBASE-17110) Improve SimpleLoadBalancer to always take server-level balance into account
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992622#comment-15992622 ]

Charlie Qiangeng Xu commented on HBASE-17110:
---------------------------------------------

Having several sticky pieces of work on hand now, yet I will definitely squeeze out time for this :)

> Improve SimpleLoadBalancer to always take server-level balance into account
> ---------------------------------------------------------------------------
>
> Key: HBASE-17110
> URL: https://issues.apache.org/jira/browse/HBASE-17110
> Project: HBase
> Issue Type: Improvement
> Components: Balancer
> Affects Versions: 2.0.0, 1.2.4
> Reporter: Charlie Qiangeng Xu
> Assignee: Charlie Qiangeng Xu
> Fix For: 2.0.0
>
> Attachments: HBASE-17110.patch, HBASE-17110-V2.patch, HBASE-17110-V3.patch, HBASE-17110-V4.patch, HBASE-17110-V5.patch, HBASE-17110-V6.patch, HBASE-17110-V7.patch, HBASE-17110-V8.patch
>
>
> Currently, with the byTable strategy there might still be server-level imbalance; this JIRA improves that.
> Some more background:
> When operating large-scale clusters (our case), some companies still prefer to use {{SimpleLoadBalancer}} due to its simplicity, quick balance-plan generation, etc. The current SimpleLoadBalancer has two modes:
> 1. byTable, which only guarantees that the regions of one table are uniformly distributed.
> 2. byCluster, which ignores the distribution within tables and balances all the regions together.
> If the pressures on different tables differ, the byTable option is preferable in most cases. Yet this choice sacrifices cluster-level balance and can leave some servers with significantly higher load, e.g. 242 regions on server A but 417 regions on server B (real-world stats).
> Consider this case: a cluster has 3 tables and 4 servers:
> {noformat}
> server A has 3 regions: table1:1, table2:1, table3:1
> server B has 3 regions: table1:2, table2:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 0 regions.
> {noformat}
> From the byTable strategy's perspective, the cluster is already perfectly balanced at the table level. But a perfect state would look like:
> {noformat}
> server A has 2 regions: table2:1, table3:1
> server B has 2 regions: table1:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 2 regions: table1:1, table2:2
> {noformat}
> The server loads change from 3,3,3,0 to 2,2,3,2, while table1, table2, and table3 all stay balanced. This is the goal this JIRA tries to achieve.
> Two UTs will be added as well, with the last one demonstrating the advantage of the new strategy. Also, an onConfigurationChange method will be implemented to hot-control the "slop" variable.
> We have been using the strategy on our largest cluster for several months, so its effect can be assured to some extent.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
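The improvement in the example above can be quantified with a minimal stand-alone check (illustrative only, not HBase code; the class and method names are made up) that compares the spread between the most and least loaded servers before and after:

```java
import java.util.List;

// Illustrative only: the byTable-perfect assignment above leaves server
// loads 3,3,3,0; the overall-balanced assignment gives 2,2,3,2.
public class OverallBalanceExample {
    // spread between the most and least loaded server
    static int maxMinGap(List<Integer> serverLoads) {
        int max = Integer.MIN_VALUE, min = Integer.MAX_VALUE;
        for (int load : serverLoads) {
            max = Math.max(max, load);
            min = Math.min(min, load);
        }
        return max - min;
    }

    public static void main(String[] args) {
        System.out.println(maxMinGap(List.of(3, 3, 3, 0))); // 3: server D sits idle
        System.out.println(maxMinGap(List.of(2, 2, 3, 2))); // 1: nearly flat
    }
}
```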
[jira] [Commented] (HBASE-17110) Improve SimpleLoadBalancer to always take server-level balance into account
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711552#comment-15711552 ]

Charlie Qiangeng Xu commented on HBASE-17110:
---------------------------------------------

Checked the integrated build; the failure is inside the stochastic balancer tests. It shouldn't be related, since there is no change inside STB, and the interface change has no effect on STB. Ran this test several times; all passed on my local build.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704794#comment-15704794 ]

Charlie Qiangeng Xu edited comment on HBASE-17110 at 11/29/16 11:07 AM:
------------------------------------------------------------------------

Checked the failed test "TestHRegionWithInMemoryFlush"; it's unrelated to the patch and passed on my local build. Should be fine.

was (Author: xharlie):
Checked the failed test "TestHRegionWithInMemoryFlush", it's unrelated to the patch and on my local build, should be fine

> Add an "Overall Strategy" option (balanced both on table level and server level) to SimpleLoadBalancer
> ------------------------------------------------------------------------------------------------------
>
> Key: HBASE-17110
> URL: https://issues.apache.org/jira/browse/HBASE-17110
> Project: HBase
> Issue Type: Improvement
> Components: Balancer
> Affects Versions: 2.0.0, 1.2.4
> Reporter: Charlie Qiangeng Xu
> Assignee: Charlie Qiangeng Xu
> Attachments: HBASE-17110-V2.patch, HBASE-17110-V3.patch, HBASE-17110-V4.patch, HBASE-17110-V5.patch, HBASE-17110-V6.patch, HBASE-17110-V7.patch, HBASE-17110-V8.patch, HBASE-17110.patch
>
>
> This JIRA is about an enhancement of SimpleLoadBalancer. Here we introduce a new strategy, "bytableOverall", which can be enabled by adding:
> {noformat}
> <property>
>   <name>hbase.master.loadbalance.bytableOverall</name>
>   <value>true</value>
> </property>
> {noformat}
> We have been using the strategy on our largest cluster for several months. It has proven to be very helpful and stable; in particular, the result is quite visible to the users.
> Here is why it's helpful:
> When operating large-scale clusters (our case), some companies still prefer to use {{SimpleLoadBalancer}} due to its simplicity, quick balance-plan generation, etc. The current SimpleLoadBalancer has two modes:
> 1. byTable, which only guarantees that the regions of one table are uniformly distributed.
> 2. byCluster, which ignores the distribution within tables and balances all the regions together.
> If the pressures on different tables differ, the byTable option is preferable in most cases. Yet this choice sacrifices cluster-level balance and can leave some servers with significantly higher load, e.g. 242 regions on server A but 417 regions on server B (real-world stats).
> Consider this case: a cluster has 3 tables and 4 servers:
> {noformat}
> server A has 3 regions: table1:1, table2:1, table3:1
> server B has 3 regions: table1:2, table2:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 0 regions.
> {noformat}
> From the byTable strategy's perspective, the cluster is already perfectly balanced at the table level. But a perfect state would look like:
> {noformat}
> server A has 2 regions: table2:1, table3:1
> server B has 2 regions: table1:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 2 regions: table1:1, table2:2
> {noformat}
> The server loads change from 3,3,3,0 to 2,2,3,2, while table1, table2, and table3 all stay balanced.
> This is what the new mode "byTableOverall" can achieve.
> Two UTs have been added as well, and the last one demonstrates the advantage of the new strategy.
> Also, an onConfigurationChange method has been implemented to hot-control the "slop" variable.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
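The "slop" variable mentioned in the description is the balancer's tolerance factor. A rough sketch of how a slop threshold decides whether a set of region counts needs balancing (a simplified stand-in for illustration, not the actual SimpleLoadBalancer code):

```java
public class SlopCheck {
    // Servers are acceptable while their region count stays within
    // [floor(avg * (1 - slop)), ceil(avg * (1 + slop))].
    static boolean needsBalancing(int[] regionsPerServer, float slop) {
        double sum = 0;
        for (int r : regionsPerServer) {
            sum += r;
        }
        double avg = sum / regionsPerServer.length;
        int ceiling = (int) Math.ceil(avg * (1 + slop));
        int floor = (int) Math.floor(avg * (1 - slop));
        for (int r : regionsPerServer) {
            if (r > ceiling || r < floor) {
                return true; // at least one server out of tolerance
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // the example loads from the description, with a hypothetical slop of 0.2
        System.out.println(needsBalancing(new int[] {3, 3, 3, 0}, 0.2f)); // true
        System.out.println(needsBalancing(new int[] {2, 2, 3, 2}, 0.2f)); // false
    }
}
```

Hot-controlling "slop" via onConfigurationChange then just means re-reading this tolerance without restarting the master.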
[jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704794#comment-15704794 ]

Charlie Qiangeng Xu commented on HBASE-17110:
---------------------------------------------

Checked the failed test "TestHRegionWithInMemoryFlush"; it's unrelated to the patch and passed on my local build. Should be fine.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charlie Qiangeng Xu updated HBASE-17110:
----------------------------------------
    Attachment: HBASE-17110-V8.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charlie Qiangeng Xu updated HBASE-17110:
----------------------------------------
    Attachment: HBASE-17110-V7.patch

Fixed a problem.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charlie Qiangeng Xu updated HBASE-17110:
----------------------------------------
    Attachment: HBASE-17110-V6.patch

Finally, I think adding a setClusterLoad function to the LoadBalancer interface would be the best choice. Uploaded a new patch; the code looks much cleaner.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
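What such a setClusterLoad hand-off would have to compute can be sketched as follows (a simplified model with String stand-ins for ServerName/TableName/HRegionInfo; the real signature is whatever the patch defines): the master flattens the per-table assignment map into a per-server total before per-table balancing runs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model: table -> server -> regions, flattened to server -> total.
public class ClusterLoadSketch {
    static Map<String, Integer> totalRegionsPerServer(
            Map<String, Map<String, List<String>>> clusterLoad) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map<String, List<String>> byServer : clusterLoad.values()) {
            byServer.forEach((server, regions) ->
                totals.merge(server, regions.size(), Integer::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<String>>> load = Map.of(
            "table1", Map.of("A", List.of("r1"), "B", List.of("r2")),
            "table2", Map.of("A", List.of("r3")));
        System.out.println(totalRegionsPerServer(load)); // e.g. {A=2, B=1}, order may vary
    }
}
```

With this total in hand, the per-table balancing pass can see the whole cluster's server loads even though it only moves one table's regions at a time.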
[jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694615#comment-15694615 ]

Charlie Qiangeng Xu commented on HBASE-17110:
---------------------------------------------

Good suggestion! Should I add the following to the LoadBalancer interface? It would be a neat solution.
{noformat}
  /**
   * Perform the major balance operation
   * @param tableName
   * @param clusterState
   * @param regionStates
   * @return List of plans
   */
  List<RegionPlan> balanceCluster(TableName tableName,
      Map<ServerName, List<HRegionInfo>> clusterState,
      RegionStates regionStates) throws HBaseIOException;
{noformat}
What do you think [~tedyu] [~anoop.hbase]? Thanks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Issue Comment Deleted] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charlie Qiangeng Xu updated HBASE-17110:
----------------------------------------
    Comment: was deleted

(was: Hi Anoop, ideally I shouldn't add extra logic for SLB, but merely reusing the original code flow in HMaster can't satisfy this new requirement. Irrespective of the type of balancer, the original flow works as:
Step 1: HMaster:balance() calls RegionStates:getAssignmentsByTable().
Step 2: RegionStates:getAssignmentsByTable() checks whether the mode is byTable:
(1) if byTable: return a Map<TableName, Map<ServerName, List<HRegionInfo>>> of the whole cluster;
(2) if not byTable: return the same Map<TableName, Map<ServerName, List<HRegionInfo>>>, but with every TableName replaced by "hbase:ensemble".
Step 3: a for loop executes the plan generation:
for (Entry<TableName, Map<ServerName, List<HRegionInfo>>> e : assignmentsByTable.entrySet()) {
  // add balance plan for the table of this TableName
}
So if it is byTable, the Balancer class generates a plan for one specific table at a time. However, for not byTable, since there is only one table called "hbase:ensemble", the for loop runs once and does everything together.
For the enhancement, we want to handle every table separately but still be aware of the server loads of the whole cluster. So for SLB, I added logic in HMaster to get the whole cluster's server loads. If we decide not to go this way, there are two conditions:
(1) the user's configuration is not byTable: SLB will get a Map<TableName, Map<ServerName, List<HRegionInfo>>> with only one table name, "hbase:ensemble". We need to first get the load of each server from this map. Then, with the code logic copied from getAssignmentsByTable into SLB, we can split the map into entries with real table names and loop through that new map.
(2) the user's configuration is byTable: I would call RegionStates:getAssignmentsByTable() to get the server-load info in the first loop.
Another problem is that I would have to check the user's configuration to decide whether (1) or (2) should execute.
)
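The three-step flow described in the deleted comment can be sketched as follows (a simplified model with String stand-ins for the HBase types; "hbase:ensemble" is the real pseudo-table name from the comment):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AssignmentsByTableSketch {
    static final String ENSEMBLE = "hbase:ensemble";

    // Step 2 of the flow: byTable keeps the real table names; otherwise
    // every assignment is folded under the single "hbase:ensemble" key,
    // so step 3's per-table loop runs exactly once over the whole cluster.
    static Map<String, Map<String, List<String>>> getAssignmentsByTable(
            Map<String, Map<String, List<String>>> real, boolean byTable) {
        if (byTable) {
            return real;
        }
        Map<String, List<String>> merged = new HashMap<>();
        for (Map<String, List<String>> byServer : real.values()) {
            byServer.forEach((server, regions) ->
                merged.computeIfAbsent(server, s -> new ArrayList<>()).addAll(regions));
        }
        Map<String, Map<String, List<String>>> result = new HashMap<>();
        result.put(ENSEMBLE, merged);
        return result;
    }
}
```

This makes the tension concrete: in byTable mode each loop iteration sees only one table's regions, while in the folded mode the per-server totals are visible but the table structure is gone, which is why the overall strategy needs both views at once.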
[jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693274#comment-15693274 ]

Charlie Qiangeng Xu commented on HBASE-17110:
---------------------------------------------

Hi Anoop, ideally I shouldn't add extra logic for SLB, but merely reusing the original code flow in HMaster can't satisfy this new requirement. Irrespective of the type of balancer, the original flow works as:
Step 1: HMaster:balance() calls RegionStates:getAssignmentsByTable().
Step 2: RegionStates:getAssignmentsByTable() checks whether the mode is byTable:
(1) if byTable: return a Map<TableName, Map<ServerName, List<HRegionInfo>>> of the whole cluster;
(2) if not byTable: return the same Map<TableName, Map<ServerName, List<HRegionInfo>>>, but with every TableName replaced by "hbase:ensemble".
Step 3: a for loop executes the plan generation:
for (Entry<TableName, Map<ServerName, List<HRegionInfo>>> e : assignmentsByTable.entrySet()) {
  // add balance plan for the table of this TableName
}
So if it is byTable, the Balancer class generates a plan for one specific table at a time. However, for not byTable, since there is only one table called "hbase:ensemble", the for loop runs once and does everything together.
For the enhancement, we want to handle every table separately but still be aware of the server loads of the whole cluster. So for SLB, I added logic in HMaster to get the whole cluster's server loads. If we decide not to go this way, there are two conditions:
(1) the user's configuration is not byTable: SLB will get a Map<TableName, Map<ServerName, List<HRegionInfo>>> with only one table name, "hbase:ensemble". We need to first get the load of each server from this map. Then, with the code logic copied from getAssignmentsByTable into SLB, we can split the map into entries with real table names and loop through that new map.
(2) the user's configuration is byTable: I would call RegionStates:getAssignmentsByTable() to get the server-load info in the first loop.
Another problem is that I would have to check the user's configuration to decide whether (1) or (2) should execute.
> Add an "Overall Strategy" option(balanced both on table level and server > level) to SimpleLoadBalancer > - > > Key: HBASE-17110 > URL: https://issues.apache.org/jira/browse/HBASE-17110 > Project: HBase > Issue Type: Improvement > Components: Balancer >Affects Versions: 2.0.0, 1.2.4 >Reporter: Charlie Qiangeng Xu >Assignee: Charlie Qiangeng Xu > Attachments: HBASE-17110-V2.patch, HBASE-17110-V3.patch, > HBASE-17110-V4.patch, HBASE-17110-V5.patch, HBASE-17110.patch > > > This jira is about an enhancement of simpleLoadBalancer. Here we introduce a > new strategy: "bytableOverall" which could be controlled by adding: > {noformat} > > hbase.master.loadbalance.bytableOverall > true > > {noformat} > We have been using the strategy on our largest cluster for several months. > it's proven to be very helpful and stable, especially, the result is quite > visible to the users. > Here is the reason why it's helpful: > When operating large scale clusters(our case), some companies still prefer to > use {{SimpleLoadBalancer}} due to its simplicity, quick balance plan > generation, etc. Current SimpleLoadBalancer has two modes: > 1. byTable, which only guarantees that the regions of one table could be > uniformly distributed. > 2. byCluster, which ignores the distribution within tables and balance the > regions all together. > If the pressures on different tables are different, the first byTable option > is the preferable one in most case. Yet, this choice sacrifice the cluster > level balance and would cause some servers to have significantly higher load, > e.g. 242 regions on server A but 417 regions on server B.(real world stats) > Consider this case, a cluster has 3 tables and 4 servers: > {noformat} > server A has 3 regions: table1:1, table2:1, table3:1 > server B has 3 regions: table1:2, table2:2, table3:2 > server C has 3 regions: table1:3, table2:3, table3:3 > server D has 0 regions. 
> {noformat} > From the byTable strategy's perspective, the cluster has already been > perfectly balanced on table level. But a perfect status should be like: > {noformat} > server A has 2 regions: table2:1, table3:1 > server B has 2 regions: table1:2, table3:2 > server C has 3 regions: table1:3, table2:3, table3:3 > server D has 2 regions: table1:1, table2:2 > {noformat} > We can see the server loads change from 3,3,3,0 to 2,2,3,2, while the table1, > table2 and table3 still keep balanced. > And this is what the new mode "byTableOverall" can achieve. > Two UTs have been added as well and the last one demonstrates the advantage > of the new strategy. > Also, a onConfigurationChange method has been implemented
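As a sanity check on the example above, a tiny helper (an assumed utility, not HBase code) can confirm that the target layout is balanced within one region on both levels, while the starting layout fails on the server level. The counts are copied from the description.

```java
import java.util.*;

// Hedged helper: a set of counts is "balanced" when max - min <= 1,
// matching the example's notion of a perfectly spread load.
public class BothLevelBalanceCheck {
  public static boolean balancedWithinOne(Collection<Integer> counts) {
    int max = Collections.max(counts);
    int min = Collections.min(counts);
    return max - min <= 1;
  }
}
```

Server loads 3,3,3,0 fail this check, the target 2,2,3,2 passes, and each table's per-server counts (e.g. table1 held as 0,1,1,1 across A..D in the target) also pass, which is exactly the dual guarantee byTableOverall aims for.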
[jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693275#comment-15693275 ] Charlie Qiangeng Xu commented on HBASE-17110: - Hi Anoop, ideal I shouldn't add extra for SLB, but only reuse the original code flow in HMaster can't satisfy the need for this new requirement. Irrespective of the type of balancer, the original flow works as: Step 1 Hmaster:balance() call-> RegionStates:getAssignmentsByTable() Step 2 RegionStates:getAssignmentsByTable() check if it is bytable or not: (1). if bytable: return a map of> of the whole cluster (2). if not bytable: return a map of Map > , but all TableName would be replaced by "hbase:ensemble" step 3 A for loop to execute the plan generation: for (Entry > e : assignmentsByTable.entrySet()) { add balance plan for the table of this TableName- } So if it is bytable, every time the Balancer class generate plan for a specific table. However for not bytable, since there is only one table called "hbase:ensemble", the for loop block would loop once and do everything together. For the enhancement, we want to do every table separately but still aware the the server loads of whole cluster as well. So for SLB, I add logic in HMaster to get whole server loads. If we decide not going this way, there are two condition: (1) user's configuration is not bytable: SLB will get a Map > with only one table name: "hbase:ensemble" we need to first get load for each server from this map. Then, with the code logic copy from getAssignmentsByTable into SLB, we can separate the map to entries with real tablename, then loop through that new map. (2) user's configuration is bytable: I will call RegionStates:getAssignmentsByTable() to get server load info at the first loop. Another problem would be I have to check the users' configuration to decide (1) or (2) should I execute. 
> Add an "Overall Strategy" option(balanced both on table level and server > level) to SimpleLoadBalancer > - > > Key: HBASE-17110 > URL: https://issues.apache.org/jira/browse/HBASE-17110 > Project: HBase > Issue Type: Improvement > Components: Balancer >Affects Versions: 2.0.0, 1.2.4 >Reporter: Charlie Qiangeng Xu >Assignee: Charlie Qiangeng Xu > Attachments: HBASE-17110-V2.patch, HBASE-17110-V3.patch, > HBASE-17110-V4.patch, HBASE-17110-V5.patch, HBASE-17110.patch > > > This jira is about an enhancement of simpleLoadBalancer. Here we introduce a > new strategy: "bytableOverall" which could be controlled by adding: > {noformat} > > hbase.master.loadbalance.bytableOverall > true > > {noformat} > We have been using the strategy on our largest cluster for several months. > it's proven to be very helpful and stable, especially, the result is quite > visible to the users. > Here is the reason why it's helpful: > When operating large scale clusters(our case), some companies still prefer to > use {{SimpleLoadBalancer}} due to its simplicity, quick balance plan > generation, etc. Current SimpleLoadBalancer has two modes: > 1. byTable, which only guarantees that the regions of one table could be > uniformly distributed. > 2. byCluster, which ignores the distribution within tables and balance the > regions all together. > If the pressures on different tables are different, the first byTable option > is the preferable one in most case. Yet, this choice sacrifice the cluster > level balance and would cause some servers to have significantly higher load, > e.g. 242 regions on server A but 417 regions on server B.(real world stats) > Consider this case, a cluster has 3 tables and 4 servers: > {noformat} > server A has 3 regions: table1:1, table2:1, table3:1 > server B has 3 regions: table1:2, table2:2, table3:2 > server C has 3 regions: table1:3, table2:3, table3:3 > server D has 0 regions. 
> {noformat} > From the byTable strategy's perspective, the cluster has already been > perfectly balanced on table level. But a perfect status should be like: > {noformat} > server A has 2 regions: table2:1, table3:1 > server B has 2 regions: table1:2, table3:2 > server C has 3 regions: table1:3, table2:3, table3:3 > server D has 2 regions: table1:1, table2:2 > {noformat} > We can see the server loads change from 3,3,3,0 to 2,2,3,2, while the table1, > table2 and table3 still keep balanced. > And this is what the new mode "byTableOverall" can achieve. > Two UTs have been added as well and the last one demonstrates the advantage > of the new strategy. > Also, a onConfigurationChange method has been implemented
[jira] [Issue Comment Deleted] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Comment: was deleted 
[jira] [Comment Edited] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15689060#comment-15689060 ] Charlie Qiangeng Xu edited comment on HBASE-17110 at 11/23/16 6:10 AM: --- Hi [~zghaobac], thanks for pointing these out. {quote} If we decide this is a default strategy, this method seems doesn't need 2 arguments? {quote} We indeed don't need two arguments for SimpleLoadBalancer, but unfortunately the method is shared by StochasticLoadBalancer and other balancers as well. Even if StochasticLoadBalancer doesn't need "byTable" anymore, we should at least accommodate existing customized balancers that some users may have in place. {quote} This config is not necessary if this is default strategy? {quote} This config is for the strategy itself and would be helpful for a power user. I deliberately added it since it provides better control over the threshold of the cluster-level load difference, which is usually more tolerable than the table-level one. For most users, overallSlop is simply the same as slop by default. 
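The slop/overallSlop discussion can be illustrated with a hedged sketch. The exact formula inside SimpleLoadBalancer may differ, so treat needsBalance below as an assumption about the semantics (a load is out of bounds when it falls outside the average widened by the slop fraction), not the real implementation; overallSlop would simply apply the same test to whole-cluster server loads instead of one table's loads.

```java
import java.util.*;

// Hedged sketch of a slop-style threshold check (illustrative semantics).
public class SlopSketch {
  public static boolean needsBalance(int[] loads, float slop) {
    double avg = Arrays.stream(loads).average().orElse(0);
    long floor = (long) Math.floor(avg * (1 - slop));
    long ceiling = (long) Math.ceil(avg * (1 + slop));
    for (int load : loads) {
      // any server outside [floor, ceiling] triggers a balance round
      if (load < floor || load > ceiling) return true;
    }
    return false;
  }
}
```

With the example's loads, 3,3,3,0 trips the check at slop 0.2 (the empty server sits below the floor), while the target 2,2,3,2 does not; a larger overallSlop would make the cluster-level check correspondingly more tolerant.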
> Add an "Overall Strategy" option(balanced both on table level and server > level) to SimpleLoadBalancer > - > > Key: HBASE-17110 > URL: https://issues.apache.org/jira/browse/HBASE-17110 > Project: HBase > Issue Type: New Feature > Components: Balancer >Affects Versions: 2.0.0, 1.2.4 >Reporter: Charlie Qiangeng Xu >Assignee: Charlie Qiangeng Xu > Attachments: HBASE-17110-V2.patch, HBASE-17110-V3.patch, > HBASE-17110-V4.patch, HBASE-17110-V5.patch, HBASE-17110.patch > > > This jira is about an enhancement of simpleLoadBalancer. Here we introduce a > new strategy: "bytableOverall" which could be controlled by adding: > {noformat} > > hbase.master.loadbalance.bytableOverall > true > > {noformat} > We have been using the strategy on our largest cluster for several months. > it's proven to be very helpful and stable, especially, the result is quite > visible to the users. > Here is the reason why it's helpful: > When operating large scale clusters(our case), some companies still prefer to > use {{SimpleLoadBalancer}} due to its simplicity, quick balance plan > generation, etc. Current SimpleLoadBalancer has two modes: > 1. byTable, which only guarantees that the regions of one table could be > uniformly distributed. > 2. byCluster, which ignores the distribution within tables and balance the > regions all together. > If the pressures on different tables are different, the first byTable option > is the preferable one in most case. Yet, this choice sacrifice the cluster > level balance and would cause some servers to have significantly higher load, > e.g. 242 regions on server A but 417 regions on server B.(real world stats) > Consider this case, a cluster has 3 tables and 4 servers: > {noformat} > server A has 3 regions: table1:1, table2:1, table3:1 > server B has 3 regions: table1:2, table2:2, table3:2 > server C has 3 regions: table1:3, table2:3, table3:3 > server D has 0 regions. 
> {noformat} > From the byTable strategy's perspective, the cluster has already been > perfectly balanced on table level. But a perfect status should be like: > {noformat} > server A has 2 regions: table2:1, table3:1 > server B has 2 regions: table1:2, table3:2 > server C has 3 regions: table1:3, table2:3, table3:3 > server D has 2 regions: table1:1, table2:2 > {noformat} > We can see the server loads change from 3,3,3,0 to 2,2,3,2, while the table1, > table2 and table3 still keep balanced. > And this is what the new mode "byTableOverall" can achieve. > Two UTs have been added as well and the last one demonstrates the advantage > of the new strategy. > Also, a onConfigurationChange method has been implemented to hot control the
[jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15689060#comment-15689060 ] Charlie Qiangeng Xu commented on HBASE-17110: - Hi Guanghao, thanks for pointing these out. {quote} If we decide this is a default strategy, this method seems doesn't need 2 arguments? {quote} We indeed don't need two variables for SimpleLoadBalancer, but unfortunately the method is shared by StochasticLoadBalancer and other balancers as well. Even if StochasticLoadBalancer doesn't need "byTable" anymore, we should at least accommodate existing customized balancers that some users may have in place. {quote} This config is not necessary if this is default strategy? {quote} This config is for the strategy itself and would be helpful for a power user. I deliberately added it since it provides better control over the threshold of the cluster-level load difference, which is usually more tolerable than the table-level one. For most users, overallSlop is simply the same as slop by default. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-17159) Improve the assignment plan when server aborted or creating tables, etc.
[ https://issues.apache.org/jira/browse/HBASE-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688833#comment-15688833 ] Charlie Qiangeng Xu commented on HBASE-17159: - From the comment on the roundRobinAssignment function in BaseLoadBalancer.java: {noformat} * Currently implemented as a round-robin assignment. Same invariant as load * balancing, all servers holding floor(avg) or ceiling(avg). * * TODO: Use block locations from HDFS to place regions with their blocks {noformat} and inside the body of the roundRobinAssignment function: {noformat} // TODO: instead of retainAssignment() and roundRobinAssignment(), we should just run the // normal LB.balancerCluster() with unassignedRegions. We only need to have a candidate // generator for AssignRegionAction. The LB will ensure the regions are mostly local // and balanced. This should also run fast with fewer number of iterations. {noformat} > Improve the assignment plan when server aborted or creating tables, etc. > > > Key: HBASE-17159 > URL: https://issues.apache.org/jira/browse/HBASE-17159 > Project: HBase > Issue Type: New Feature > Components: Balancer, Region Assignment >Affects Versions: 2.0.0, 1.2.4 >Reporter: Charlie Qiangeng Xu > > When the master processes a dead server or creates a new table, the assignment > plan is generated by the balancer's roundRobinAssignment method. > Yet if these operations happen frequently, the cluster can fall out of balance > both on the table level and the server level. The balancer would then be triggered and may > cause a huge amount of region moves (this is what we observed). > Ideally, the assignment should be able to consider table- or cluster-level > balance as well as locality (for the case of a dead server). >
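The invariant quoted from the Javadoc above (after assignment every server holds floor(avg) or ceiling(avg) regions) can be demonstrated with a minimal round-robin sketch. This is purely illustrative and is not BaseLoadBalancer's actual roundRobinAssignment, which also handles retained assignments and other details.

```java
import java.util.*;

// Hedged sketch: deal regions out one at a time, cycling through servers,
// so every server ends with floor(avg) or ceil(avg) regions.
public class RoundRobinSketch {
  public static Map<String, List<String>> assign(List<String> regions, List<String> servers) {
    Map<String, List<String>> plan = new LinkedHashMap<>();
    for (String s : servers) plan.put(s, new ArrayList<>());
    for (int i = 0; i < regions.size(); i++) {
      plan.get(servers.get(i % servers.size())).add(regions.get(i));
    }
    return plan;
  }
}
```

The JIRA's point is that even though each such assignment individually respects this invariant, repeated dead-server processing or table creation can compound into table- and server-level imbalance that later forces a large balance run.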
[jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688729#comment-15688729 ] Charlie Qiangeng Xu commented on HBASE-17110: - Thanks for reviewing the patch, sir; fixed the warning and resubmitting. 
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Status: Patch Available (was: Open) 
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Status: Open (was: Patch Available) 
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Attachment: HBASE-17110-V5.patch
> Add an "Overall Strategy" option (balanced both on table level and server level) to SimpleLoadBalancer
> -
>
> Key: HBASE-17110
> URL: https://issues.apache.org/jira/browse/HBASE-17110
> Project: HBase
> Issue Type: New Feature
> Components: Balancer
> Affects Versions: 2.0.0, 1.2.4
> Reporter: Charlie Qiangeng Xu
> Assignee: Charlie Qiangeng Xu
> Attachments: HBASE-17110-V2.patch, HBASE-17110-V3.patch, HBASE-17110-V4.patch, HBASE-17110-V5.patch, HBASE-17110.patch
>
> This JIRA is about an enhancement of SimpleLoadBalancer. Here we introduce a new strategy, "bytableOverall", which can be enabled by adding:
> {noformat}
> <property>
>   <name>hbase.master.loadbalance.bytableOverall</name>
>   <value>true</value>
> </property>
> {noformat}
> We have been using the strategy on our largest cluster for several months. It has proven to be very helpful and stable; in particular, the result is quite visible to the users.
> Here is the reason why it's helpful:
> When operating large-scale clusters (our case), some companies still prefer to use {{SimpleLoadBalancer}} due to its simplicity, quick balance-plan generation, etc. The current SimpleLoadBalancer has two modes:
> 1. byTable, which only guarantees that the regions of one table are uniformly distributed.
> 2. byCluster, which ignores the distribution within tables and balances the regions all together.
> If the pressures on different tables are different, the first byTable option is the preferable one in most cases. Yet this choice sacrifices cluster-level balance and can cause some servers to have significantly higher load, e.g. 242 regions on server A but 417 regions on server B (real-world stats).
> Consider this case: a cluster has 3 tables and 4 servers:
> {noformat}
> server A has 3 regions: table1:1, table2:1, table3:1
> server B has 3 regions: table1:2, table2:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 0 regions.
> {noformat}
> From the byTable strategy's perspective, the cluster is already perfectly balanced at the table level. But a perfect state would look like:
> {noformat}
> server A has 2 regions: table2:1, table3:1
> server B has 2 regions: table1:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 2 regions: table1:1, table2:2
> {noformat}
> The server loads change from 3,3,3,0 to 2,2,3,2, while table1, table2 and table3 all stay balanced. This is what the new mode "byTableOverall" achieves.
> Two UTs have been added as well; the last one demonstrates the advantage of the new strategy.
> Also, an onConfigurationChange method has been implemented to hot-reload the "slop" variable.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
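The hot-reload of "slop" mentioned in the description can be sketched like this. This is an illustrative stand-in, not the patch itself: a plain key/value map stands in for org.apache.hadoop.conf.Configuration, and the clamping behavior is an assumption modeled on how SimpleLoadBalancer sanity-checks the `hbase.regions.slop` setting.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of hot-reloading "slop" via an
 *  onConfigurationChange hook; not the actual HBase patch. */
public class SlopReloadSketch {
    private volatile float slop = 0.2f; // hbase.regions.slop default

    // Called when the master's configuration is reloaded, so the new
    // slop takes effect without a restart.
    public void onConfigurationChange(Map<String, String> conf) {
        float s = Float.parseFloat(conf.getOrDefault("hbase.regions.slop", "0.2"));
        // clamp to [0, 1]; out-of-range slop values make no sense here
        slop = Math.max(0f, Math.min(1f, s));
    }

    public float getSlop() { return slop; }

    public static void main(String[] args) {
        SlopReloadSketch balancer = new SlopReloadSketch();
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.regions.slop", "0.05");
        balancer.onConfigurationChange(conf); // no master restart needed
        System.out.println(balancer.getSlop());
    }
}
```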
[jira] [Comment Edited] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15686529#comment-15686529 ] Charlie Qiangeng Xu edited comment on HBASE-17110 at 11/22/16 11:57 AM: Folks, thanks for sharing your ideas [~anoop.hbase], [~enis], [~carp84]. Just uploaded a V4 to make the overall strategy an enhancement of byTable, not another strategy. Besides, on Yu Li's point, I have opened another JIRA regarding the root cause of "huge amount of moves in Balance" and linked it to this JIRA.
was (Author: xharlie): uploaded a V4 to make the overall strategy an enhancement of byTable, not another strategy
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Status: Patch Available (was: Open)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Status: Open (was: Patch Available)
Uploaded a V4 to make the overall strategy an enhancement of byTable, not another strategy.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Attachment: HBASE-17110-V4.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-17159) Better the Assignment plan when server aborted or creating tables, etc.
Charlie Qiangeng Xu created HBASE-17159: --- Summary: Better the Assignment plan when server aborted or creating tables, etc. Key: HBASE-17159 URL: https://issues.apache.org/jira/browse/HBASE-17159 Project: HBase Issue Type: New Feature Components: Balancer, Region Assignment Affects Versions: 1.2.4, 2.0.0 Reporter: Charlie Qiangeng Xu
When the master processes a dead server or creates a new table, the assignment plan is generated by the balancer's roundRobinAssignment method. Yet if these operations happen often, the cluster can end up out of balance both at the table level and the server level. The balancer would then be triggered and may cause a huge amount of region moves (this is what we observed). Ideally, the assignment should take table- or cluster-level balance into account, as well as locality (for the dead-server case).
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
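One way to make the assignment plan balance-aware, as HBASE-17159 suggests, is to send each new region to the currently least-loaded server rather than iterating round-robin over the server list. The sketch below is a hypothetical illustration of that idea, not the HBase implementation; all names are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a balance-aware assignment that places each new
 *  region on the least-loaded server (not the HBase implementation). */
public class LeastLoadedAssignSketch {
    // currentLoad: server name -> region count (mutated as regions are placed);
    // returns the assignment plan: server name -> regions assigned to it.
    static Map<String, List<String>> assign(List<String> newRegions,
                                            Map<String, Integer> currentLoad) {
        Map<String, List<String>> plan = new HashMap<>();
        for (String region : newRegions) {
            // pick the server with the fewest regions right now
            String target = Collections.min(currentLoad.entrySet(),
                    Map.Entry.comparingByValue()).getKey();
            plan.computeIfAbsent(target, s -> new ArrayList<>()).add(region);
            currentLoad.merge(target, 1, Integer::sum);
        }
        return plan;
    }

    public static void main(String[] args) {
        // the 3,3,3,0 cluster from HBASE-17110's description
        Map<String, Integer> load = new HashMap<>(
                Map.of("A", 3, "B", 3, "C", 3, "D", 0));
        Map<String, List<String>> plan =
                assign(List.of("r1", "r2", "r3"), load);
        System.out.println(plan); // all three new regions land on server D
    }
}
```

With the 3,3,3,0 starting loads from the description, all three new regions go to server D, bringing the cluster to 3,3,3,3 instead of deepening the skew.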
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Status: Patch Available (was: Open)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Attachment: HBASE-17110-V3.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Attachment: (was: SimpleBalancerBytableOverall.V1)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15672733#comment-15672733 ] Charlie Qiangeng Xu commented on HBASE-17110: - Just uploaded to review board [~tedyu] and [~zghaobac]
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Attachment: HBASE-17110-V2.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15672653#comment-15672653 ] Charlie Qiangeng Xu commented on HBASE-17110: - Thank you for looking into the code [~tedyu] and [~anoop.hbase]. I've changed the strategy to be the default, and following [~tedyu]'s suggestion I also changed the format and replaced the variable initialnumRegions wherever possible. On second thought, I think using standard deviation would be somewhat redundant: "slop" already sets the threshold, so another criterion would over-complicate the control (hard for users, and it would need yet another conf setting unless hardcoded to 10). So I removed it as well. A new patch HBASE-17110-V2.patch has been uploaded :)
> Add an "Overall Strategy" option (balanced both on table level and server level) to SimpleLoadBalancer
> -
>
> Key: HBASE-17110
> URL: https://issues.apache.org/jira/browse/HBASE-17110
> Project: HBase
> Issue Type: New Feature
> Components: Balancer
> Affects Versions: 2.0.0, 1.2.4
> Reporter: Charlie Qiangeng Xu
> Assignee: Charlie Qiangeng Xu
> Attachments: HBASE-17110.patch, SimpleBalancerBytableOverall.V1
>
> This JIRA is about an enhancement of SimpleLoadBalancer. Here we introduce a new strategy, "bytableOverall", which can be enabled by adding:
> {noformat}
> <property>
>   <name>hbase.master.loadbalance.bytableOverall</name>
>   <value>true</value>
> </property>
> {noformat}
> We have been using this strategy on our largest cluster for several months. It has proven to be very helpful and stable, and the result is quite visible to the users.
> Here is why it helps: when operating large-scale clusters (our case), some companies still prefer {{SimpleLoadBalancer}} due to its simplicity, quick balance-plan generation, etc. The current SimpleLoadBalancer has two modes:
> 1. byTable, which only guarantees that the regions of one table are uniformly distributed.
> 2. byCluster, which ignores the distribution within tables and balances all the regions together.
> If the pressures on different tables differ, the byTable option is preferable in most cases. Yet this choice sacrifices cluster-level balance and can leave some servers with a significantly higher load, e.g. 242 regions on server A but 417 regions on server B (real-world stats).
> Consider a cluster with 3 tables and 4 servers:
> {noformat}
> server A has 3 regions: table1:1, table2:1, table3:1
> server B has 3 regions: table1:2, table2:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 0 regions.
> {noformat}
> From the byTable strategy's perspective, the cluster is already perfectly balanced at the table level. But a perfect status would be:
> {noformat}
> server A has 2 regions: table2:1, table3:1
> server B has 2 regions: table1:2, table3:2
> server C has 3 regions: table1:3, table2:3, table3:3
> server D has 2 regions: table1:1, table2:2
> {noformat}
> The server loads change from 3,3,3,0 to 2,2,3,2, while table1, table2 and table3 all stay balanced.
> This is what the new mode "byTableOverall" achieves.
> Two UTs have been added as well, and the last one demonstrates the advantage of the new strategy.
> Also, an onConfigurationChange method has been implemented to hot-reload the "slop" variable.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
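The table-level vs. cluster-level tension described above can be made concrete with a few lines of Java. This is a minimal, hypothetical sketch (the class and method names are ours, not HBase's): it sums per-table assignments into per-server totals, showing how every table can be perfectly balanced while the server-level spread stays large.

```java
import java.util.HashMap;
import java.util.Map;

public class OverallBalanceDemo {
    // Sum per-table region counts into per-server totals.
    // byTable maps tableName -> (serverName -> number of that table's regions there).
    public static Map<String, Integer> serverTotals(Map<String, Map<String, Integer>> byTable) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map<String, Integer> assignment : byTable.values()) {
            for (Map.Entry<String, Integer> e : assignment.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    // Server-level spread: max load minus min load across the given servers.
    // Servers absent from the totals map carry a load of 0.
    public static int spread(Map<String, Integer> totals, String... servers) {
        int max = Integer.MIN_VALUE;
        int min = Integer.MAX_VALUE;
        for (String s : servers) {
            int load = totals.getOrDefault(s, 0);
            max = Math.max(max, load);
            min = Math.min(min, load);
        }
        return max - min;
    }
}
```

In the example from the description, tables 1-3 each place one region on A, B and C; every table is perfectly balanced, yet the spread over servers A-D is 3 - 0 = 3, which is exactly what a byTableOverall-style strategy drives down.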
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Description: This jira is about an enhancement of simpleLoadBalancer. Here we introduce a new strategy: "bytableOverall" which could be controlled by adding: {noformat} hbase.master.loadbalance.bytableOverall true {noformat} We have been using the strategy on our largest cluster for several months. it's proven to be very helpful and stable, especially, the result is quite visible to the users. Here is the reason why it's helpful: When operating large scale clusters(our case), some companies still prefer to use {{SimpleLoadBalancer}} due to its simplicity, quick balance plan generation, etc. Current SimpleLoadBalancer has two modes: 1. byTable, which only guarantees that the regions of one table could be uniformly distributed. 2. byCluster, which ignores the distribution within tables and balance the regions all together. If the pressures on different tables are different, the first byTable option is the preferable one in most case. Yet, this choice sacrifice the cluster level balance and would cause some servers to have significantly higher load, e.g. 242 regions on server A but 417 regions on server B.(real world stats) Consider this case, a cluster has 3 tables and 4 servers: {noformat} server A has 3 regions: table1:1, table2:1, table3:1 server B has 3 regions: table1:2, table2:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 0 regions. {noformat} >From the byTable strategy's perspective, the cluster has already been >perfectly balanced on table level. 
But a perfect status should be like: {noformat} server A has 2 regions: table2:1, table3:1 server B has 2 regions: table1:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 2 regions: table1:1, table2:2 {noformat} We can see the server loads change from 3,3,3,0 to 2,2,3,2, while the table1, table2 and table3 still keep balanced. And this is what the new mode "byTableOverall" can achieve. Two UTs have been added as well and the last one demonstrates the advantage of the new strategy. Also, a onConfigurationChange method has been implemented to hot control the "slop" variable. was: This jira is about an enhancement of simpleLoadBalancer. Here we introduce a new strategy: "bytableOverall" which could be controlled by adding: {noformat} hbase.master.loadbalance.bytableOverall true {noformat} We have been using the strategy on our largest cluster for several months. it's proven to be very helpful and stable, especially, the result is quite visible to the users. Here is the reason why it's helpful: When operating large scale clusters(our case), some companies still prefer to use {{SimpleLoadBalancer}} due to its simplicity, quick balance plan generation, etc. Current SimpleLoadBalancer has two modes: 1. byTable, which only guarantees that the regions of one table could be uniformly distributed. 2. byCluster, which ignores the distribution within tables and balance the regions all together. If the pressures on different tables are different, the first byTable option is the preferable one in most case. Yet, this choice sacrifice the cluster level balance and would cause some servers to have significantly higher load, e.g. 242 regions on server A but 417 regions on server B.(real world stats) Consider this case, a cluster has 3 tables and 4 servers: {noformat} server A has 3 regions: table1:1, table2:1, table3:1 server B has 3 regions: table1:2, table2:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 0 regions. 
{noformat} From the byTable strategy's perspective, the cluster has already been perfectly balanced on table level. But a perfect status should be like: {noformat} server A has 2 regions: table2:1, table3:1 server B has 2 regions: table1:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 2 regions: table1:1, table2:2 {noformat} And this is what the new mode "byTableOverall" can achieve. Two UTs have been added as well and the last one demonstrates the advantage of the new strategy.
[jira] [Commented] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669931#comment-15669931 ] Charlie Qiangeng Xu commented on HBASE-17110: - Whoops, my bad, thank you for correcting that :)
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Attachment: SimpleBalancerBytableOverall.V1
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Description: This jira is about an enhancement of simpleLoadBalancer. Here we introduce a new strategy: "bytableOverall" which could be controlled by adding: hbase.master.loadbalance.bytableOverall true We have been using the strategy on our largest cluster for several months. it's proven to be very helpful and stable, especially, the result is quite visible to the users. Here is the reason why it's helpful: When operating large scale clusters(our case), some companies still prefer to use SimpleLoadBalancer due to its simplicity, quick balance plan generation, etc. Current SimpleLoadBalancer has two mode: 1. byTable, which only guarantees that the regions of one table could be uniformly distributed. 2. byCluster, which ignores the distribution within tables and balance the regions all together. If the pressures on different tables are different, the first byTable option is preferable one in most case. Yet, this choice sacrifice the cluster level balance and would cause some servers to have significantly higher load, e.g. 242 regions on server A but 417 regions on server B.(real world stats) Consider this case, a cluster has 3 tables and 4 servers: server A has 3 regions: table1:1, table2:1, table3:1 server B has 3 regions: table1:2, table2:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 0 regions. >From the byTable strategy's perspective, the cluster has already been >perfectly balanced on table level. But a perfect status should be like: server A has 2 regions: table2:1, table3:1 server B has 2 regions: table1:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 2 regions: table1:1, table2:2 And this is what the new mode "byTableOverall" can achieve. Two UTs have been added as well and the last one demonstrates the advantage of the new strategy. 
was: This jira is about an enhancement of simpleLoadBalancer. Here we introduce a new strategy: "bytableOverall" which could be controlled by adding: hbase.master.loadbalance.bytableOverall true We have been using the strategy on our largest cluster for several months. it's proven to be very helpful and stable, especially, the result is quite visible to the users. Here is the reason why it's helpful: When operating large scale clusters(our case), some companies still prefer to use SimpleLoadBalancer due to its simplicity, quick balance plan generation, etc. Current SimpleLoadBalancer has two mode: 1. byTable, which only guarantees that the regions of one table could be uniformly distributed. 2. byCluster, which ignores the distribution within tables and balance the regions all together. If the pressures on different tables are different, the first byTable option is preferable one in most case. Yet, this choice sacrifice the cluster level balance and would cause some servers to have significantly higher load, e.g. 242 regions on server A but 417 regions on server B.(real case stats) Consider this case, a cluster has 3 tables and 4 servers: server A has 3 regions: table1:1, table2:1, table3:1 server B has 3 regions: table1:2, table2:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 0 regions. >From the byTable strategy's perspective, the cluster has already been >perfectly balanced on table level. But a perfect status should be like: server A has 2 regions: table2:1, table3:1 server B has 2 regions: table1:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 2 regions: table1:1, table2:2 And this is what the new mode "byTableOverall" can achieve. Two UTs have been added as well and the last one demonstrates the advantage of the new strategy. 
[jira] [Updated] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17110: Description: This jira is about an enhancement of simpleLoadBalancer. Here we introduce a new strategy: "bytableOverall" which could be controlled by adding: hbase.master.loadbalance.bytableOverall true We have been using the strategy on our largest cluster for several months. it's proven to be very helpful and stable, especially, the result is quite visible to the users. Here is the reason why it's helpful: When operating large scale clusters(our case), some companies still prefer to use SimpleLoadBalancer due to its simplicity, quick balance plan generation, etc. Current SimpleLoadBalancer has two mode: 1. byTable, which only guarantees that the regions of one table could be uniformly distributed. 2. byCluster, which ignores the distribution within tables and balance the regions all together. If the pressures on different tables are different, the first byTable option is preferable one in most case. Yet, this choice sacrifice the cluster level balance and would cause some servers to have significantly higher load, e.g. 242 regions on server A but 417 regions on server B.(real case stats) Consider this case, a cluster has 3 tables and 4 servers: server A has 3 regions: table1:1, table2:1, table3:1 server B has 3 regions: table1:2, table2:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 0 regions. >From the byTable strategy's perspective, the cluster has already been >perfectly balanced on table level. But a perfect status should be like: server A has 2 regions: table2:1, table3:1 server B has 2 regions: table1:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 2 regions: table1:1, table2:2 And this is what the new mode "byTableOverall" can achieve. Two UTs have been added as well and the last one demonstrates the advantage of the new strategy. 
was: This jira is about an enhancement of simpleLoadBalancer. Here we introduce a new strategy: "bytableOverall" which could be controlled by adding: hbase.master.loadbalance.bytableOverall true We have been using the strategy on our largest cluster for several months. it's proven to be very helpful and stable, especially, the result is quite visible to the users. Here is the reason why it's helpful: When operating large scale clusters(our case), some companies still prefer to use SimpleLoadBalancer due to its simplicity, quick balance plan generation, etc. Current SimpleLoadBalancer has two mode: 1. byTable, which only guarantees that the regions of one table could be uniformly distributed. 2. byCluster, which ignores the distribution within tables and balance the regions all together. If the pressures on different tables are different, the first byTable option is preferable one in most case. Yet, this choice sacrifice the cluster level balance and would cause some servers to have significantly higher load, e.g. 240 regions on server A but 410 regions on server B. Consider this case, a cluster has 3 tables and 4 servers: server A has 3 regions: table1:1, table2:1, table3:1 server B has 3 regions: table1:2, table2:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 0 regions. >From the byTable strategy's perspective, the cluster has already been >perfectly balanced on table level. But a perfect status should be like: server A has 2 regions: table2:1, table3:1 server B has 2 regions: table1:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 2 regions: table1:1, table2:2 And this is what the new mode "byTableOverall" can achieve. Two UTs have been added as well and the last one demonstrates the advantage of the new strategy. 
[jira] [Created] (HBASE-17110) Add an "Overall Strategy" option(balanced both on table level and server level) to SimpleLoadBalancer
Charlie Qiangeng Xu created HBASE-17110: --- Summary: Add an "Overall Strategy" option (balanced both on table level and server level) to SimpleLoadBalancer Key: HBASE-17110 URL: https://issues.apache.org/jira/browse/HBASE-17110 Project: HBase Issue Type: New Feature Components: Balancer Affects Versions: 1.2.4, 2.0.0 Reporter: Charlie Qiangeng Xu Assignee: Charlie Qiangeng Xu
This JIRA is about an enhancement of SimpleLoadBalancer. Here we introduce a new strategy, "bytableOverall", which can be enabled by setting the configuration property hbase.master.loadbalance.bytableOverall to true. We have been using the strategy on our largest cluster for several months. It has proven to be very helpful and stable, and the result is quite visible to the users. Here is why it helps: when operating large-scale clusters (our case), some companies still prefer SimpleLoadBalancer due to its simplicity, quick balance-plan generation, etc. The current SimpleLoadBalancer has two modes: 1. byTable, which only guarantees that the regions of one table are uniformly distributed. 2. byCluster, which ignores the distribution within tables and balances all the regions together. If the pressures on different tables differ, the byTable option is preferable in most cases. Yet this choice sacrifices cluster-level balance and can leave some servers with a significantly higher load, e.g. 240 regions on server A but 410 regions on server B. Consider a cluster with 3 tables and 4 servers: server A has 3 regions: table1:1, table2:1, table3:1 server B has 3 regions: table1:2, table2:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 0 regions. From the byTable strategy's perspective, the cluster has already been perfectly balanced on table level. But a perfect status should be like: server A has 2 regions: table2:1, table3:1 server B has 2 regions: table1:2, table3:2 server C has 3 regions: table1:3, table2:3, table3:3 server D has 2 regions: table1:1, table2:2 And this is what the new mode "byTableOverall" can achieve. Two UTs have been added as well, and the last one demonstrates the advantage of the new strategy.
[jira] [Commented] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649678#comment-15649678 ] Charlie Qiangeng Xu commented on HBASE-17039: - Hi Enis, the reason we don't use the Stochastic one is that we used to find regions being moved erratically, and the conf parameters that control the StochasticLB are a bit hard to fine-tune (the chances to experiment back and forth on a large-scale production cluster are limited). Another factor in the decision is that the SimpleBalancer can guarantee table-level region balance, which is crucial in our use case. Those are old conclusions though and may be outdated, since StochasticLB has been updated by several patches after we observed those disadvantages. We will definitely retry StochasticLB after these busy months and give you feedback after that :)
> SimpleLoadBalancer schedules large amount of invalid region moves
> -
>
> Key: HBASE-17039
> URL: https://issues.apache.org/jira/browse/HBASE-17039
> Project: HBase
> Issue Type: Bug
> Components: Balancer
> Affects Versions: 2.0.0, 1.2.3, 1.1.7
> Reporter: Charlie Qiangeng Xu
> Assignee: Charlie Qiangeng Xu
> Attachments: HBASE-17039.patch
>
> After increasing one of our clusters to 1600 nodes, we observed a large amount of invalid region moves (more than 30k) fired by the balance chore. We simulated the problem and printed out the balance plan, only to find that many servers holding two regions of a certain table (we use the by-table strategy) sent both regions out to two other servers that had zero regions.
> In the SimpleLoadBalancer's balanceCluster function, the code block that determines the underLoadedServers might have a problem:
> {code}
> if (load >= min && load > 0) {
>   continue; // look for other servers which haven't reached min
> }
> int regionsToPut = min - load;
> if (regionsToPut == 0) {
>   regionsToPut = 1;
> }
> {code}
> If min is zero, a server with a load of zero, which equals min, would be marked as underloaded, which causes the phenomenon mentioned above. Since we increased the cluster's size to 1600+, many tables that only have 1000 regions now encounter this issue.
> By fixing it up, the balance plan went back to normal.
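The effect of the fix can be sketched in isolation. This is a hedged simplification, not the actual patch: a hypothetical helper that mirrors the underloaded-server check, where dropping the `load > 0` clause means a server whose load already equals min (including min == 0) is simply skipped, so no phantom one-region moves are scheduled onto empty servers.

```java
public class UnderloadCheck {
    // Regions a server should receive to reach min; 0 means "not underloaded".
    // The buggy version read: if (load >= min && load > 0) continue;
    // which let an empty server (load == min == 0) fall through, and
    // min - load == 0 was then bumped to 1, scheduling a pointless move.
    public static int regionsToPut(int load, int min) {
        if (load >= min) {
            return 0; // already at or above min: not underloaded
        }
        return min - load;
    }
}
```

With min == 0 (a table with fewer regions than servers), an empty server now correctly gets 0 regions to receive instead of 1, which is what eliminated the ~30k invalid moves described above.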
[jira] [Comment Edited] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15644764#comment-15644764 ] Charlie Qiangeng Xu edited comment on HBASE-17039 at 11/7/16 5:21 PM: -- Just skimmed through the historical changes for this part; I found that the code causing the problem now can be attributed to HBASE-7060. The issue described in that JIRA has been handled nicely by other parts of the current SimpleLoadBalancer logic, so the code block mentioned above is no longer necessary, yet problematic. [~yuzhih...@gmail.com], it seems you were involved in that JIRA, any interest in taking a look at this one?
was (Author: xharlie): Just skimmed through the historical changes for this part, I found the code causing problem right now could be attributed to HBASE-7060. The problem mentioned in that Jira has been handled nicely by other part of current balancer logic, yet the code block aforementioned would only cause problem right now. [~yuzhih...@gmail.com], it seems you were involved in that JIRA, any interest to take a look at this one?
[jira] [Commented] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15644764#comment-15644764 ] Charlie Qiangeng Xu commented on HBASE-17039: - Just skimmed through the historical changes for this part, I found the code causing problem right now could be attributed to HBASE-7060. The problem mentioned in that Jira has been handled nicely by other part of current balancer logic, yet the code block aforementioned would only cause problem right now. [~yuzhih...@gmail.com], it seems you were involved in that JIRA, any interest to take a look at this one?
[jira] [Commented] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15643194#comment-15643194 ] Charlie Qiangeng Xu commented on HBASE-17039: - Just uploaded the patch for 2.0 :)
[jira] [Updated] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17039: Attachment: HBASE-17039.patch
[jira] [Updated] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17039: Attachment: (was: HBASE-17039_V1.patch)
[jira] [Updated] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17039: Attachment: HBASE-17039_V1.patch
[jira] [Updated] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17039: Description: (appended "By fixing it up, the balance plan went back to normal." to the issue description quoted above)
[jira] [Updated] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17039: Description: (grammar fixes to the issue description quoted above)
[jira] [Updated] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
[ https://issues.apache.org/jira/browse/HBASE-17039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-17039: Description: (corrected the observed move count from "more than 3 thousand" to "more than 30k", plus wording fixes)
[jira] [Created] (HBASE-17039) SimpleLoadBalancer schedules large amount of invalid region moves
Charlie Qiangeng Xu created HBASE-17039:
---
Summary: SimpleLoadBalancer schedules large amount of invalid region moves
Key: HBASE-17039
URL: https://issues.apache.org/jira/browse/HBASE-17039
Project: HBase
Issue Type: Bug
Components: Balancer
Affects Versions: 1.2.3, 1.1.6, 2.0.0
Reporter: Charlie Qiangeng Xu
Assignee: Charlie Qiangeng Xu
Fix For: 2.0.0, 1.2.3, 1.1.6
[jira] [Commented] (HBASE-16626) User customized RegionScanner from 1.X is incompatible with 2.0.0's off-heap part
[ https://issues.apache.org/jira/browse/HBASE-16626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500183#comment-15500183 ] Charlie Qiangeng Xu commented on HBASE-16626: - First patch in the community! Thank you Anoop and all the others, you have been very supportive and responsive :)
> User customized RegionScanner from 1.X is incompatible with 2.0.0's off-heap part
> ---
>
> Key: HBASE-16626
> URL: https://issues.apache.org/jira/browse/HBASE-16626
> Project: HBase
> Issue Type: Sub-task
> Components: Offheaping
> Affects Versions: 2.0.0
> Reporter: Charlie Qiangeng Xu
> Assignee: Charlie Qiangeng Xu
> Fix For: 2.0.0
>
> Attachments: HBASE-16626-v1.patch
>
> With 2.0.0's off-heap feature, the RegionScanner interface extends a new
> interface, Shipper, which contains a "shipped()" method.
> In our case, some application users defined customized scanners that
> implement the RegionScanner interface in 1.X.
> After we backported the off-heap feature of 2.0.0,
> RegionScannerShippedCallBack throws a "java.lang.AbstractMethodError" when
> executing scanner.shipped(), because the customized scanners don't override
> the shipped() method.
> Instead of forcing every user to add an empty implementation (if they don't
> really need to scan the file, or the RS doesn't use the L2 cache, they
> don't need to do anything in shipped()), adding a default shipped() method
> to the RegionScanner interface might be a better way.
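The proposed fix relies on Java 8 default methods. The sketch below uses simplified stand-in interfaces (illustrative only, not the real HBase types, which carry many more methods): a class that never overrides shipped() resolves the call to the default body at runtime, which is what saves a 1.x-era scanner that predates Shipper from throwing AbstractMethodError.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the real HBase interfaces (illustrative only).
interface Shipper {
    void shipped();
}

interface RegionScanner extends Shipper {
    // Java 8 default method: scanners that hold no block references
    // can simply inherit this no-op instead of overriding it.
    @Override
    default void shipped() {
        // no-op: nothing to release back to the block cache
    }

    boolean next(List<String> results);
}

// A "1.x-style" scanner: implements next() but knows nothing about shipped().
class LegacyScanner implements RegionScanner {
    @Override
    public boolean next(List<String> results) {
        results.add("row-1");
        return false; // no more rows
    }
}

public class DefaultShippedDemo {
    public static void main(String[] args) {
        RegionScanner scanner = new LegacyScanner();
        List<String> rows = new ArrayList<>();
        scanner.next(rows);
        scanner.shipped(); // resolves to the default method; no AbstractMethodError
        System.out.println(rows); // [row-1]
    }
}
```

Placing the default on RegionScanner rather than on Shipper keeps the blast radius narrow: implementors of other Shipper subtypes still get a compile-time reminder to release cached blocks, which is the concern also raised in this thread.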
[jira] [Updated] (HBASE-16626) User customized RegionScanner from 1.X is incompatible with 2.0.0's off-heap part
[ https://issues.apache.org/jira/browse/HBASE-16626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-16626: Attachment: HBASE-16626-v1.patch (attach v1)
[jira] [Commented] (HBASE-16626) User customized RegionScanner from 1.X is incompatible with 2.0.0's off-heap part
[ https://issues.apache.org/jira/browse/HBASE-16626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489260#comment-15489260 ] Charlie Qiangeng Xu commented on HBASE-16626: - Thank you [~tedyu] for adding me! I just added a default method with an empty implementation in the RegionScanner interface. Or should I add it to the Shipper interface instead, so that the patch covers more sub-scanner interfaces? One concern: if it is added to Shipper, a user extending e.g. KeyValueScanner (which also extends Shipper) might forget to implement their shipped() function to invoke returnBlockToCache. Any thoughts, [~anoop.hbase]?
[jira] [Updated] (HBASE-16626) User customized RegionScanner from 1.X is incompatible with 2.0.0's off-heap part
[ https://issues.apache.org/jira/browse/HBASE-16626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-16626: Description: (added the note that the customized scanner didn't override the shipped method, and dropped the trailing HBASE-15624 / Java 8 remark)
[jira] [Updated] (HBASE-16626) User customized RegionScanner from 1.X is incompatible with 2.0.0's off-heap part
[ https://issues.apache.org/jira/browse/HBASE-16626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-16626: Summary: User customized RegionScanner from 1.X is incompatible with 2.0.0's off-heap part (was: User customized RegionScanner from 1.X is incompatible in 2.0.0's off heap part)
[jira] [Updated] (HBASE-16626) User customized RegionScanner from 1.X is incompatible in 2.0.0's off heap part
[ https://issues.apache.org/jira/browse/HBASE-16626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-16626: Description: (expanded the explanation of the AbstractMethodError and of the proposed default shipped() method)
[jira] [Updated] (HBASE-16626) User customized RegionScanner from 1.X is incompatible in 2.0.0's off heap part
[ https://issues.apache.org/jira/browse/HBASE-16626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charlie Qiangeng Xu updated HBASE-16626: Remaining Estimate: (was: 48h) Original Estimate: (was: 48h)
[jira] [Created] (HBASE-16626) User customized RegionScanner from 1.X is incompatible in 2.0.0's off heap part
Charlie Qiangeng Xu created HBASE-16626:
---
Summary: User customized RegionScanner from 1.X is incompatible in 2.0.0's off heap part
Key: HBASE-16626
URL: https://issues.apache.org/jira/browse/HBASE-16626
Project: HBase
Issue Type: Sub-task
Components: Offheaping
Affects Versions: 1.1.6, 1.2.2
Reporter: Charlie Qiangeng Xu
Fix For: 2.0.0