[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293190#comment-16293190
 ] 

Hudson commented on HBASE-18946:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4231 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4231/])
HBASE-18946 Stochastic load balancer assigns replica regions to the same 
(stack: rev 010012cbcb3064b78b9e184a2808bbd26ea80903)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/EnableTableProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/AbstractTestDLS.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
* (edit) 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionReplicasWithRestartScenarios.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/ZKNamespaceManager.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestAssignmentManager.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestRegionsOnMasterOptions.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionMergeTransactionOnCluster.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenInitializing.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CloneSnapshotProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* (add) 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/snapshot/TestAssignProcedure.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java


> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.master.005.patch, 
> HBASE-18946.master.006.patch, HBASE-18946.master.007.patch, 
> HBASE-18946.master.008.patch, HBASE-18946.master.009.patch, 
> HBASE-18946.master.010.patch, HBASE-18946.master.011.patch, 
> HBASE-18946.master.012.patch, HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292173#comment-16292173
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
8s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 9 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
56s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
13s{color} | {color:red} hbase-server: The patch generated 10 new + 735 
unchanged - 9 fixed = 745 total (was 744) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
30s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
19m  6s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0-beta1. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
0s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 94m 
26s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}137m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12902218/HBASE-18946.master.012.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 1884365e82e3 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / deba43b156 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292062#comment-16292062
 ] 

stack commented on HBASE-18946:
---

Failure was TestMasterFailover. It failed to write its xml because it timed out 
twice. Looking that the test, it tries to set zk node states and move meta 
regions off master, neither of which makes sense in AMv2. I refactored the 
TestMasterFailover test that does nonesense.

I see other timeouts though it looks like most other tests just pass. Here is 
what I see in console:

TestDLSFSHLog
TestStochasticLoadBalancer
TestReplicationZKNodeCleaner
TestLogsCleaner

These all pass locally w/o issue EXCEPT TestDLSFSHLog. It looks sick, stuck. 
Digging, indeed, its the fault of this patch. We try to keep sending state 
change messages to master as long as we can only the thread is not daemon so it 
keeps the RS up. Ugh! Fixed.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.master.005.patch, 
> HBASE-18946.master.006.patch, HBASE-18946.master.007.patch, 
> HBASE-18946.master.008.patch, HBASE-18946.master.009.patch, 
> HBASE-18946.master.010.patch, HBASE-18946.master.011.patch, 
> HBASE-18946.patch, HBASE-18946.patch, HBASE-18946_2.patch, 
> HBASE-18946_2.patch, HBASE-18946_simple_7.patch, HBASE-18946_simple_8.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291928#comment-16291928
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
47s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
12s{color} | {color:red} hbase-server: The patch generated 9 new + 721 
unchanged - 7 fixed = 730 total (was 728) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
24s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
19m  0s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0-beta1. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
2s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 12s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12902177/HBASE-18946.master.011.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 0678ddafa22b 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 6ab8ce9829 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291801#comment-16291801
 ] 

stack commented on HBASE-18946:
---

Another interesting test failure. I can't get it to fail locally which is a 
bummer. Looking at the test, for the case where all servers carry Regions -- 
i.e. Master too -- I see that we were not running the balancer because we had 
RIT. Let me fix that in latest version of patch.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.master.005.patch, 
> HBASE-18946.master.006.patch, HBASE-18946.master.007.patch, 
> HBASE-18946.master.008.patch, HBASE-18946.master.009.patch, 
> HBASE-18946.master.010.patch, HBASE-18946.master.011.patch, 
> HBASE-18946.patch, HBASE-18946.patch, HBASE-18946_2.patch, 
> HBASE-18946_2.patch, HBASE-18946_simple_7.patch, HBASE-18946_simple_8.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291737#comment-16291737
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
51s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
14s{color} | {color:red} hbase-server: The patch generated 9 new + 704 
unchanged - 7 fixed = 713 total (was 711) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
27s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
19m 13s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0-beta1. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
1s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}107m 44s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.balancer.TestRegionsOnMasterOptions |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12902145/HBASE-18946.master.010.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux f2560ea0a1ec 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / 2c9ef8a471 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291475#comment-16291475
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 9s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m 
50s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
37s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 37s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
59s{color} | {color:red} hbase-server: The patch generated 9 new + 704 
unchanged - 7 fixed = 713 total (was 711) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  2m 
53s{color} | {color:red} patch has 15 errors when building our shaded 
downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  4m 
24s{color} | {color:red} The patch causes 16 errors with Hadoop v2.6.5. {color} 
|
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  5m 
57s{color} | {color:red} The patch causes 16 errors with Hadoop v2.7.4. {color} 
|
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  7m 
33s{color} | {color:red} The patch causes 16 errors with Hadoop v3.0.0-beta1. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
54s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 35s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12902133/HBASE-18946.master.009.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 88cfccd6cadc 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291400#comment-16291400
 ] 

stack commented on HBASE-18946:
---

.009 disables the failing TestRSKilledWhenInitializing (See HBASE-19515). Its a 
medium test. Lets see what other goodies this patch turns up.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.master.005.patch, 
> HBASE-18946.master.006.patch, HBASE-18946.master.007.patch, 
> HBASE-18946.master.008.patch, HBASE-18946.master.009.patch, 
> HBASE-18946.patch, HBASE-18946.patch, HBASE-18946_2.patch, 
> HBASE-18946_2.patch, HBASE-18946_simple_7.patch, HBASE-18946_simple_8.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291393#comment-16291393
 ] 

stack commented on HBASE-18946:
---

The test failure is a really good one. We've exposed a latent problem in assign 
supposedly fixed years ago in 0.96 with HBASE-9593. See HBASE-19515 for detail. 
For now going to disable this failing test (fails sometimes but the test needs 
to be beefed up as it is a little sloppy verifying supposed fix).

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.master.005.patch, 
> HBASE-18946.master.006.patch, HBASE-18946.master.007.patch, 
> HBASE-18946.master.008.patch, HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290511#comment-16290511
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
8s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
1s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
53s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
10s{color} | {color:red} hbase-server: The patch generated 9 new + 569 
unchanged - 6 fixed = 578 total (was 575) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
25s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 53s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0-beta1. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
3s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 36s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m 59s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestRSKilledWhenInitializing |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12902000/HBASE-18946.master.008.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 10bb8ecca696 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / ba5f9ac380 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290373#comment-16290373
 ] 

stack commented on HBASE-18946:
---

.008 Adds in a bunch of doc around how retention works, when to use round robin 
assignment creation vs old-style. Fixes some other minor findings. The  new 
bulk comes from port of doc done over in the now-shattered HBASE-19501. Here is 
commit message.

Added new bulk assign createRoundRobinAssignProcedure to complement
the existing createAssignProcedure. The former asks the balancer for
target servers to set into the created AssignProcedures. The latter
sets no target server into AssignProcedure. When no target server
is specified, we make effort at assign-time at trying to deploy the
region to its old location if there was one.

The new round robin assign procedure creator does not do this. Use
the new round robin method on table create or reenabling offline
regions. Use the old assign in ServerCrashProcedure or in
EnableTable so there is a chance we retain locality.

Bulk preassigning passing all to-be-assigned to the balancer in one
go is good for ensuring good distribution especially when read
replicas in the mix.

The old assign was single-assign scoped so region replicas could
end up on the same server.

M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
 Cleanup around forceNewPlan. Was confusing.
 Added a Comparator to sort AssignProcedures so meta and system tables
 come ahead of user-space tables.

M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Remove the forceNewPlan argument on createAssignProcedure. Didn't make
 sense given we were creating a new AssignProcedure; the arg had no
 effect.

 (createRoundRobinAssignProcedures) Recast to feed all regions to the 
balancer in
 bulk and to sort the return so meta and system tables take precedence.

Miscellaneous fixes including keeping the Master around until all
RegionServers are down, documentation on how assignment retention
works, etc.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.master.005.patch, 
> HBASE-18946.master.006.patch, HBASE-18946.master.007.patch, 
> HBASE-18946.master.008.patch, HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290263#comment-16290263
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
8s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
45s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
6s{color} | {color:red} hbase-server: The patch generated 4 new + 229 unchanged 
- 7 fixed = 233 total (was 236) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
23s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
19m  1s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0-beta1. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
58s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}124m 41s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 32s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestRSKilledWhenInitializing |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12901964/HBASE-18946.master.007.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux ad7f3aa297cc 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 104afd74a6 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290123#comment-16290123
 ] 

stack commented on HBASE-18946:
---

TestRestartCluster#testRetainAssignmentOnRestart is broken because when we 
create AssignProcedures now, we populate them w/ target servers. Doing this 
blows our retaining old locations (if no target specified, when AssignProcedure 
runs, it will try to use the new form of the old location -- which is how we 
get our retention). Over in HBASE-19501 we noticed that ServerCrashProcedure 
depends on this and made a version of assign procedures that works for SCP. Let 
me do the same here by reinstituting the old bulk assign method for SCP to use.

This seems to have fixed all tests. Let me post new patch to be sure.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.master.005.patch, 
> HBASE-18946.master.006.patch, HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289840#comment-16289840
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
49s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
5s{color} | {color:red} hbase-server: The patch generated 3 new + 149 unchanged 
- 6 fixed = 152 total (was 155) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
24s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 53s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0-beta1. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
59s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 35s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 34s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestRestartCluster |
|   | hadoop.hbase.regionserver.TestRSKilledWhenInitializing |
|   | hadoop.hbase.regionserver.TestRegionReplicasAreDistributed |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12901923/HBASE-18946.master.006.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux f73947df98b8 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289668#comment-16289668
 ] 

stack commented on HBASE-18946:
---

[~ram_krish] You +1 if tests pass? Thanks sir.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.master.005.patch, 
> HBASE-18946.master.006.patch, HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289628#comment-16289628
 ] 

stack commented on HBASE-18946:
---

I included the async wal setting by mistake. Fixing. Need a +1 here please.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.master.005.patch, 
> HBASE-18946.patch, HBASE-18946.patch, HBASE-18946_2.patch, 
> HBASE-18946_2.patch, HBASE-18946_simple_7.patch, HBASE-18946_simple_8.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288977#comment-16288977
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
58s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
7s{color} | {color:red} hbase-server: The patch generated 5 new + 155 unchanged 
- 6 fixed = 160 total (was 161) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
40s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
53m 40s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
5s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}129m 51s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}208m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestRestartCluster |
|   | hadoop.hbase.wal.TestWALOpenAfterDNRollingStart |
|   | hadoop.hbase.regionserver.TestRSKilledWhenInitializing |
|   | hadoop.hbase.regionserver.TestRegionReplicasAreDistributed |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12901827/HBASE-18946.master.005.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 8da561481ac1 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288970#comment-16288970
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
34s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
5s{color} | {color:red} hbase-server: The patch generated 5 new + 155 unchanged 
- 6 fixed = 160 total (was 161) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
34s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
54m 59s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
35s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}130m 55s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}214m 43s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.regionserver.TestWalAndCompactingMemStoreFlush |
|   | hadoop.hbase.wal.TestWALOpenAfterDNRollingStart |
|   | hadoop.hbase.regionserver.TestRegionReplicasAreDistributed |
|   | hadoop.hbase.master.TestRestartCluster |
|   | hadoop.hbase.regionserver.wal.TestLogRolling |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12901826/HBASE-18946.master.004.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux b04190b07c38 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 
15:49:21 UTC 2017 x86_64 GNU/Linux 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288937#comment-16288937
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
8s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
47s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
5s{color} | {color:red} hbase-server: The patch generated 4 new + 155 unchanged 
- 6 fixed = 159 total (was 161) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
53m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
0s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}129m  0s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}206m 55s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestHRegion |
|   | hadoop.hbase.master.TestRestartCluster |
|   | hadoop.hbase.wal.TestWALOpenAfterDNRollingStart |
|   | hadoop.hbase.regionserver.TestRegionReplicasAreDistributed |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12901825/HBASE-18946.master.003.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 52e30776008f 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288754#comment-16288754
 ] 

stack commented on HBASE-18946:
---

.004 fixes tests and adds in [~ram_krish] 's test.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.master.003.patch, 
> HBASE-18946.master.004.patch, HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288183#comment-16288183
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
29s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
4s{color} | {color:red} hbase-server: The patch generated 8 new + 155 unchanged 
- 6 fixed = 163 total (was 161) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
23s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
50m 43s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 15s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}123m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.coprocessor.TestCoprocessorShortCircuitRPC |
|   | hadoop.hbase.regionserver.TestHRegionFileSystem |
|   | hadoop.hbase.master.locking.TestLockManager |
|   | hadoop.hbase.master.locking.TestLockProcedure |
|   | hadoop.hbase.procedure.TestProcedureManager |
|   | hadoop.hbase.master.balancer.TestRegionLocationFinder |
|   | hadoop.hbase.ipc.TestNettyRpcServer |
|   | hadoop.hbase.TestCheckTestClasses |
|   | hadoop.hbase.client.TestClientClusterStatus |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12901730/HBASE-18946.master.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 7552cc1b172a 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 11467ef111 |
| maven | version: Apache Maven 3.5.2 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287968#comment-16287968
 ] 

stack commented on HBASE-18946:
---

.002 should be good to go. Trying it against qa.

Changed core of AM#createAssignProcedure so we pass list of Regions to
assign to the balancer en masse, in one lump. Let the balancer figure
what to do with the fat assign.

We get back a Map of servers to regions. We then transform that into
an array of AssignProcedures to pass to the Assign executor. We sort
the array so that meta and system tables are passed to the executor
first (and so replicas are clumped together...).

Internally the AM executor may divvy up the work into queues but
all will be pre-assigned so we should have good distribution
(round-robin) regardless of how the queue is processed.

M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignProcedure.java
 Cleanup around forceNewPlan. Was confusing.
 Added a Comparator to sort AssignProcedures so meta and system tables
 come ahead of user-space tables.

M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
 Remove the forceNewPlan argument on createAssignProcedure. Didn't make
 sense given we were creating a new AssignProcedure; the arg had no
 effect.

 (createAssignProcedures) Recast to feed all regions to the balancer in
 bulk and to sort the return so meta and system tables take precedence.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, 
> HBASE-18946.master.002.patch, HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287181#comment-16287181
 ] 

stack commented on HBASE-18946:
---

.001 is NOT done yet. Posting WIP. Builds on [~ram_krish] findings and 
explorations. Added method to AM to bulk create AssignProcedures. Passes bulk 
to balancer to get placement. Some cleanup of a dodgy force assign flag. Will 
doc better tomorrow. Just posting what I have at mo.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.master.001.patch, HBASE-18946.patch, 
> HBASE-18946.patch, HBASE-18946_2.patch, HBASE-18946_2.patch, 
> HBASE-18946_simple_7.patch, HBASE-18946_simple_8.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-06 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280070#comment-16280070
 ] 

Anoop Sam John commented on HBASE-18946:


No no.. When the LB round robin is called with all replicas passed together. 
Till then we had only one region as part of the procedure.  So RS been selected 
for diff replicas together in one shot.  And then the sub procedures for each 
of the replica regions and that will NOT do the previous steps like contacting 
the LB.  

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-06 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280054#comment-16280054
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


I can get what you are saying. 
Adding to META is not an issue, we can do that any time. We can create 
regioninfos from the default.
Sub procs can create new sub procs - true. But my point is this 
Say first default region create a sub proc, that in turn created one sub proc 
for rpelica 1. Now you call assign for that replica 1. This assign wil call 
roundrobin in LB. LB will just assign to RS 1. Now again the parent sub proc 
wil run - ie the default reigon. Here again round robin in LB will be called. 
It will again assign RS 1 because it does not know that replica 1 is in RS 1. 
I will spend some more time in this - to see if I come up with any other 
alternate way. In case am wrong in the above stmts.
[~saint@gmail.com]
You can have a go too. Sorry I missed replying to your earlier comment if you 
can try this out. No problem.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-06 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280011#comment-16280011
 ] 

Anoop Sam John commented on HBASE-18946:


Ram.. What I say is at 1st steps, there wont be 3 regions at all (consider 3 
replica count and we say abt one region)..  Till the LB is been contacted. 
There only the replicas are been generated and asked for servers.  Now this 
single region processing itself is a sub procedure.  Now we have 3 regions 
under this and so have to generate 3 sub procedures.  My Q was whether that is 
possible. A sub procedure under another (in the f/w level) and Stack says yes.. 
The issue of grouping comes as we did the replica regions create at 1st.  What 
comes always to my mind is can we NOT do that then but later.  May be am 
totally wrong as I did not do code study fully. Just putting up an 
option/Question

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278997#comment-16278997
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.Yes same thing I ask/suggest from long time.
this is what am trying to do from patch 1. And saying the same. Group the 
replicas.
bq.Can we again make sub procedures for each of the replica regions under a 
region sub procedure?
Not possible currently because AM is independent of what it is processing and 
same is with LB. what ever procs you create it will just go as individual 
regions to AM and AM will just process if it has something in queue. 
So before putting it to AM group it. Either with procs or some queue.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278988#comment-16278988
 ] 

stack commented on HBASE-18946:
---

bq. Can we again make sub procedures for each of the replica regions under a 
region sub procedure?

A subprocedure of a subprocedure? Yes, you can do that.

Let me try and take a look at this informed by Ram's work above See if I 
can help.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-05 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278985#comment-16278985
 ] 

Anoop Sam John commented on HBASE-18946:


bq.Can we give the replicas as a group to the LB and have it queue them as a 
group? I
Yes same thing I ask/suggest from long time.. As discussed with Ram, we have 
some issue in there with the usage of Procedure way.  But my call there would 
be , if this need a change in the Procedure f/w on the sub procedure ways or 
so, lets do that.
Creating the assign individual procedures for each of the replicas at 1st step 
and then ask LB is the issue maker. In assign procedure, first we add entry 
to META.  What if there we dont have the replica regions(objects) been created 
by that time?  I think it may be ok as we dont have row per replica region in 
META. It is just one row only for a region and as per the replicas some new 
columns (cf:Q) comes in only.   Then while contacting LB, we can create the 
replica regions and pass it as a List to get the servers for regions.  Already 
the LB APIs allow to pass a list of regions so as to assign servers for each.   
Now the Q is already each of the region assign is a sub procedure within the 
Table create procedure.   Can we again make sub procedures for each of the 
replica regions under a region sub procedure?  [~stack]..   Not knowing the 
internals of Procedure f/w that well ...  But will this direction help?

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278975#comment-16278975
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.and we do NOT want global coordination because we want the assignment queues 
to run w/o need of coordination so they run fast – seems to be root problem?
Yes that is the problem. and pls note that none of the LB is responsible for 
this. If you pass all the replica regions as a group it will work fine becuae 
it will just do a pure round robin and nothing based on the actual replica 
knowledge.


> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278969#comment-16278969
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.Can we give the replicas as a group to the LB and have it queue them as a 
group? I've not looked. But replicas being split over assignment queues w/o a 
global coordination –
Yes. If you see the first patch that is what I try to do  but that adds some 
state. 

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278916#comment-16278916
 ] 

stack commented on HBASE-18946:
---

Thanks [~anoop.hbase] for the intervention. I was thinking this progress and 
though I understood what is happening here but plain now that I did not.

So back to square one.

Can we give the replicas as a group to the LB and have it queue them as a 
group? I've not looked. But replicas being split over assignment queues w/o a 
global coordination --
 and we do NOT want global coordination because we want the assignment queues 
to run w/o need of coordination so they run fast -- seems to be root problem?

Will I have a go at it [~ram_krish]?

Thanks boss.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278462#comment-16278462
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


That is what I was saying now we do the round robin decision even before. But 
one thing to note is that since now the flow will go with retainAssignment it 
will still go with retainAssignment what ever be the LB configured.
So stochastic LB case this will work as per the title of the JIRA.
I think even before this patch if we have some other LB configured it will 
still not work as expected for replicas (and we already  know that 
FavouredNodeBalancer) does not take care of replicas for sure. I don't mean 
this patch is solving all case and that is why I am hesitant to commit this 
stil because it is not asking the LB to pick the servers. The patch _2 is 
slightly better but again I think other Region balancers will not be able to 
understand replica. 
All replicas have been working fine only during actual balancing and not during 
initial assignment.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-05 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278364#comment-16278364
 ] 

Anoop Sam John commented on HBASE-18946:


If the region replica placing not working as expected (no 2 replicas on same 
server) with RS grouping balancer way, that is a bug with RS grouping , not 
considering the replica feature.But for placing the regions ,while creating 
the table, NOT contacting the LB will be another problem.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-05 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16278359#comment-16278359
 ] 

Anoop Sam John commented on HBASE-18946:


Before the patch, the regions are created and the AssignProcedure for them. 
There is no target server in the procedure and we will contact the LB 
(RoundRobin or RetainAssign calls).   Now say when a table is created, then 
itself the target servers for each of the regions are selected NOT contacting 
the LB but much earlier by AM. So the LB is not been asked at all.. What if the 
cluster is having the RS group feature and using that LB?  There is all chance 
that the regions go to some servers where the admin dont want this table to go! 
 I think we can not do this way.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277768#comment-16277768
 ] 

stack commented on HBASE-18946:
---

Ok. Got it. +1 for commit. Thanks Ram.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277175#comment-16277175
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.The patch is still an advance because you've moved the problem of assign to 
AM which is where it belongs; that the implementation has problems is something 
we can work on ...
Yes that was my concern. My early patch HBASE-18946_2 was actually working on a 
completed assigned set of regions. That is why I was not looking at moving the 
selection of servers to the Create table proc stage.
bq.Say some more on this please.
I was saying the same thing as you stated above advancing the target server 
selection to the Create table stage and not internal to AM.
bq.Should it be other way around?
May be am missing your intent. The idea is that if tableDescriptor has 
replication then create Assign procs for the entire set of regions (including 
replicas) in a round robin mode with the available list of servers. If there 
are no replicas just go with the existing way. 

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277107#comment-16277107
 ] 

stack commented on HBASE-18946:
---

So, looking at v8 again in light of your comment, won't we end up putting 
region replicas on the same servers each time we assign?

The patch is still an advance because you've moved the problem of assign to AM 
which is where it belongs; that the implementation has problems is something we 
can work on ...

bq. I said this because now we set the target server so the round robin logic 
is not with the AM. Generally AM decides if the assignment shouldbe round robin 
or retain assignment. So here it is always retain assignment. 

Say some more on this please.

Is this right?

110   if (tableDescriptor.getRegionReplication() > 1) {
111 addChildProcedure(
112   
env.getAssignmentManager().createRoundRobinAssignProcedures(newRegions));
113   } else {
114 
addChildProcedure(env.getAssignmentManager().createAssignProcedures(newRegions));
115   }

Should it be other way around?

I still think this patch really nice.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-12-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16276483#comment-16276483
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.But the problem is no longer we call roundrobinAssignment from LB instead we 
will call retainAssignment() only. This happens only when replicas are created
I said this because now we set the target server so the round robin logic is 
not with the AM. Generally AM decides if the assignment shouldbe round robin or 
retain assignment. So here it is always retain assignment. 
As I said LB knows about replica and it does the replica assignment pretty well 
but that is when it has the entire cluster available. During the course of 
assignment it cannot make a judgement on it. I don't find any other simple way 
to make LB aware of it unless some where we need a grouping of the regions to 
be assigned. So are you ok with patch for commit [~saint@gmail.com]? 

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272220#comment-16272220
 ] 

stack commented on HBASE-18946:
---

This last patch is very nice. Clean.

What you mean by this sir: "But the problem is no longer we call 
roundrobinAssignment from LB instead we will call retainAssignment() only. This 
happens only when replicas are created." Do we need to add more  knowledge of 
replicas to the LB?

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, HBASE-18946_simple_7.patch, 
> HBASE-18946_simple_8.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16266747#comment-16266747
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
33s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
5s{color} | {color:red} hbase-server: The patch generated 4 new + 31 unchanged 
- 0 fixed = 35 total (was 31) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 0s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
54m  7s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
40s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 12s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}180m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12899364/HBASE-18946_simple_8.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux aa6b75705164 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
 |
| git revision | master / f521000d78 |
| maven | version: Apache Maven 3.5.2 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16266740#comment-16266740
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
50s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
35s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
8s{color} | {color:red} hbase-server: The patch generated 2 new + 31 unchanged 
- 0 fixed = 33 total (was 31) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 2s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
54m 24s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
39s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 99m 
43s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}181m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12899363/HBASE-18946_simple_7.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux d6cff5ddaba4 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / f521000d78 |
| maven | version: 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-23 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264225#comment-16264225
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.Man. We want round-robin but with caveats.. "Round-robin except..."
I think you are worried about this part. But I see this way. 
We are still doing round robin.
1) Assign primary replicas (round robin)
2) assign sec replicas ( again round robin just avoid primary replica servers).
3) assign tertiary replicas (again round robin but avoid servers in first two 
steps).
and so on. ..
So you think why not do a round robin with the entire set of regions. That is 
true round robin. But for that some where we need to hold up the regions and 
bulk it up before going for Assign procs. 
Otherwise another thing we can do is that,
Add a API in AM that will do tentative round robin with the available servers 
and regions. And in the CreateTableProc before creating the AssignProcs add the 
target server also to it. So that we go ahead and assign accordingly only. 
bq.Could we pass the AM the new table regions and ask it to return us plans to 
use assigning?
Reading this comment of yours -you suggest the same I believe? Can we do that 
only for tables with replicas?

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-22 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263832#comment-16263832
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.Looking in the patch, are we doing assign placement inside in 
CreateTableProcedure still?
No we are not doing assign placement in create table proc. We are just 
seperating out the regions so that primary are assigned and then replicas. 
There is no state maintained in LB or in the Assign proces like in first patch.
bq. Could we pass the AM the new table regions and ask it to return us plans to 
use assigning?
I think that is what we are doing now in this patch right - we pass the regions 
and get the right server for them ensuring replicas don't sit together.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263317#comment-16263317
 ] 

stack commented on HBASE-18946:
---

Man.  We want round-robin but with caveats.. "Round-robin except..."

Ok w/ the above points yeah, not sure how it changes patch.

Looking in the patch, are we doing assign placement inside in 
CreateTableProcedure still? Could we pass the AM the new table regions and ask 
it to return us plans to use assigning?

I like your speculation on how region replicas will pile up on rolling restart. 
Yeah, need to make sure we do the right thing.

Thanks.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-22 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262960#comment-16262960
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


[~saint@gmail.com]
Before I update the patch you ok with the above points ?

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261986#comment-16261986
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


Thanks for the detailed review.
bq.We only do this when it a region with replicas or do we do it always (would 
be good if former, we want assignment to run fast).
Yes only for the replica regions.
bq.Please remind me what is the rule for replica assign? Just that they need to 
be on different servers? Nothing about ordering? (Hmm... seems like replica has 
to go out first). How does the patch to the balancer ensure this ordering?
Our initial requirement is that replicas for sure should be in different 
servers if there are enough number of servers. Ordering is not of importance. 
Coming to the balancer, in our code base only StochasticLB knows about replicas 
while actually balancing the cluster. We have tried FavoredStocasticLB and it 
does not know about replicas and infact messes with the replica assignment 
itself (by corrupting the META entries for replicas). That is a big change 
which we need to do later. We have confirmed this with [~enis] also offline 
some time back.
Also as in said in previous comment balancer does not come into picture while 
doing round robin assignment of a new table reigons. It just tries to do round 
robin based on available servers. 
bq.is there a hole where you can't see an ongoing Assigment? It has been 
queue'd and is being worked on but but you have no means of querying where a 
region is being assigned
Yes exactly. We don know about it. It not only applies for replica regions any 
new create table regions has the same issue. The assignment queued just uses 
the current regions in the queue to do the assignments.  But for those regions 
it is ok we don't mind how they are distributed but for replicas it is very 
important. when we have enough servers if the replicas are not distributed then 
we don server the purpose of replicas. If the servers are less than the 
replicas then it is ok to assign the replicas to the same RS. In future we are 
planning to even avoid this and fail the assignments itself.
bq.If round robin, are we not moving through the list of servers? Is the issue 
only when cluster is small – three servers or so?
Hope you mean before this patch right? We are moving through the list of 
servers but all the regions (including replicas) do not go to the assignment 
queue together. So what ever is getting processed from the assignment queue 
there it does round robin but the next set of regions that is processed again 
does round robin and we end up in same RS.
bq.On patch, don't renumber protobuf fields.
Oh yes. I did that so that the steps are in order. Will change it and will try 
to remove some duplicate code.
bq.If NOT isDefaultReplica and NOT replicaAvailable, we just fall through?
Yes. If it is a normal region we just go with the old code only and if the 
replica is not avaliable in the existing code there is way to assign all such 
region that don't find a suitable server to some servers randomly. Which is 
fine for us too because replicas are more than the available number of servers.
Actually there is more to do with AM and replicas. We know the issues but not 
yet ready with patches. Like on a rolling restart like case the AM will keep 
moving the replicas to RS that are running. So finally when the last one is 
closed all the region would have moved there and META will only have that 
entry. Now when new RS are started it will try to do retain assignment and 
again replica regions may get colocated and only a balancer can solve it. We 
need to see how best we can do in these cases. But all that later (out of scope 
here).


> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261768#comment-16261768
 ] 

stack commented on HBASE-18946:
---

bq. While doing roundrobinAssignment contact the AM to know the current state 
of replica regions and choose a server accordingly. 

We only do this when it a region with replicas or do we do it always (would be 
good if former, we want assignment to run fast).

Yeah, if round robin, its round robin (smile).

Please remind me what is the rule for replica assign? Just that they need to be 
on different servers? Nothing about ordering? (Hmm... seems like replica has to 
go out first). How does the patch to the balancer ensure this ordering?

is there a hole where you can't see an ongoing Assigment? It has been queue'd 
and is being worked on but but you have no means of querying where a region is 
being assigned (i.e. we are about to assign a replica and we want to avoid 
assigning to the same location as where we just assigned?).

If round robin, are we not moving through the list of servers? Is the issue 
only when cluster is small -- three servers or so?


On patch, don't renumber protobuf fields.

What is happening here (BTW, repeats code):
{code}
1263List serverRegions =
1264assignments.computeIfAbsent(serverName, k -> new 
ArrayList<>());
1265if (!RegionReplicaUtil.isDefaultReplica(region)) {
1266  if (!replicaAvailable(region, serverName)) {
1267assignRegionToServer(cluster, serverName, serverRegions, 
region);
1268serverIdx = (j + serverIdx + 1) % numServers;
1269assigned = true;
1270break;
1271  }
1272} else if (!cluster.wouldLowerAvailability(region, serverName)) 
{
1273  assignRegionToServer(cluster, serverName, serverRegions, 
region);
1274  serverIdx = (j + serverIdx + 1) % numServers; // remain from 
next server
...
{code}

If NOT isDefaultReplica and NOT replicaAvailable, we just fall through?


Good stuff.




> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259586#comment-16259586
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


Even I too felt intially should we check for all the replicas equivalent to 
tableDescriptor.getReplicas but I think with the current logic that is not 
needed and also we really don know what is the max replicas configured at the 
AM level.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259583#comment-16259583
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


Thanks for the review. 
Since now the CreateTable assigns each replica as a batch the purpose of the 
code is to check if the previous replicas have been assigned and where is the 
location of that replica.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259496#comment-16259496
 ] 

Ted Yu commented on HBASE-18946:


{code}
+int replicaId = info.getReplicaId();
+for (int i = 0; i < replicaId; i++) {
{code}
replicaId may not be the max number of replicas 
(tableDescriptor.getRegionReplication()). What's the purpose of the for loop ?





> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259357#comment-16259357
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
33s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
6s{color} | {color:red} hbase-server: The patch generated 1 new + 61 unchanged 
- 0 fixed = 62 total (was 61) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
57s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
53m 29s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
28s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 23s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}181m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898474/HBASE-18946_2.patch |
| Optional Tests |  asflicense  cc  unit  hbaseprotoc  javac  javadoc  findbugs 
 shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 4204523348a1 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259356#comment-16259356
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
8s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
 2s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
3s{color} | {color:red} hbase-server: The patch generated 3 new + 61 unchanged 
- 0 fixed = 64 total (was 61) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
50s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
51m  5s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
28s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 96m 
37s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}170m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898476/HBASE-18946_2.patch |
| Optional Tests |  asflicense  cc  unit  hbaseprotoc  javac  javadoc  findbugs 
 shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 069931356f1e 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259175#comment-16259175
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


BTW let me check one flow - like are we able to assign if the number of servers 
are less than the number of replicas after ths patch. Will be back here.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259171#comment-16259171
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.Now I am very sure the same change has to be done in ServerCrashProcedure to 
enable the failed tests becuase there also we just go with round robin and the 
flow will not contact the LB.
I mean this issue - https://issues.apache.org/jira/browse/HBASE-19268


> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-11-17 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256922#comment-16256922
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


Some updates here
-> In roundRobinAssignment the LB does not even care for the replicas and its 
assignment. So we need to make it aware of it. It knows that the region has a 
replica but it does not check if same replica is being assigned only balancer 
knows about it.

-> Say we want to assign replica 3 then we have to ensure that the primary and 
the secondary replica are already assigned and from that determine a plan. 
So to do that we may have to enable the cost functions or we should add some 
new methods to make LB in roundrobin to be aware of replicas. 
-> Next is that the way we do in the current patch attached here actually 
solved the issue because for the balancer it got all the regions (incluidng 
replicas) and just applied the round robin part on it. 
Will be back here.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-24 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216905#comment-16216905
 ] 

Anoop Sam John commented on HBASE-18946:


bq.When the Balancer is passed a List, could it look at the list first to find 
replicas and group?
Similar thing I was also asking above.
bq. Should we make it with all the replica regions go in one request? While 
reading the code on region replica, that time itself this Q came to me. Then 
fat logic for not putting 2 regions of same replica in one RS can be avoided. 
Not sue how easy/difficult it is. Just saying
Right now to the LB we pass each region one by one and ask it to find a RS for 
assign.  Instead will it be possible to pass all replicas of a given region 
together in a List and make the LB to find RS for each of it?

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216381#comment-16216381
 ] 

stack commented on HBASE-18946:
---

The batching at RPC level is our new 'bulk assign' mechanism. It is bulking per 
Server though so i suppose this is no good to you. Can the Balancer learn about 
Replicas? Seems like a good thing for it to know about. Could it have a plugin 
that did anti-affinity?

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216366#comment-16216366
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.When the Balancer is passed a List, could it look at the list first to find 
replicas and group?
It does. But I don't know if it is able to do it across two RPC for the same 
table. Will check the balancer logic and see how it can be improved/fixed.
bq.The AssignProcedure is about assigning a single Procedure, nothing else. If 
we start bulking it up with other concerns, we'll be back to the fuzzy AMv1 
story.
Yes. I am also skeptical about this change and that is the reason why did not 
go forward with this patch. I also doubt other issues.
bq.See in RPC where it is batching requests.
Will check. But what I see in AM waitOnAssignQueue() we collect the batched 
regions and go with the assignment. Balancer does know about region replicas 
but I fear it is still assuming the earlier bulk assign logic. Will be back.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216359#comment-16216359
 ] 

stack commented on HBASE-18946:
---

bq. But still in these cases the bulking mechanism is not a logical bulking 
instead it depends on the timed wait and the size of the queue. 

See in RPC where it is batching requests.

But if you want discernment regards where Regions go on a cluster, thats the 
Balancer's job. It has all the sources to pull on. Can't it tell members of a 
ReadReplica set? Can't it do lookup to see where the other replicas are out on 
the cluster before it makes a plan for current replia? The AssignProcedure is 
about assigning a single Procedure, nothing else. If we start bulking it up 
with other concerns, we'll be back to the fuzzy AMv1 story.

When the Balancer is passed a List, could it look at the list first to find 
replicas and group?





> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-23 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216347#comment-16216347
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


[~saint@gmail.com] 
Thanks for the comment. Yes in a way I agree that fixing it in Balancer is 
best. But still in these cases the bulking mechanism is not a logical bulking 
instead it depends on the timed wait and the size of the queue. 
So the balancer may not really know what has been balanced by the time the next 
bulked set of region comes in. Any suggestions?
I can still check if it is possible to make balancer aware of this. But this 
mechanism solves some more issues in other related areas since we know the set 
of regions to be balanced at one shot.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216340#comment-16216340
 ] 

stack commented on HBASE-18946:
---

We already have a bulking mechanism below AssignProcedure in the RPC.  Why 
ain't we making this change in the balancer. It knows all. Doing it in the 
AssignProcedure makes it carry state when we've been doing our best to make it 
a simple machine.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212288#comment-16212288
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq. As with some of regions with known location, there are new ones which needs 
to be assigned, these new ones could be assigned to the same region server 
which hosts the primary or other replica region.
Yes agree with you. So i need to fix the issue with EnableTableHandler first. 
bq.The previous logic is that when the first region is queued, it starts to 
wait assignDispatchWaitMillis to start the real work. With the patch, the whole 
batch is added at once, it skipped the addFirstOne logic.
Will read and understand your comment and will be back here.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-19 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211811#comment-16211811
 ] 

huaxiang sun commented on HBASE-18946:
--

For the case with increased replica count, even with  HBASE-19017, it needs the 
same logic. As with some of regions with known location, there are new ones 
which needs to be assigned, these new ones could be assigned to the same region 
server which hosts the primary or other replica region. wdyt? [~ram_krish]

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-19 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211449#comment-16211449
 ] 

huaxiang sun commented on HBASE-18946:
--

Thanks [~ram_krish]. One possible slowdown here with the approach is that if 
queueAll() queues more than assignDispatchWaitQueueMaxSize regions, with the 
current logic, it still needs to wait a bit, please see

https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java#L1639.

The previous logic is that when the first region is queued, it starts to wait 
assignDispatchWaitMillis to start the real work. With the patch, the whole 
batch is added at once, it skipped the addFirstOne logic. I think it can be 
changed to avoid this case.

{code}
  private HashMap waitOnAssignQueue() {
HashMap regions = null;

assignQueueLock.lock();
try {
  if (pendingAssignQueue.isEmpty() && isRunning()) {
assignQueueFullCond.await();
  }

  if (!isRunning()) return null;
  +if (pendingAssignQueue.size() < assignDispatchWaitQueueMaxSize) {
  +  assignQueueFullCond.await(assignDispatchWaitMillis, 
TimeUnit.MILLISECONDS);
  +}
  -assignQueueFullCond.await(assignDispatchWaitMillis, 
TimeUnit.MILLISECONDS);
  regions = new HashMap(pendingAssignQueue.size());
  for (RegionStateNode regionNode: pendingAssignQueue) {
regions.put(regionNode.getRegionInfo(), regionNode);
  }
  pendingAssignQueue.clear();
} catch (InterruptedException e) {
  LOG.warn("got interrupted ", e);
  Thread.currentThread().interrupt();
} finally {
  assignQueueLock.unlock();
}
return regions;
  }

{code}

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210767#comment-16210767
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.Do we need to apply the same collection logic to EnableTableProcedure (to 
address the TODO in the comment)? 
Yes it could be. But with HBASE-19017 it may not be needed because we know 
where the assignment would go.
But before I commit this I would like to spend some more time to understand if 
there could be any other issues with how the procedure pool executes and we 
could starve other threads from adding to the assignQueue and only the other 
threads keep retrying which I think happened when I tried doing the same for 
enableTablePRocedure. But HBASE-19017 we are safe. 
And thanks a lot for the review [~huaxiang].

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-17 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208547#comment-16208547
 ] 

huaxiang sun commented on HBASE-18946:
--

Do we need to apply the same collection logic to EnableTableProcedure (to 
address the TODO in the comment)? Otherwise, +1.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-16 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207050#comment-16207050
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


Since HBASE-19017 is resolved I think this issue is better now. Will wait for 
reviews and mean while will check if the patch can cause other issues. 
[~huaxiang] - Thanks for your time.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-16 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206684#comment-16206684
 ] 

huaxiang sun commented on HBASE-18946:
--

Sorry, [~ramkrishna], busy with something else. going through the changes now 
and will post the update, thanks.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-16 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16205746#comment-16205746
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


bq.Better assert that the RegionStateNode's added are replica of the same 
region.
Ok will add.
I think test case failures are because of timeout issue. Will try out again.
[~huaxiang]
Can you have a look and provide some suggestion/feedback?
I think if we solve HBASE-19017 then for enable we don't need this type of bulk 
assignment. 

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203706#comment-16203706
 ] 

Ted Yu commented on HBASE-18946:


{code}
47public synchronized void addRegion(RegionStateNode node) {
{code}
Better assert that the RegionStateNode's added are replica of the same region.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203655#comment-16203655
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 
15s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
59s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
47m  0s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 28s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | 
org.apache.hadoop.hbase.client.TestScanWithoutFetchingData |
|   | org.apache.hadoop.hbase.regionserver.wal.TestSecureWALReplay |
|   | org.apache.hadoop.hbase.master.TestMasterMetricsWrapper |
|   | org.apache.hadoop.hbase.master.procedure.TestDisableTableProcedure |
|   | org.apache.hadoop.hbase.regionserver.TestRowTooBig |
|   | org.apache.hadoop.hbase.regionserver.wal.TestAsyncWALReplay |
|   | org.apache.hadoop.hbase.regionserver.TestSplitLogWorker |
|   | org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedure |
|   | org.apache.hadoop.hbase.master.procedure.TestModifyTableProcedure |
|   | org.apache.hadoop.hbase.master.procedure.TestDeleteTableProcedure |
|   | org.apache.hadoop.hbase.master.procedure.TestEnableTableProcedure |
|   | org.apache.hadoop.hbase.master.procedure.TestCreateTableProcedure |
|   | org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence |
|   | org.apache.hadoop.hbase.coprocessor.TestHTableWrapper |
|   | 

[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203454#comment-16203454
 ] 

Hadoop QA commented on HBASE-18946:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 
18s{color} | {color:red} Docker failed to build yetus/hbase:5d60123. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18946 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12892052/HBASE-18946.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9098/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.



> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-12 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201852#comment-16201852
 ] 

Anoop Sam John commented on HBASE-18946:


Should we make it with all the replica regions go in one request?  While 
reading the code on region replica, that time itself this Q came to me.  Then 
fat logic for not putting 2 regions of same replica in one RS can be avoided.  
Not sue how easy/difficult it is.  Just saying

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-12 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201809#comment-16201809
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


Checked in other branches (branch -1.x). There we have a bulk Assigner in AM. 
So all the regions to be created while doing CreateTable are collected and 
passed to the balancer along with the available servers. So we get a plan that 
is truly distributing the replicas. We don't have similar way in trunk as every 
AssignProcedure is individual and it just adds to the 'pendingAssignQueue' one 
by one.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-10 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198281#comment-16198281
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


One way to fix is that for CreateTableProcedure atleast we should not split the 
assignments - we should ensure that all the regions are added as a unit and the 
assignment should proceed. 

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-10 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198278#comment-16198278
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


I now got the real issue why this happens. Need to check in other branches 
which is not PRocV2.
Suppose we create a table with 20 regions and say replica as 3. Now we create 
60 Assign procedures. Now all these assign procedures are added to a queue 
'pendingAssignQueue' in AM. 
There is a thread that executes the assignment of the regions in this queue. 
So when ever all the 60 regions are added to this queue and the assignment 
thread assigns them we have no problem. The Stochastic LB uses the replica 
concept to ensure the regions are assigned properly. The Cluster and Cost 
functions created per 'roundRobinAssignment' ensures that happens.
But when the multi threaded model executes differently like the 60 regions are 
executed with 45 and 15 regions each then we end up in this issue every time. 
Because the roundRobinAssignment Cluster creation is not global and it is per 
assignment. 

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-09 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198177#comment-16198177
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


[~huaxiang]
I tried your fix by adding the default regions to the beginning of the list 
passed to AM. But still this issue seems to occur. I had a doubt with the LB 
and how the region replica awareness is passed on to the LB. Am reading that 
code to understand. will be back here.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194151#comment-16194151
 ] 

ramkrishna.s.vasudevan commented on HBASE-18946:


[~huaxiang]
Thanks for the update. Great that you already feel this change will solve the 
problem. I was checking this issue but got pulled to something else. Will be 
back here and to check your suggestion. And yes we need to check all related 
areas and also other related branches also.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-05 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193404#comment-16193404
 ] 

huaxiang sun commented on HBASE-18946:
--

By the way, we need the fix to be backported to branch-1 as well, I checked the 
branch-1.2, the logic is same as current master branch.

> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

2017-10-05 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193395#comment-16193395
 ] 

huaxiang sun commented on HBASE-18946:
--

Thanks [~ram_krish] for the finding! I checked the code, I think it is caused 
by the fact replica regions are added first, then all default regions are added 
at the end of the list.

https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RegionReplicaUtil.java#L187

For example, 3 RSes and two regions with 3 replicas.

{code}
regionId-replicaId
0-10-2   1-11-2   0-0   1-0

R0   R1  R2
0-1 0-2   1-1
1-2 0-0   1-0
{code}

We can see that for R1 and R2, replicas for same regions are assigned to the 
same RS.

If the logic can be changed a bit as follows, it can fix this issue. Other 
places need to be checked as well.
{code}

  public static List addReplicas(final TableDescriptor 
tableDescriptor,
  final List regions, int oldReplicaCount, int newReplicaCount) 
{
if ((newReplicaCount - 1) <= 0) {
  return regions;
}
List hRegionInfos = new ArrayList<>((newReplicaCount) * 
regions.size());
for (int i = 0; i < regions.size(); i++) {
  if (RegionReplicaUtil.isDefaultReplica(regions.get(i))) {
// region level replica index starts from 0. So if oldReplicaCount was 
2 then the max replicaId for
// the existing regions would be 1
hRegionInfos.add(regions.get(i));
for (int j = oldReplicaCount; j < newReplicaCount; j++) {
  
hRegionInfos.add(RegionReplicaUtil.getRegionInfoForReplica(regions.get(i), j));
}
  }
}
   // hRegionInfos.addAll(regions);
{code}


> Stochastic load balancer assigns replica regions to the same RS
> ---
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replica and its assignment I can see that some times the 
> default LB Stocahstic load balancer assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When a RS goes down then the replicas being assigned to same RS is 
> acceptable but the case when we have enough RS to assign this behaviour is 
> undesirable and does not solve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)