[jira] [Updated] (HBASE-21378) [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21378:
-
Attachment: (was: 0001-add-a-skip-option.patch)

> [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when 
> master is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
> Issue Type: Bug
> Components: hbase-operator-tools, hbck2
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Priority: Major
> Attachments: 0001-HBASE-21378-hbck2-checkHBCKSupport-blocks-assigning-.patch
>
>
> I encountered a scenario in which hbase:namespace was not online:
> {code}
> 2018-10-24,14:38:16,910 WARN org.apache.hadoop.hbase.master.HMaster: hbase:namespace,,1529933109115.7e0801c8232b2dc15face54532056076. is NOT online; state={7e0801c8232b2dc15face54532056076 state=OPEN, ts=1540363033384, server=c4-hadoop-tst-st30.bj,29100,1540348649479}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
> {code}
> I then tried to assign it manually, but the attempt threw a PleaseHoldException:
> {code}
> Wed Oct 24 15:26:52 CST 2018, RpcRetryingCaller{globalStartTime=1540365754487, pause=200, maxAttempts=16}, org.apache.hadoop.hbase.PleaseHoldException: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:144)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3133)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3125)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterMetrics(HBaseAdmin.java:2161)
>   at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:98)
>   at org.apache.hbase.HBCK2.run(HBCK2.java:364)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hbase.HBCK2.main(HBCK2.java:447)
> Caused by: org.apache.hadoop.hbase.PleaseHoldException: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:361)
>   at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:349)
>   at org.apache.hadoop.hbase.client.MasterCallable.call(MasterCallable.java:101)
>   at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
> {code}
> I then checked the code and found that the cause is checkHBCKSupport(). I assigned
> hbase:namespace successfully 

[jira] [Updated] (HBASE-21378) [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21378:
-
Attachment: 0001-HBASE-21378-hbck2-checkHBCKSupport-blocks-assigning-.patch

> [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when 
> master is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
> Issue Type: Bug
> Components: hbase-operator-tools, hbck2
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Priority: Major
> Attachments: 0001-HBASE-21378-hbck2-checkHBCKSupport-blocks-assigning-.patch
>
>

[jira] [Updated] (HBASE-21378) [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21378:
-
Attachment: (was: 
0001-HBASE-21378-checkHBCKSupport-blocks-assigning-hbase-.patch)

> [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when 
> master is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
> Issue Type: Bug
> Components: hbase-operator-tools, hbck2
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Priority: Major
> Attachments: 0001-add-a-skip-option.patch
>
>

[jira] [Updated] (HBASE-21378) [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21378:
-
Attachment: 0001-add-a-skip-option.patch

> [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when 
> master is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
> Issue Type: Bug
> Components: hbase-operator-tools, hbck2
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Priority: Major
> Attachments: 0001-add-a-skip-option.patch
>
>
> After checking the code, I found that the cause is checkHBCKSupport(). I assigned 
> hbase:namespace successfully by skipping this check. Thus I think the tool 
> 
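
The workaround described above (skipping the support check so that an assign can go through while the master is still initializing) can be sketched as follows. This is a minimal illustration, not the HBCK2 code or the attached patch: the class `SkippableCheck`, the method `run`, and the `skipCheck` flag are hypothetical names, and the failing check is simulated with an unchecked exception.

```java
/** Hypothetical sketch, not HBCK2 code: run a command with an optional pre-check. */
final class SkippableCheck {
  private SkippableCheck() {}

  /**
   * Runs supportCheck first unless skipCheck is set, then runs the command.
   * If the check throws, the command is never reached.
   */
  static void run(boolean skipCheck, Runnable supportCheck, Runnable command) {
    if (!skipCheck) {
      supportCheck.run();
    }
    command.run();
  }

  public static void main(String[] args) {
    // Simulates a check that cannot succeed while the master is initializing.
    Runnable failingCheck = () -> {
      throw new IllegalStateException("Master is initializing");
    };
    Runnable assign = () -> System.out.println("assign submitted");

    // With the skip flag set, the command runs even though the check would fail.
    run(true, failingCheck, assign);
  }
}
```

The trade-off in such a design is that skipping the check also skips whatever protection it provided, so the flag is best left off by default.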

[jira] [Commented] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665922#comment-16665922
 ] 

Hadoop QA commented on HBASE-21191:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m  7s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 53s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 22s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 11s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 12s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 30s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 16s{color} | {color:green} hbase-server: The patch generated 0 new + 397 unchanged - 5 fixed = 397 total (was 402) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 55s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  8m 10s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}115m 42s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}150m 51s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21191 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945831/HBASE-21191.branch-2.0.001.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux be2071208471 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2.0 / a3b2686114 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14881/testReport/ |
| Max. process+thread count | 4198 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14881/console |
| Powered by | Apache 

[jira] [Commented] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere

2018-10-26 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665915#comment-16665915
 ] 

Guanghao Zhang commented on HBASE-21325:


[~Apache9] Any more comments?

And ping [~stack] for branch-2.0.

> Force to terminate regionserver when abort hang in somewhere
> 
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
> Issue Type: Improvement
> Reporter: Duo Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Attachments: HBASE-21325.master.001.patch, HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transition the remote cluster 
> to DA while the local cluster is still in A, the region server hangs on 
> shutdown. Because the fsOk flag only tests the local cluster (which is 
> reasonable), we enter waitOnAllRegionsToClose, and since the WAL is broken 
> (the remote WAL directory is gone), closing the regions never succeeds. This 
> leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should put an upper bound on the wait time in the 
> waitOnAllRegionsToClose method.
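
The proposal above, an upper bound on the wait, can be sketched as a polling loop with a deadline. This is only an illustration and not the HBase implementation or the attached patch: the class `BoundedWait`, the method `waitUntil`, and the timeout and poll values are hypothetical, and the "regions closed" condition is simulated by a supplier that never becomes true.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not HBase code: polls a condition but gives up at a deadline. */
final class BoundedWait {
  private BoundedWait() {}

  /**
   * Returns true if the condition became true before timeoutMs elapsed.
   * Returns false if the deadline passed first or the thread was interrupted.
   */
  static boolean waitUntil(BooleanSupplier done, long timeoutMs, long pollMs) {
    final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!done.getAsBoolean()) {
      long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
      if (remainingMs <= 0) {
        return false; // upper bound reached; the caller decides how to terminate
      }
      try {
        // Never sleep past the deadline.
        Thread.sleep(Math.min(pollMs, remainingMs));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Simulates regions that never close, for example because the WAL is broken.
    boolean closed = waitUntil(() -> false, 200, 50);
    System.out.println(closed ? "all regions closed" : "gave up waiting for regions to close");
  }
}
```

Returning a boolean instead of throwing leaves the decision about what to do after the timeout (for example, forcing termination) to the caller.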



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21365) Throw exception when user put data with skip wal to a table which may be replicated

2018-10-26 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665914#comment-16665914
 ] 

Guanghao Zhang commented on HBASE-21365:


[~xucang] This is an incompatible change: the behavior differs from the old 
implementation, so it should not be ported to branch-2 or branch-1.

> Throw exception when user put data with skip wal to a table which may be 
> replicated
> ---
>
> Key: HBASE-21365
> URL: https://issues.apache.org/jira/browse/HBASE-21365
> Project: HBase
> Issue Type: Improvement
> Components: Client
> Affects Versions: 3.0.0
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21365.master.001.patch, HBASE-21365.master.002.patch, HBASE-21365.master.003.patch
>
>
> This is a real problem from our production cluster. A user reported that his 
> table's data could not be replicated to the peer cluster, so we started to 
> debug. We checked the replication scope, the replication WAL entry filter, 
> and the namespace and table-cfs config, but did not find any problem. We then 
> enabled the RS's debug log and finally found that the user was writing data 
> with puts that skip the WAL. It took a long time to find. Our replication 
> uses the WAL to replicate data, so such data can't be replicated to the peer 
> cluster. I think throwing an exception would be better for the user if the 
> table's replication scope is not 0 (0 means not replicated).
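
The check proposed above can be sketched as a small validation step. This is a self-contained illustration under stated assumptions, not the attached patch: the class `SkipWalCheck`, the method `checkPut`, and the local `Durability` enum and scope constant are stand-ins defined here so the example runs without HBase on the classpath, and the choice of exception type is arbitrary.

```java
/** Hypothetical sketch, not HBase code: reject skip-WAL writes to replicated tables. */
final class SkipWalCheck {
  private SkipWalCheck() {}

  /** Local stand-in for a durability setting; only SKIP_WAL matters here. */
  enum Durability { USE_DEFAULT, SKIP_WAL, SYNC_WAL }

  /** Scope 0 means the data is not replicated, as the description states. */
  static final int REPLICATION_SCOPE_LOCAL = 0;

  /**
   * Throws if a write that skips the WAL targets a column family whose
   * replication scope is not 0, because replication reads from the WAL.
   */
  static void checkPut(Durability durability, int replicationScope) {
    if (durability == Durability.SKIP_WAL && replicationScope != REPLICATION_SCOPE_LOCAL) {
      throw new IllegalArgumentException(
          "Put skips the WAL but replication scope is " + replicationScope
              + "; the data would never reach the peer cluster");
    }
  }

  public static void main(String[] args) {
    checkPut(Durability.SKIP_WAL, 0); // allowed: not replicated
    checkPut(Durability.SYNC_WAL, 1); // allowed: written to the WAL
    try {
      checkPut(Durability.SKIP_WAL, 1); // rejected
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Failing fast at write time surfaces the misconfiguration to the user immediately, instead of leaving operators to trace missing data through replication logs.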





[jira] [Commented] (HBASE-21378) [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665909#comment-16665909
 ] 

Jingyun Tian commented on HBASE-21378:
--

Sure. It's not much work. I'll submit a patch later.

> [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when 
> master is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
> Issue Type: Bug
> Components: hbase-operator-tools, hbck2
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Priority: Major
> Attachments: 0001-HBASE-21378-checkHBCKSupport-blocks-assigning-hbase-.patch
>
>
> When I encounter the scenario that hbase:namespace is not online.
> {code}
> 2018-10-24,14:38:16,910 WARN org.apache.hadoop.hbase.master.HMaster: 
> hbase:namespace,,1529933109115.7e0801c8232b2dc15face54532056076. is NOT 
> online; state={7e0801c8232b2dc15face54532056076 state=OPEN, ts=1540363033384, 
> server=c4-hadoop-tst-st30.bj,29100,1540348649479}; 
> ServerCrashProcedures=false. Master startup cannot progress, in 
> holding-pattern until region onlined.
> {code}
> Then I tried to assign it manually, but it throws PleaseHoldException.
> {code}
> Wed Oct 24 15:26:52 CST 2018, 
> RpcRetryingCaller{globalStartTime=1540365754487, pause=200, maxAttempts=16}, 
> org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:144)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3133)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3125)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getClusterMetrics(HBaseAdmin.java:2161)
>   at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:98)
>   at org.apache.hbase.HBCK2.run(HBCK2.java:364)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hbase.HBCK2.main(HBCK2.java:447)
> Caused by: org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:361)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:349)
>   at 
> org.apache.hadoop.hbase.client.MasterCallable.call(MasterCallable.java:101)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
> {code}
> Then I checked the code and found that it is caused by checkHBCKSupport(), I 

[jira] [Assigned] (HBASE-21374) Backport HBASE-21342 to branch-1

2018-10-26 Thread mazhenlin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mazhenlin reassigned HBASE-21374:
-

Assignee: mazhenlin

> Backport HBASE-21342 to branch-1
> 
>
> Key: HBASE-21374
> URL: https://issues.apache.org/jira/browse/HBASE-21374
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: mazhenlin
>Priority: Major
> Attachments: HBASE-21374.branch-1.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21374) Backport HBASE-21342 to branch-1

2018-10-26 Thread mazhenlin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665907#comment-16665907
 ] 

mazhenlin commented on HBASE-21374:
---

Encountered problems with JDK 7:

1. Consumer is not available in JDK 7, so I defined a similar interface instead.

2. The compute and merge methods of ConcurrentHashMap are also not available, 
which makes it difficult to remove unused entries without synchronized blocks. 
So I used a HashMap with synchronized blocks instead of ConcurrentHashMap, since 
I assumed that the cost of acquiring the lock is negligible compared to the 
full bulkload process.
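
As a rough illustration of the two workarounds described above (not the code 
from the attached patch), the sketch below shows a JDK 7 compatible callback 
interface and a reference-counting map guarded by synchronized blocks. The 
names RefCountSketch, Callback, increment and decrement are hypothetical and 
chosen only for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class RefCountSketch {
  /** Hypothetical stand-in for java.util.function.Consumer, which JDK 7 lacks. */
  interface Callback<T> {
    void accept(T value);
  }

  // Plain HashMap; every access goes through a synchronized block.
  private final Map<String, Integer> counts = new HashMap<String, Integer>();

  /** Adds one reference for the key (what ConcurrentHashMap.merge would do on JDK 8). */
  void increment(String key) {
    synchronized (counts) {
      Integer current = counts.get(key);
      counts.put(key, current == null ? 1 : current + 1);
    }
  }

  /** Drops one reference and removes the entry once nobody uses it anymore. */
  void decrement(String key, Callback<String> onRemoved) {
    boolean removed = false;
    synchronized (counts) {
      Integer current = counts.get(key);
      if (current == null) {
        return; // nothing to release
      }
      if (current <= 1) {
        counts.remove(key);
        removed = true;
      } else {
        counts.put(key, current - 1);
      }
    }
    if (removed) {
      // Invoke the callback outside the lock to keep the critical section short.
      onRemoved.accept(key);
    }
  }

  int size() {
    synchronized (counts) {
      return counts.size();
    }
  }
}
```

The check-then-remove step is what needs the lock: without compute or merge 
there is no single atomic call that both decrements and removes an entry.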

> Backport HBASE-21342 to branch-1
> 
>
> Key: HBASE-21374
> URL: https://issues.apache.org/jira/browse/HBASE-21374
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Priority: Major
> Attachments: HBASE-21374.branch-1.001.patch
>
>






[jira] [Updated] (HBASE-21374) Backport HBASE-21342 to branch-1

2018-10-26 Thread mazhenlin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mazhenlin updated HBASE-21374:
--
Attachment: HBASE-21374.branch-1.001.patch

> Backport HBASE-21342 to branch-1
> 
>
> Key: HBASE-21374
> URL: https://issues.apache.org/jira/browse/HBASE-21374
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Priority: Major
> Attachments: HBASE-21374.branch-1.001.patch
>
>






[jira] [Comment Edited] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665313#comment-16665313
 ] 

Allan Yang edited comment on HBASE-21395 at 10/27/18 3:56 AM:
--

[~stack], FYI, this can go to 2.1.2. If users don't modify tables as frequently 
as ITBLL does, the chance of a race condition is very small.


was (Author: allan163):
[~stack], FYI, this can go to 2.1.2. If users don't modify table so frequently 
like ITBLL, the change of race condition is very small.

> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.1.2
>
> Attachments: HBASE-21395.branch-2.0.001.patch
>
>
> In my ITBLL runs, I often see a split/merge procedure and a table 
> procedure (like ModifyTableProcedure) happen at the same time. There are 
> race conditions between these two kinds of procedures, and they cause 
> serious problems, e.g. the split/merged parent is brought online by the table 
> procedure, or the split/merged region makes the whole table procedure 
> roll back.
> I talked with [~Apache9] offline today; this kind of problem was solved in 
> branch-2+, since there is a fence so that only one RTSP can run against a 
> single region at the same time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety 
> fence in the split/merge procedure: if there is a table procedure going on 
> against the same table, then abort the split/merge procedure. Aborting the 
> split/merge procedure at the beginning of the execution is no big deal 
> compared with the mess it would otherwise cause...





[jira] [Updated] (HBASE-21373) Backport to branch-1, "HBASE-21338 [balancer] If balancer is an ill-fit for cluster size, it gives little indication"

2018-10-26 Thread Xu Cang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Cang updated HBASE-21373:

Attachment: HBASE-21373.branch-1.002.patch

> Backport to branch-1, "HBASE-21338 [balancer] If balancer is an ill-fit for 
> cluster size, it gives little indication"
> -
>
> Key: HBASE-21373
> URL: https://issues.apache.org/jira/browse/HBASE-21373
> Project: HBase
>  Issue Type: Bug
>  Components: Operability
>Reporter: stack
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-21373.branch-1.001.patch, 
> HBASE-21373.branch-1.002.patch
>
>
> Issue to backport to branch-1. Hope you don't mind my assigning it to you Xu 
> Cang.





[jira] [Commented] (HBASE-21373) Backport to branch-1, "HBASE-21338 [balancer] If balancer is an ill-fit for cluster size, it gives little indication"

2018-10-26 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665890#comment-16665890
 ] 

Xu Cang commented on HBASE-21373:
-

Uploaded patch .001 to fix checkstyle issues. The unit test failure is not 
related IMO; let hadoop-qa try again.

> Backport to branch-1, "HBASE-21338 [balancer] If balancer is an ill-fit for 
> cluster size, it gives little indication"
> -
>
> Key: HBASE-21373
> URL: https://issues.apache.org/jira/browse/HBASE-21373
> Project: HBase
>  Issue Type: Bug
>  Components: Operability
>Reporter: stack
>Assignee: Xu Cang
>Priority: Major
> Attachments: HBASE-21373.branch-1.001.patch, 
> HBASE-21373.branch-1.002.patch
>
>
> Issue to backport to branch-1. Hope you don't mind my assigning it to you Xu 
> Cang.





[jira] [Commented] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665882#comment-16665882
 ] 

Xu Cang commented on HBASE-21395:
-

Nit:

I see you use ".count()" on the stream in the if statement. We don't actually 
need to count them all; finding one should be enough.

Also, formatting in this line: 
{quote}.map(p -> (AbstractStateMachineTableProcedure) p).filter(
{quote}
Maybe start a new line for 'filter'.
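
To make the nit concrete, here is a minimal sketch of the difference between 
counting all matches and stopping at the first one. It does not use the real 
procedure classes; a list of table names stands in for the running table 
procedures, and the class and method names are hypothetical.

```java
import java.util.List;

final class AnyMatchSketch {
  /** Counts every matching element before comparing, so the whole list is traversed. */
  static boolean busyByCount(List<String> procTables, String table) {
    return procTables.stream().filter(t -> t.equals(table)).count() > 0;
  }

  /** Short-circuits: returns as soon as the first matching element is seen. */
  static boolean busyByAnyMatch(List<String> procTables, String table) {
    return procTables.stream()
        .anyMatch(t -> t.equals(table));
  }
}
```

Both methods return the same answer; anyMatch just avoids visiting the rest of 
the stream once a match is found, and each stream operation sits on its own 
line.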

 

> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.1.2
>
> Attachments: HBASE-21395.branch-2.0.001.patch
>
>
> In my ITBLL runs, I often see a split/merge procedure and a table 
> procedure (like ModifyTableProcedure) happen at the same time. There are 
> race conditions between these two kinds of procedures, and they cause 
> serious problems, e.g. the split/merged parent is brought online by the table 
> procedure, or the split/merged region makes the whole table procedure 
> roll back.
> I talked with [~Apache9] offline today; this kind of problem was solved in 
> branch-2+, since there is a fence so that only one RTSP can run against a 
> single region at the same time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety 
> fence in the split/merge procedure: if there is a table procedure going on 
> against the same table, then abort the split/merge procedure. Aborting the 
> split/merge procedure at the beginning of the execution is no big deal 
> compared with the mess it would otherwise cause...





[jira] [Commented] (HBASE-21376) Add some verbose log to MasterProcedureScheduler

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665875#comment-16665875
 ] 

Hudson commented on HBASE-21376:


Results for branch branch-2.1
[build #539 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Add some verbose log to MasterProcedureScheduler
> 
>
> Key: HBASE-21376
> URL: https://issues.apache.org/jira/browse/HBASE-21376
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.0.3, 2.1.2
>
> Attachments: HBASE-21376.branch-2.0.001.patch, 
> HBASE-21376.branch-2.0.001.patch
>
>
> As discussed in HBASE-21364, we divided the patch in HBASE-21364 into two. The 
> critical one is already submitted in HBASE-21364 to branch-2.0 and 
> branch-2.1, but I also added some useful logs which need to be committed to 
> all branches.





[jira] [Commented] (HBASE-21391) RefreshPeerProcedure should also wait master initialized before executing

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665877#comment-16665877
 ] 

Hudson commented on HBASE-21391:


Results for branch branch-2.1
[build #539 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> RefreshPeerProcedure should also wait master initialized before executing
> -
>
> Key: HBASE-21391
> URL: https://issues.apache.org/jira/browse/HBASE-21391
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21391.patch
>
>
> Missed this one when introducing the waitInitialized method in Procedure, and 
> found it when implementing HBASE-21389.





[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665876#comment-16665876
 ] 

Hudson commented on HBASE-20973:


Results for branch branch-2.1
[build #539 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/539//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): 
> CODE-BUG: Uncaught runtime exception for pid=5973, 
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, 
> exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime 
> exception: pid=5974, ppid=5973, 
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTableRegionsProcedure 
> table=IntegrationTestBigLinkedList:java.lang.NullPointerException; 
> ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled 
> state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] 
> procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> 

[jira] [Updated] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-26 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21322:
-
Attachment: HBASE-21322.master.006.patch

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.branch-2.1.001.patch, 
> HBASE-21322.master.001.patch, HBASE-21322.master.002.patch, 
> HBASE-21322.master.003.patch, HBASE-21322.master.004.patch, 
> HBASE-21322.master.005.patch, HBASE-21322.master.006.patch, Screenshot from 
> 2018-10-17 13-35-58.png, Screenshot from 2018-10-17 13-38-41.png, Screenshot 
> from 2018-10-17 13-47-06.png
>
>
> According to my test, if one RS goes down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the master 
> does not help. Thus we need to schedule a ServerCrashProcedure manually 
> to solve the problem. I plan to add a scheduleServerCrashProcedure() API to 
> HbckService, then add this API to HBCK2.





[jira] [Updated] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-26 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21322:
-
Attachment: HBASE-21322.branch-2.1.001.patch

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.branch-2.1.001.patch, 
> HBASE-21322.master.001.patch, HBASE-21322.master.002.patch, 
> HBASE-21322.master.003.patch, HBASE-21322.master.004.patch, 
> HBASE-21322.master.005.patch, Screenshot from 2018-10-17 13-35-58.png, 
> Screenshot from 2018-10-17 13-38-41.png, Screenshot from 2018-10-17 
> 13-47-06.png
>
>
> According to my test, if one RS goes down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the master 
> does not help. Thus we need to schedule a ServerCrashProcedure manually 
> to solve the problem. I plan to add a scheduleServerCrashProcedure() API to 
> HbckService, then add this API to HBCK2.





[jira] [Commented] (HBASE-21391) RefreshPeerProcedure should also wait master initialized before executing

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665839#comment-16665839
 ] 

Hudson commented on HBASE-21391:


Results for branch branch-2
[build #1449 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1449/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1449//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1449//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1449//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> RefreshPeerProcedure should also wait master initialized before executing
> -
>
> Key: HBASE-21391
> URL: https://issues.apache.org/jira/browse/HBASE-21391
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21391.patch
>
>
> Missed this one when introducing the waitInitialized method in Procedure, and 
> found it when implementing HBASE-21389.





[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665838#comment-16665838
 ] 

Hudson commented on HBASE-20973:


Results for branch branch-2
[build #1449 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1449/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1449//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1449//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1449//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): 
> CODE-BUG: Uncaught runtime exception for pid=5973, 
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, 
> exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime 
> exception: pid=5974, ppid=5973, 
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTableRegionsProcedure 
> table=IntegrationTestBigLinkedList:java.lang.NullPointerException; 
> ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled 
> state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] 
> procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> 

[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665814#comment-16665814
 ] 

Hadoop QA commented on HBASE-21322:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m  9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 19s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 26s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  4m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 41s{color} | {color:red} hbase-client: The patch generated 1 new + 116 unchanged - 0 fixed = 117 total (was 116) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 23s{color} | {color:red} hbase-server: The patch generated 5 new + 23 unchanged - 0 fixed = 28 total (was 23) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 22s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 14m  7s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  1m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 43s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 51s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}230m 57s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  8s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}308m 25s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAdmin1 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce 

[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-21175:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, Artem.

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Fix For: 3.0.0
>
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch, HBASE-21175.v07.patch
>
>
> TestHFileArchiving#testCleaningRace creates an HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, no master parameter is 
> passed in {{params}}.
> When the chore runs the cleaner during the test, an NPE is thrown from this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.
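
The first option in the description (checking for a null master) might look roughly like the sketch below. The class and the {{Master}} interface are hypothetical stand-ins, not the real SnapshotHFileCleaner or master APIs, and returning an empty list when no master is available is an assumed policy (delete nothing), not necessarily what the committed patch does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a cleaner delegate; NOT the HBase SnapshotHFileCleaner. */
public class NullSafeCleanerSketch {

  /** Hypothetical stand-in for the master handle the cleaner normally receives. */
  public interface Master {
    boolean isReferenced(String file);
  }

  // May be null if the cleaner was initialized without a master in its params.
  private final Master master;

  public NullSafeCleanerSketch(Master master) {
    this.master = master;
  }

  /** Returns the files that are safe to delete; nothing if no master is available. */
  public List<String> getDeletableFiles(List<String> files) {
    if (master == null) {
      // Assumed policy: without a master we cannot check references, so delete nothing.
      return Collections.emptyList();
    }
    List<String> deletable = new ArrayList<>();
    for (String file : files) {
      if (!master.isReferenced(file)) {
        deletable.add(file);
      }
    }
    return deletable;
  }
}
```

With this guard, a partially initialized cleaner skips deletion instead of throwing; the second option (passing the master when constructing the cleaner in the test) avoids the null altogether.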



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665769#comment-16665769
 ] 

Hadoop QA commented on HBASE-21175:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
26s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} hbase-server: The patch generated 0 new + 0 
unchanged - 10 fixed = 0 total (was 10) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
22s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m  4s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}124m 
24s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21175 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945808/HBASE-21175.v07.patch 
|
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux ae733946fc98 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 
17 11:07:07 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / e5ba79816a |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14880/testReport/ |
| Max. process+thread count | 4667 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14880/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |



[jira] [Commented] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665746#comment-16665746
 ] 

Hadoop QA commented on HBASE-21395:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
38s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
5s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
20s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
39s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
21s{color} | {color:red} hbase-server: The patch generated 1 new + 11 unchanged 
- 0 fixed = 12 total (was 11) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
13s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 34s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}130m  8s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}169m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAdmin1 |
|   | hadoop.hbase.master.TestAssignmentListener |
|   | hadoop.hbase.snapshot.TestMobFlushSnapshotFromClient |
|   | hadoop.hbase.master.TestMergeTableRegionsWhileRSCrash |
|   | hadoop.hbase.TestSplitMerge |
|   | hadoop.hbase.master.normalizer.TestSimpleRegionNormalizerOnCluster |
|   | hadoop.hbase.snapshot.TestFlushSnapshotFromClient |
|   | hadoop.hbase.client.TestAsyncRegionAdminApi2 |
|   | hadoop.hbase.namespace.TestNamespaceAuditor |
|   | hadoop.hbase.client.TestTableFavoredNodes |
|   | hadoop.hbase.TestSequenceIdMonotonicallyIncreasing |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21395 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945797/HBASE-21395.branch-2.0.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname 

[jira] [Updated] (HBASE-21380) TestRSGroups failing

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21380:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to branch-2.1 only; it is the only branch that needs this.

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.branch-2.1.003.patch, 
> HBASE-21380.branch-2.1.004.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1





[jira] [Updated] (HBASE-21380) TestRSGroups failing

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21380:
--
Fix Version/s: 2.1.1

> TestRSGroups failing
> 
>
> Key: HBASE-21380
> URL: https://issues.apache.org/jira/browse/HBASE-21380
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Sean Busbey
>Assignee: Mike Drob
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21380.branch-2.1.001.patch, 
> HBASE-21380.branch-2.1.002.patch, HBASE-21380.branch-2.1.003.patch, 
> HBASE-21380.branch-2.1.004.patch, HBASE-21380.master.001.patch, 
> HBASE-21380.master.002.patch
>
>
> only failing on branch-2.1





[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665735#comment-16665735
 ] 

Hadoop QA commented on HBASE-21375:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
31s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} The patch passed checkstyle in hbase-procedure 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} hbase-server: The patch generated 0 new + 7 
unchanged - 4 fixed = 7 total (was 11) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
24s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 24s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
22s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}259m 34s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}314m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAdmin1 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21375 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945762/HBASE-21375-v2.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux cc01727d66c2 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |

[jira] [Updated] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21191:
--
Fix Version/s: 2.0.3

> Add a holding-pattern if no assign for meta or namespace (Can happen if 
> masterprocwals have been cleared).
> --
>
> Key: HBASE-21191
> URL: https://issues.apache.org/jira/browse/HBASE-21191
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.1.1, 2.0.3
>
> Attachments: HBASE-21191.branch-2.0.001.patch, 
> HBASE-21191.branch-2.1.001.patch, HBASE-21191.branch-2.1.002.patch, 
> HBASE-21191.branch-2.1.003.patch, HBASE-21191.branch-2.1.004.patch, 
> HBASE-21191.branch-2.1.005.patch, HBASE-21191.branch-2.1.006.patch, 
> HBASE-21191.branch-2.1.007.patch
>
>
> If the masterprocwals have been removed -- operator error, HDFS data loss, or 
> because we have gotten ourselves into a pathological state where we have 
> hundreds of masterprocwals to process and it is taking too long so we just 
> want to start over -- then master startup will have a dilemma. Master startup 
> needs hbase:meta to be online. If the masterprocwals have been removed, there 
> may be no outstanding assign or a servercrashprocedure with coverage for 
> hbase:meta (I ran into this issue repeatedly in internal testing purging 
> masterprocwals on a large test cluster). Worse, when master startup cannot 
> find an online hbase:meta, it exits after exhausting the RPC retries.
> So, we need a holding-pattern for master startup when hbase:meta is not online, 
> if only so an operator can schedule an assign for meta or so they can assign 
> fixup procedures (HBASE-21035 has discussion on why we cannot just 
> auto-schedule an assign of meta).
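
The holding-pattern idea in the description can be pictured as a simple polling loop. This is only a sketch under stated assumptions: {{HoldingPatternSketch}}, {{waitUntilOnline}} and its parameters are hypothetical names, not HMaster code, and the real patch may structure the wait differently.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of a startup holding-pattern; NOT the HMaster implementation. */
public class HoldingPatternSketch {

  /**
   * Polls until the region is reported online or a stop is requested.
   *
   * @param isOnline      assumed check for whether the region (e.g. hbase:meta) is online
   * @param stopRequested assumed check for whether startup has been told to give up
   * @param pauseMillis   pause between checks
   * @return true if the region came online, false if stopped or interrupted
   */
  public static boolean waitUntilOnline(BooleanSupplier isOnline,
      BooleanSupplier stopRequested, long pauseMillis) {
    while (!isOnline.getAsBoolean()) {
      if (stopRequested.getAsBoolean()) {
        return false;
      }
      try {
        // Hold here instead of exiting, so an operator has time to schedule an assign.
        Thread.sleep(pauseMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }
}
```

The point of the loop is that startup neither proceeds nor exits while the region is offline; it resumes only once something outside the loop, such as an operator-scheduled assign, brings the region online.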





[jira] [Updated] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21191:
--
Status: Patch Available  (was: Reopened)

> Add a holding-pattern if no assign for meta or namespace (Can happen if 
> masterprocwals have been cleared).
> --
>
> Key: HBASE-21191
> URL: https://issues.apache.org/jira/browse/HBASE-21191
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21191.branch-2.0.001.patch, 
> HBASE-21191.branch-2.1.001.patch, HBASE-21191.branch-2.1.002.patch, 
> HBASE-21191.branch-2.1.003.patch, HBASE-21191.branch-2.1.004.patch, 
> HBASE-21191.branch-2.1.005.patch, HBASE-21191.branch-2.1.006.patch, 
> HBASE-21191.branch-2.1.007.patch
>
>
> If the masterprocwals have been removed -- operator error, HDFS data loss, or 
> because we have gotten ourselves into a pathological state where we have 
> hundreds of masterprocwals to process and it is taking too long so we just 
> want to start over -- then master startup will have a dilemma. Master startup 
> needs hbase:meta to be online. If the masterprocwals have been removed, there 
> may be no outstanding assign or a servercrashprocedure with coverage for 
> hbase:meta (I ran into this issue repeatedly in internal testing purging 
> masterprocwals on a large test cluster). Worse, when master startup cannot 
> find an online hbase:meta, it exits after exhausting the RPC retries.
> So, we need a holding-pattern for master startup when hbase:meta is not online, 
> if only so an operator can schedule an assign for meta or so they can assign 
> fixup procedures (HBASE-21035 has discussion on why we cannot just 
> auto-schedule an assign of meta).





[jira] [Commented] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665730#comment-16665730
 ] 

stack commented on HBASE-21191:
---

HBASE-21191.branch-2.0.001.patch is a backport of the patch committed to 
branch-2.1.

> Add a holding-pattern if no assign for meta or namespace (Can happen if 
> masterprocwals have been cleared).
> --
>
> Key: HBASE-21191
> URL: https://issues.apache.org/jira/browse/HBASE-21191
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21191.branch-2.0.001.patch, 
> HBASE-21191.branch-2.1.001.patch, HBASE-21191.branch-2.1.002.patch, 
> HBASE-21191.branch-2.1.003.patch, HBASE-21191.branch-2.1.004.patch, 
> HBASE-21191.branch-2.1.005.patch, HBASE-21191.branch-2.1.006.patch, 
> HBASE-21191.branch-2.1.007.patch
>
>
> If the masterprocwals have been removed -- operator error, HDFS data loss, or 
> because we have gotten ourselves into a pathological state where we have 
> hundreds of masterprocwals to process and it is taking too long so we just 
> want to start over -- then master startup will have a dilemma. Master startup 
> needs hbase:meta to be online. If the masterprocwals have been removed, there 
> may be no outstanding assign or a servercrashprocedure with coverage for 
> hbase:meta (I ran into this issue repeatedly in internal testing purging 
> masterprocwals on a large test cluster). Worse, when master startup cannot 
> find an online hbase:meta, it exits after exhausting the RPC retries.
> So, we need a holding-pattern for master startup when hbase:meta is not online, 
> if only so an operator can schedule an assign for meta or so they can assign 
> fixup procedures (HBASE-21035 has discussion on why we cannot just 
> auto-schedule an assign of meta).





[jira] [Reopened] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reopened HBASE-21191:
---

Reopening so I can backport to branch-2.0.

> Add a holding-pattern if no assign for meta or namespace (Can happen if 
> masterprocwals have been cleared).
> --
>
> Key: HBASE-21191
> URL: https://issues.apache.org/jira/browse/HBASE-21191
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21191.branch-2.0.001.patch, 
> HBASE-21191.branch-2.1.001.patch, HBASE-21191.branch-2.1.002.patch, 
> HBASE-21191.branch-2.1.003.patch, HBASE-21191.branch-2.1.004.patch, 
> HBASE-21191.branch-2.1.005.patch, HBASE-21191.branch-2.1.006.patch, 
> HBASE-21191.branch-2.1.007.patch
>
>
> If the masterprocwals have been removed -- operator error, HDFS data loss, or 
> because we have gotten ourselves into a pathological state where we have 
> hundreds of masterprocwals to process and it is taking too long so we just 
> want to start over -- then master startup will have a dilemma. Master startup 
> needs hbase:meta to be online. If the masterprocwals have been removed, there 
> may be no outstanding assign or a servercrashprocedure with coverage for 
> hbase:meta (I ran into this issue repeatedly in internal testing purging 
> masterprocwals on a large test cluster). Worse, when master startup cannot 
> find an online hbase:meta, it exits after exhausting the RPC retries.
> So, we need a holding-pattern for master startup when hbase:meta is not online, 
> if only so an operator can schedule an assign for meta or so they can assign 
> fixup procedures (HBASE-21035 has discussion on why we cannot just 
> auto-schedule an assign of meta).





[jira] [Updated] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21191:
--
Attachment: HBASE-21191.branch-2.0.001.patch

> Add a holding-pattern if no assign for meta or namespace (Can happen if 
> masterprocwals have been cleared).
> --
>
> Key: HBASE-21191
> URL: https://issues.apache.org/jira/browse/HBASE-21191
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21191.branch-2.0.001.patch, 
> HBASE-21191.branch-2.1.001.patch, HBASE-21191.branch-2.1.002.patch, 
> HBASE-21191.branch-2.1.003.patch, HBASE-21191.branch-2.1.004.patch, 
> HBASE-21191.branch-2.1.005.patch, HBASE-21191.branch-2.1.006.patch, 
> HBASE-21191.branch-2.1.007.patch
>
>
> If the masterprocwals have been removed -- operator error, HDFS data loss, or 
> because we have gotten ourselves into a pathological state where we have 
> hundreds of masterprocwals to process and it is taking too long so we just 
> want to start over -- then master startup will have a dilemma. Master startup 
> needs hbase:meta to be online. If the masterprocwals have been removed, there 
> may be no outstanding assign or a servercrashprocedure with coverage for 
> hbase:meta (I ran into this issue repeatedly in internal testing purging 
> masterprocwals on a large test cluster). Worse, when master startup cannot 
> find an online hbase:meta, it exits after exhausting the RPC retries.
> So, we need a holding-pattern for master startup when hbase:meta is not online, 
> if only so an operator can schedule an assign for meta or so they can assign 
> fixup procedures (HBASE-21035 has discussion on why we cannot just 
> auto-schedule an assign of meta).





[jira] [Commented] (HBASE-21389) Revisit the procedure lock for sync replication

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665727#comment-16665727
 ] 

Hadoop QA commented on HBASE-21389:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
26s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
25s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 12s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}259m 18s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 6s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}304m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.replication.TestSyncReplicationStandbyKillRS |
|   | hadoop.hbase.client.TestAdmin1 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21389 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945763/HBASE-21389-v1.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 53503c7dbf46 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 0ab7c3a189 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/14875/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | 

[jira] [Updated] (HBASE-21399) Generate and commit 2.1.1 RELEASENOTES.md and CHANGES.md

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21399:
--
Description: 
Ran ./release-doc-maker/releasedocmaker.py -p HBASE --fileversions -v 2.1.1 -l 
--sortorder=newer --skip-credits then carefully stitched the product into the 
current CHANGES.md and RELEASENOTES.md files being careful to preserve markdown 
header ABOVE the apache license else the .md files won't render as markdown as 
in

{code}
# HBASE  2.1.1 Release Notes



These release notes cover new developer and user-facing incompatibilities, 
important issues, features, and major improvements.


---

{code}

Check that CHANGES and RELEASENOTES draw properly in a markdown parser.



  was:
Ran ./release-doc-maker/releasedocmaker.py -p HBASE --fileversions -v 2.1.1 -l 
--sortorder=newer --skip-credits then carefully stitched the product into the 
current CHANGES.md and RELEASENOTES.md files being careful to preserve markdown 
header ABOVE the apache license else the .md files won't render as markdown as 
in

{code}
# HBASE  2.1.1 Release Notes



These release notes cover new developer and user-facing incompatibilities, 
important issues, features, and major improvements.


---

{code}




> Generate and commit 2.1.1 RELEASENOTES.md and CHANGES.md
> 
>
> Key: HBASE-21399
> URL: https://issues.apache.org/jira/browse/HBASE-21399
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Priority: Major
>
> Ran ./release-doc-maker/releasedocmaker.py -p HBASE --fileversions -v 2.1.1 
> -l --sortorder=newer --skip-credits then carefully stitched the product into 
> the current CHANGES.md and RELEASENOTES.md files being careful to preserve 
> markdown header ABOVE the apache license else the .md files won't render as 
> markdown as in
> {code}
> # HBASE  2.1.1 Release Notes
> 
> These release notes cover new developer and user-facing incompatibilities, 
> important issues, features, and major improvements.
> ---
> 
> {code}
> Check that CHANGES and RELEASENOTES draw properly in a markdown parser.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21399) Generate and commit 2.1.1 RELEASENOTES.md and CHANGES.md

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21399:
--
Description: 
Ran ./release-doc-maker/releasedocmaker.py -p HBASE --fileversions -v 2.1.1 -l 
--sortorder=newer --skip-credits then carefully stitched the product into the 
current CHANGES.md and RELEASENOTES.md files being careful to preserve markdown 
header ABOVE the apache license else the .md files won't render as markdown as 
in

{code}
# HBASE  2.1.1 Release Notes



These release notes cover new developer and user-facing incompatibilities, 
important issues, features, and major improvements.


---

{code}



  was:Ran ./release-doc-maker/releasedocmaker.py -p HBASE --fileversions -v 
2.1.1 -l --sortorder=newer --skip-credits then carefully stitched the product 
into the current CHANGES.md and RELEASENOTES.md files.


> Generate and commit 2.1.1 RELEASENOTES.md and CHANGES.md
> 
>
> Key: HBASE-21399
> URL: https://issues.apache.org/jira/browse/HBASE-21399
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Priority: Major
>
> Ran ./release-doc-maker/releasedocmaker.py -p HBASE --fileversions -v 2.1.1 
> -l --sortorder=newer --skip-credits then carefully stitched the product into 
> the current CHANGES.md and RELEASENOTES.md files being careful to preserve 
> markdown header ABOVE the apache license else the .md files won't render as 
> markdown as in
> {code}
> # HBASE  2.1.1 Release Notes
> 
> These release notes cover new developer and user-facing incompatibilities, 
> important issues, features, and major improvements.
> ---
> 
> {code}





[jira] [Updated] (HBASE-21399) Generate and commit 2.1.1 RELEASENOTES.md and CHANGES.md

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21399:
--
Description: Ran ./release-doc-maker/releasedocmaker.py -p HBASE 
--fileversions -v 2.1.1 -l --sortorder=newer --skip-credits then carefully 
stitched the product into the current CHANGES.md and RELEASENOTES.md files.

> Generate and commit 2.1.1 RELEASENOTES.md and CHANGES.md
> 
>
> Key: HBASE-21399
> URL: https://issues.apache.org/jira/browse/HBASE-21399
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Priority: Major
>
> Ran ./release-doc-maker/releasedocmaker.py -p HBASE --fileversions -v 2.1.1 
> -l --sortorder=newer --skip-credits then carefully stitched the product into 
> the current CHANGES.md and RELEASENOTES.md files.





[jira] [Updated] (HBASE-21376) Add some verbose log to MasterProcedureScheduler

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21376:
--
Fix Version/s: (was: 2.1.1)
   2.1.2

> Add some verbose log to MasterProcedureScheduler
> 
>
> Key: HBASE-21376
> URL: https://issues.apache.org/jira/browse/HBASE-21376
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.0.3, 2.1.2
>
> Attachments: HBASE-21376.branch-2.0.001.patch, 
> HBASE-21376.branch-2.0.001.patch
>
>
> As discussed in HBASE-21364, we divided the patch in HBASE-21364 into two. The 
> critical one is already submitted in HBASE-21364 to branch-2.0 and 
> branch-2.1, but I also added some useful logs which need to be committed to all 
> branches.





[jira] [Updated] (HBASE-21394) Restore snapshot in parallel

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21394:
--
Fix Version/s: (was: 2.1.1)
   2.1.2

> Restore snapshot in parallel
> 
>
> Key: HBASE-21394
> URL: https://issues.apache.org/jira/browse/HBASE-21394
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
>
> Our MapReduce/Spark jobs are highly dependent on SnapshotScanner. Restoring 
> a big table for SnapshotScanner takes hours.
> Restoring the snapshot in parallel would help a lot.





[jira] [Created] (HBASE-21399) Generate and commit 2.1.1 RELEASENOTES.md and CHANGES.md

2018-10-26 Thread stack (JIRA)
stack created HBASE-21399:
-

 Summary: Generate and commit 2.1.1 RELEASENOTES.md and CHANGES.md
 Key: HBASE-21399
 URL: https://issues.apache.org/jira/browse/HBASE-21399
 Project: HBase
  Issue Type: Sub-task
Reporter: stack








[jira] [Resolved] (HBASE-21398) Copy down docs, amend to suite branch-2.1, and then commit

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-21398.
---
   Resolution: Fixed
 Assignee: stack
Fix Version/s: 2.1.1

> Copy down docs, amend to suite branch-2.1, and then commit
> --
>
> Key: HBASE-21398
> URL: https://issues.apache.org/jira/browse/HBASE-21398
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.1.1
>
>
> Freshen the doc in branch-2.1 by copying from master branch. Purge backup and 
> spark. Left in the serial replication mentions.





[jira] [Updated] (HBASE-21398) Copy down docs, amend to suite branch-2.1, and then commit

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21398:
--
Description: 
Freshen the doc in branch-2.1 by copying from master branch. Purge backup and 
spark. Left in the serial replication mentions.


> Copy down docs, amend to suite branch-2.1, and then commit
> --
>
> Key: HBASE-21398
> URL: https://issues.apache.org/jira/browse/HBASE-21398
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Priority: Major
> Fix For: 2.1.1
>
>
> Freshen the doc in branch-2.1 by copying from master branch. Purge backup and 
> spark. Left in the serial replication mentions.





[jira] [Created] (HBASE-21398) Copy down docs, amend to suite branch-2.1, and then commit

2018-10-26 Thread stack (JIRA)
stack created HBASE-21398:
-

 Summary: Copy down docs, amend to suite branch-2.1, and then commit
 Key: HBASE-21398
 URL: https://issues.apache.org/jira/browse/HBASE-21398
 Project: HBase
  Issue Type: Sub-task
Reporter: stack








[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665629#comment-16665629
 ] 

Hadoop QA commented on HBASE-20973:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 52s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 48s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 12m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m 36s{color} | {color:red} hbase-procedure in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 42m 17s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.procedure2.TestProcedureRollbackAIOOB |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20973 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945775/HBASE-20973.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux f1a3189b8e10 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 0ab7c3a189 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/14877/artifact/patchprocess/patch-unit-hbase-procedure.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14877/testReport/ |
| Max. process+thread count | 278 (vs. ulimit of 1) |
| modules | C: hbase-procedure U: hbase-procedure |
| 

[jira] [Resolved] (HBASE-21397) Set version to 2.1.1 on branch-2.1 in prep for first RC

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-21397.
---
   Resolution: Fixed
Fix Version/s: 2.1.1

> Set version to 2.1.1 on branch-2.1 in prep for first RC
> ---
>
> Key: HBASE-21397
> URL: https://issues.apache.org/jira/browse/HBASE-21397
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.1.1
>
>
> Ran mvn clean org.codehaus.mojo:versions-maven-plugin:2.5:set 
> -DnewVersion=2.1.1 and pushed the change on branch-2.1.





[jira] [Created] (HBASE-21397) Set version to 2.1.1 on branch-2.1 in prep for first RC

2018-10-26 Thread stack (JIRA)
stack created HBASE-21397:
-

 Summary: Set version to 2.1.1 on branch-2.1 in prep for first RC
 Key: HBASE-21397
 URL: https://issues.apache.org/jira/browse/HBASE-21397
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: stack


Ran mvn clean org.codehaus.mojo:versions-maven-plugin:2.5:set 
-DnewVersion=2.1.1 and pushed the change on branch-2.1.





[jira] [Created] (HBASE-21396) Create 2.1.1 release

2018-10-26 Thread stack (JIRA)
stack created HBASE-21396:
-

 Summary: Create 2.1.1 release
 Key: HBASE-21396
 URL: https://issues.apache.org/jira/browse/HBASE-21396
 Project: HBase
  Issue Type: Task
  Components: rm
Reporter: stack
Assignee: stack








[jira] [Commented] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665618#comment-16665618
 ] 

stack commented on HBASE-21237:
---

Looking more at TestAssignmentManager and comparing it to the master branch, the 
tests have been refactored. Ruling this too risky a change just now. If there is 
a new RC, we can pull it in then.

> Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS
> --
>
> Key: HBASE-21237
> URL: https://issues.apache.org/jira/browse/HBASE-21237
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Blocker
> Fix For: 2.0.3, 2.1.2
>
> Attachments: HBASE-21237-branch-2.1.patch, 
> HBASE-21237.branch-2.0.001.patch
>
>
> As discussed in HBASE-21217, in branch-2.0 and branch-2.1 we should use 
> CompatRemoteProcedureResolver instead of ExecuteProceduresRemoteCall to 
> dispatch region open/close requests to the RS, because 
> ExecuteProceduresRemoteCall groups all the open/close operations in one call 
> and executes them sequentially on the target RS. If one operation fails, all 
> the operations are marked as failed. In fact, some of the operations (like 
> open region) are already executing in the open region handler thread, but the 
> master thinks these operations failed and reassigns the regions to another RS. 
> So when the previous RS reports to the master that the region is online, the 
> master kills that RS since it has already assigned the region to another RS.
> For branch-2.2+, HBASE-21217 will fix this issue.
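
The difference between the two dispatch styles described above can be illustrated with a small sketch. This is not HBase code: `DispatchSketch`, `dispatchGrouped`, and `dispatchIndividually` are hypothetical names, and the sketch makes the simplifying assumption that a grouped call still attempts every operation and only the reported status is shared.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in HBase.
final class DispatchSketch {
    /** Grouped call: operations run sequentially and share a single outcome. */
    static <T> List<Boolean> dispatchGrouped(List<T> ops, Predicate<T> execute) {
        boolean allSucceeded = true;
        for (T op : ops) {
            // Assumption: every operation is attempted; one failure taints the whole call.
            if (!execute.test(op)) {
                allSucceeded = false;
            }
        }
        return new ArrayList<>(Collections.nCopies(ops.size(), allSucceeded));
    }

    /** Individual dispatch: each operation reports its own outcome. */
    static <T> List<Boolean> dispatchIndividually(List<T> ops, Predicate<T> execute) {
        List<Boolean> results = new ArrayList<>();
        for (T op : ops) {
            results.add(execute.test(op));
        }
        return results;
    }
}
```

With grouped dispatch, a region open that actually succeeded is still reported as failed when a sibling operation fails, which is the situation that leads the master to reassign a region that is in fact coming online.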





[jira] [Updated] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20973:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Tried it. Had to add ClassRule but then all worked. Pushed to branch-2.0+. 
Thanks [~Apache9] (and [~allan163])

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): 
> CODE-BUG: Uncaught runtime exception for pid=5973, 
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, 
> exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime 
> exception: pid=5974, ppid=5973, 
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTableRegionsProcedure 
> table=IntegrationTestBigLinkedList:java.lang.NullPointerException; 
> ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled 
> state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] 
> procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, and all the 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could then proceed; it is quite embarrassing that a bug saved us... 
> (this bug will be fixed in HBASE-20846)
> I tried to reproduce this one using the test case in HBASE-20921 but I just 
> can't reproduce it.
> An easy way to resolve this is to add a try/catch, making sure no matter 

[jira] [Updated] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21237:
--
Fix Version/s: (was: 2.1.1)
   2.1.2

> Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS
> --
>
> Key: HBASE-21237
> URL: https://issues.apache.org/jira/browse/HBASE-21237
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Blocker
> Fix For: 2.0.3, 2.1.2
>
> Attachments: HBASE-21237-branch-2.1.patch, 
> HBASE-21237.branch-2.0.001.patch
>
>
> As discussed in HBASE-21217, in branch-2.0 and branch-2.1 we should use 
> CompatRemoteProcedureResolver instead of ExecuteProceduresRemoteCall to 
> dispatch region open/close requests to the RS, because 
> ExecuteProceduresRemoteCall groups all the open/close operations in one call 
> and executes them sequentially on the target RS. If one operation fails, all 
> the operations are marked as failed. In fact, some of the operations (like 
> open region) are already executing in the open region handler thread, but the 
> master thinks these operations failed and reassigns the regions to another RS. 
> So when the previous RS reports to the master that the region is online, the 
> master kills that RS since it has already assigned the region to another RS.
> For branch-2.2+, HBASE-21217 will fix this issue.





[jira] [Commented] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665584#comment-16665584
 ] 

stack commented on HBASE-21395:
---

Ran  mvn test -Dtest=TestMergeTableRegionsProcedure and got a nice message



[ERROR] Failures:
[ERROR]   TestMergeTableRegionsProcedure.testMergeRegionsConcurrently:212 found 
exception: org.apache.hadoop.hbase.exceptions.MergeRegionException via 
master-merge-regions:org.apache.hadoop.hbase.exceptions.MergeRegionException: 
There is a table procedure going on against the same table, abort the merge of 
pid=48, state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE, locked=true; 
MergeTableRegionsProcedure table=testMergeRegionsConcurrently, 
regions=[3f957497f8b7b306120ca2c71fe1, 893d444596effc7aae69f8c1145500a0], 
forcibly=true
[ERROR]   TestMergeTableRegionsProcedure.testMergeTwoRegions:159 found 
exception: org.apache.hadoop.hbase.exceptions.MergeRegionException via 
master-merge-regions:org.apache.hadoop.hbase.exceptions.MergeRegionException: 
There is a table procedure going on against the same table, abort the merge of 
pid=36, state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE, locked=true; 
MergeTableRegionsProcedure table=testMergeTwoRegions, 
regions=[f5de4b3605e2aebdef0c8544d55abfb0, 67478c63bec10165fb383b8d35796a4a], 
forcibly=true
[ERROR]   TestMergeTableRegionsProcedure.testMergeWithoutPONR:295 expected a 
running proc
[ERROR]   TestMergeTableRegionsProcedure.testRecoveryAndDoubleExecution:242 
found exception: org.apache.hadoop.hbase.exceptions.MergeRegionException via 
master-merge-regions:org.apache.hadoop.hbase.exceptions.MergeRegionException: 
There is a table procedure going on against the same table, abort the merge of 
pid=61, state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE, locked=true; 
MergeTableRegionsProcedure table=testRecoveryAndDoubleExecution, 
regions=[0db36af05fe46aef3e32a810e90c51ab, fd4e09b1f961af7a9cf05a12a51ef23b], 
forcibly=false
[ERROR]   TestMergeTableRegionsProcedure.testRollbackAndDoubleExecution:269 
expected a running proc
[INFO]
[ERROR] Tests run: 5, Failures: 5, Errors: 0, Skipped: 0


Punting out to 2.1.2


> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.1.2
>
> Attachments: HBASE-21395.branch-2.0.001.patch
>
>
> In my ITBLL, I often see a split/merge procedure and a table 
> procedure (like ModifyTableProcedure) run at the same time. There are 
> race conditions between these two kinds of procedures that cause 
> serious problems, e.g. the split/merged parent is brought online by the table 
> procedure, or the split/merged region makes the whole table procedure 
> roll back.
> Talked with [~Apache9] offline today; this kind of problem was solved in 
> branch-2+ since there is a fence so that only one RTSP can run against a 
> single region at a time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety 
> fence in the split/merge procedure: if there is a table procedure going on 
> against the same table, then abort the split/merge procedure. Aborting the 
> split/merge procedure at the beginning of the execution is no big deal 
> compared with the mess it would otherwise cause...
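
The fence described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the attached patch: `TableFenceSketch` and all of its methods are hypothetical names, and a plain in-memory set stands in for whatever bookkeeping the master actually uses to know which tables have a table procedure in flight.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not the HBASE-21395 patch.
final class TableFenceSketch {
    // Tables that currently have a table-level procedure (e.g. a modify) running.
    private final Set<String> tablesWithTableProcedure = ConcurrentHashMap.newKeySet();

    void tableProcedureStarted(String table) {
        tablesWithTableProcedure.add(table);
    }

    void tableProcedureFinished(String table) {
        tablesWithTableProcedure.remove(table);
    }

    /**
     * Checked at the very start of a split/merge: returns false when the
     * split/merge should be aborted because a table procedure is running
     * against the same table.
     */
    boolean mayStartSplitOrMerge(String table) {
        return !tablesWithTableProcedure.contains(table);
    }
}
```

Checking at the prepare step means an aborted split/merge has not yet changed any region state, which is why aborting early is cheap.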





[jira] [Updated] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21395:
--
Fix Version/s: 2.1.2

> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.1.2
>
> Attachments: HBASE-21395.branch-2.0.001.patch
>
>
> In my ITBLL, I often see a split/merge procedure and a table 
> procedure (like ModifyTableProcedure) run at the same time. There are 
> race conditions between these two kinds of procedures that cause 
> serious problems, e.g. the split/merged parent is brought online by the table 
> procedure, or the split/merged region makes the whole table procedure 
> roll back.
> Talked with [~Apache9] offline today; this kind of problem was solved in 
> branch-2+ since there is a fence so that only one RTSP can run against a 
> single region at a time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety 
> fence in the split/merge procedure: if there is a table procedure going on 
> against the same table, then abort the split/merge procedure. Aborting the 
> split/merge procedure at the beginning of the execution is no big deal 
> compared with the mess it would otherwise cause...





[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665581#comment-16665581
 ] 

Hudson commented on HBASE-20973:


Results for branch branch-2.0
[build #1020 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1020/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1017//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1017//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1017//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): CODE-BUG: Uncaught runtime exception for pid=5973, state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList:java.lang.NullPointerException; ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled state=MODIFY_TABLE_REOPEN_ALL_REGIONS
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
>     at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> 

[jira] [Commented] (HBASE-21376) Add some verbose log to MasterProcedureScheduler

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665580#comment-16665580
 ] 

Hudson commented on HBASE-21376:


Results for branch branch-2.0
[build #1020 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1020/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1017//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1017//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1017//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Add some verbose log to MasterProcedureScheduler
> 
>
> Key: HBASE-21376
> URL: https://issues.apache.org/jira/browse/HBASE-21376
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.1.1, 2.0.3
>
> Attachments: HBASE-21376.branch-2.0.001.patch, 
> HBASE-21376.branch-2.0.001.patch
>
>
> As discussed in HBASE-21364, we divided the patch in HBASE-21364 into two. The 
> critical part is already submitted in HBASE-21364 to branch-2.0 and 
> branch-2.1, but I also added some useful logs which need to be committed to 
> all branches.





[jira] [Commented] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665543#comment-16665543
 ] 

Hadoop QA commented on HBASE-21325:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 36s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 22s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 36s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}131m 35s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}174m 30s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21325 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945740/HBASE-21325.master.005.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 4d4f47bd52d5 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 0ab7c3a189 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14874/testReport/ |
| Max. process+thread count | 4841 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14874/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Force to terminate 

[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665502#comment-16665502
 ] 

stack commented on HBASE-21322:
---

Skipping .005 for 2.1.1 RC.

Didn't apply and after fixup, got this:

[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.6.1:compile (default-compile) 
on project hbase-server: Compilation failure
[ERROR] 
/Users/stack/checkouts/hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java:[2461,34]
 getWALRootDir() has protected access in 
org.apache.hadoop.hbase.regionserver.HRegionServer

There'll probably be another RC [~tianjingyun], so if you fix this up, we can 
pull it in then, sir.

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.master.001.patch, 
> HBASE-21322.master.002.patch, HBASE-21322.master.003.patch, 
> HBASE-21322.master.004.patch, HBASE-21322.master.005.patch, Screenshot from 
> 2018-10-17 13-35-58.png, Screenshot from 2018-10-17 13-38-41.png, Screenshot 
> from 2018-10-17 13-47-06.png
>
>
> According to my test, if one RS is down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the 
> master does not help. Thus we need to schedule a ServerCrashProcedure manually 
> to solve the problem. I plan to add a scheduleServerCrashProcedure() API to 
> HbckService, then add this API to HBCK2.





[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Attachment: HBASE-21175.v07.patch

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch, HBASE-21175.v07.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.
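
The first of the two options in the description (checking for a null master) could look roughly like the sketch below. It is a self-contained illustration under stated assumptions, not the attached patch: the `Master` interface and `NullSafeCleanerSketch` class are hypothetical stand-ins for the HBase types, and returning an empty list when the master is missing is one possible policy, not necessarily the one the patch chose.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a cleaner delegate that depends on the master. */
class NullSafeCleanerSketch {
  /** Hypothetical stand-in for the part of the master the cleaner needs. */
  interface Master {
    boolean isReferencedBySnapshot(String file);
  }

  // Stays null when init() is never called with a master, as in the test scenario.
  private Master master;

  void init(Master master) {
    this.master = master;
  }

  List<String> getDeletableFiles(List<String> files) {
    if (master == null) {
      // Partially initialized: treat nothing as deletable instead of throwing an NPE.
      return Collections.emptyList();
    }
    List<String> deletable = new ArrayList<>();
    for (String file : files) {
      if (!master.isReferencedBySnapshot(file)) {
        deletable.add(file);
      }
    }
    return deletable;
  }
}
```

Treating files as not deletable is the conservative choice here, since a cleaner that cannot consult snapshot state cannot prove a file is unreferenced.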





[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Status: Patch Available  (was: Open)

[~yuzhih...@gmail.com] v.07 addresses all previous concerns and compiles, and 
the test passes locally.

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch, HBASE-21175.v07.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.





[jira] [Issue Comment Deleted] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Comment: was deleted

(was: nevermind v.06 causes some compilation problems)

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.





[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665437#comment-16665437
 ] 

stack commented on HBASE-20973:
---

Confirmed mighty [~Apache9] backed out [~allan163]'s  
HBASE-20973.branch-2.0.002.patch from branch-2.0+. Waiting on hadoopqa before 
committing (seems like they are all occupied at mo...)

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): CODE-BUG: Uncaught runtime exception for pid=5973, state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime exception: pid=5974, ppid=5973, state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; ReopenTableRegionsProcedure table=IntegrationTestBigLinkedList:java.lang.NullPointerException; ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled state=MODIFY_TABLE_REOPEN_ALL_REGIONS
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
>     at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
>     at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>     at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
>     at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
>     at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, and all 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could proceed again. It is quite embarrassing that a bug saved 
> us... (this bug will be fixed in HBASE-20846)
> I tried to reproduce this one using the test case in HBASE-20921, but I just 
> can't reproduce it.
> An easy way to resolve this is to add a try catch, making 

[jira] [Comment Edited] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665431#comment-16665431
 ] 

Artem Ervits edited comment on HBASE-21175 at 10/26/18 5:24 PM:


nevermind v.06 causes some compilation problems


was (Author: dbist13):
v. 06 patch addresses last comments [~yuzhih...@gmail.com].

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.





[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Attachment: (was: HBASE-21175.v06.patch)

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.





[jira] [Commented] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665431#comment-16665431
 ] 

Artem Ervits commented on HBASE-21175:
--

v. 06 patch addresses last comments [~yuzhih...@gmail.com].

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch, HBASE-21175.v06.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.





[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Status: Open  (was: Patch Available)

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch, HBASE-21175.v06.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.





[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Attachment: HBASE-21175.v06.patch

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch, HBASE-21175.v06.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.





[jira] [Commented] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665430#comment-16665430
 ] 

stack commented on HBASE-21237:
---

The TestAssignmentManager failure is legit. The mocking mechanism where we pass 
in a mocked dispatcher is bypassed. Not sure why after queuing the assign of 
meta, it is never scheduled. Going to pass on this for first RC. What is in 
place works though slower than what it could be. Thanks boys.

> Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS
> --
>
> Key: HBASE-21237
> URL: https://issues.apache.org/jira/browse/HBASE-21237
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Blocker
> Fix For: 2.1.1, 2.0.3
>
> Attachments: HBASE-21237-branch-2.1.patch, 
> HBASE-21237.branch-2.0.001.patch
>
>
> As discussed in HBASE-21217, in branch-2.0 and branch-2.1, we should use 
> CompatRemoteProcedureResolver instead of ExecuteProceduresRemoteCall to 
> dispatch region open/close requests to the RS. ExecuteProceduresRemoteCall 
> groups all the open/close operations into one call and executes them 
> sequentially on the target RS. If one operation fails, all the operations are 
> marked as failed. In fact, some of the operations (like open region) are 
> already executing in the open region handler thread, but the master thinks 
> these operations failed and reassigns the regions to another RS. So when the 
> previous RS reports to the master that the region is online, the master kills 
> that RS, since it has already assigned the region to another RS.
> For branch-2.2+, HBASE-21217 will fix this issue.
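
The difference between the two dispatch styles described above can be modelled in a few lines. This is a toy model, not HBase code: `DispatchSketch` and its methods are hypothetical, operations are plain strings, and the predicate stands in for whatever actually happens on the region server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of batched versus per-operation failure reporting. */
class DispatchSketch {
  /** Batched call: a single failing operation marks every operation in the call as failed. */
  static List<String> failedWhenBatched(List<String> ops, Predicate<String> succeeds) {
    for (String op : ops) {
      if (!succeeds.test(op)) {
        return new ArrayList<>(ops); // the whole call is reported as a failure
      }
    }
    return new ArrayList<>();
  }

  /** Per-operation dispatch: only the operations that really failed are reported. */
  static List<String> failedWhenSeparate(List<String> ops, Predicate<String> succeeds) {
    List<String> failed = new ArrayList<>();
    for (String op : ops) {
      if (!succeeds.test(op)) {
        failed.add(op);
      }
    }
    return failed;
  }
}
```

In the batched model, an open that is in fact still progressing on the region server ends up in the failed list alongside the operation that really failed, which is the situation that leads the master to reassign the region.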





[jira] [Commented] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665402#comment-16665402
 ] 

Ted Yu commented on HBASE-21175:


Very close.
{code}
+   * @throws IOException throws an IOException if there's problem creating a table
{code}
Is the description accurate ?
Is there other scenario where IOE is thrown ?
{code}
+// Avoid passing a null master to CleanerChore, see HBASE-21175
{code}
nit:
Please move the start of comment two spaces to the left.



> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for a null master or pass the master instance properly 
> when constructing the cleaner instance.





[jira] [Commented] (HBASE-21385) HTable.delete request use rpc call directly instead of AsyncProcess

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665394#comment-16665394
 ] 

Hudson commented on HBASE-21385:


Results for branch master
[build #569 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/569/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/569//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/569//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/569//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> HTable.delete request use rpc call directly instead of AsyncProcess
> ---
>
> Key: HBASE-21385
> URL: https://issues.apache.org/jira/browse/HBASE-21385
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.1.0, 2.2.0, 2.0.2
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21385.master.001.patch, 
> HBASE-21385.master.002.patch
>
>
> HBASE-16592 unified delete requests to use AsyncProcess, but the job was not
> finished: we still use a direct rpc call for get, put, append, and increment,
> and only use AsyncProcess for batch requests. I also found a problem in
> HBASE-21365: the rpc call throws a DoNotRetryException, but AsyncProcess
> wraps it in a new RetriesExhaustedWithDetailsException, which is not
> right. So I think HTable.delete should use the rpc call directly, the same
> as get, put, append, and increment requests.
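
The difference between the two paths in the quoted description can be illustrated with a small sketch. Everything here is hypothetical: `DoNotRetryStubException` and `RetriesExhaustedStubException` merely stand in for the exception types named above, and `DeletePathsSketch` is not HBase client code. It only shows why a caller that catches the specific exception behaves differently depending on which path the request takes.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical stand-ins for the exception types named in the description.
class DoNotRetryStubException extends IOException {
  DoNotRetryStubException(String msg) {
    super(msg);
  }
}

class RetriesExhaustedStubException extends IOException {
  RetriesExhaustedStubException(Throwable cause) {
    super("retries exhausted", cause);
  }
}

class DeletePathsSketch {
  // Direct rpc-style path: the original exception reaches the caller unchanged.
  static void directCall(Callable<Void> rpc) throws Exception {
    rpc.call();
  }

  // Batch-style path: any failure is wrapped in an aggregate exception,
  // so the caller no longer sees the original type at the top level.
  static void batchCall(Callable<Void> rpc) throws IOException {
    try {
      rpc.call();
    } catch (Exception e) {
      throw new RetriesExhaustedStubException(e);
    }
  }
}
```

With the batch-style path, a `catch (DoNotRetryStubException e)` block in the caller is never entered; the original exception is only reachable through `getCause()`.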





[jira] [Commented] (HBASE-21383) Change refguide to point at hbck2 instead of hbck1

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665395#comment-16665395
 ] 

Hudson commented on HBASE-21383:


Results for branch master
[build #569 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/569/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/569//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/569//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/569//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Change refguide to point at hbck2 instead of hbck1
> --
>
> Key: HBASE-21383
> URL: https://issues.apache.org/jira/browse/HBASE-21383
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21383.branch-2.1.001.patch
>
>
> Update the refguide. I





[jira] [Commented] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665347#comment-16665347
 ] 

Artem Ervits commented on HBASE-21175:
--

[~yuzhih...@gmail.com] please review v.05 addressing any newly-introduced and 
existing checkstyle warnings.

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for the null master or pass the master instance properly
> when constructing the cleaner instance.





[jira] [Updated] (HBASE-21175) Partially initialized SnapshotHFileCleaner leads to NPE during TestHFileArchiving

2018-10-26 Thread Artem Ervits (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Ervits updated HBASE-21175:
-
Attachment: HBASE-21175.v05.patch

> Partially initialized SnapshotHFileCleaner leads to NPE during 
> TestHFileArchiving
> -
>
> Key: HBASE-21175
> URL: https://issues.apache.org/jira/browse/HBASE-21175
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Artem Ervits
>Priority: Minor
>  Labels: snapshot
> Attachments: HBASE-21175.v01.patch, HBASE-21175.v04.patch, 
> HBASE-21175.v05.patch
>
>
> TestHFileArchiving#testCleaningRace creates HFileCleaner instance within the 
> test.
> When SnapshotHFileCleaner.init() is called, there is no master parameter 
> passed in {{params}}.
> When the chore runs the cleaner during the test, NPE comes out of this line 
> in getDeletableFiles():
> {code}
>   return cache.getUnreferencedFiles(files, master.getSnapshotManager());
> {code}
> since master is null.
> We should either check for the null master or pass the master instance properly
> when constructing the cleaner instance.





[jira] [Commented] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665324#comment-16665324
 ] 

stack commented on HBASE-21395:
---

Thanks [~allan163]. Let me see how hadoopqa does on it. It is a simple check
before the run of merge/split. It would be good to have, I think (ITBLL is
critical for me when testing candidates).

> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21395.branch-2.0.001.patch
>
>
> In my ITBLL runs, I often see that when a split/merge procedure and a table
> procedure (like ModifyTableProcedure) happen at the same time, race conditions
> between the two kinds of procedures cause serious problems, e.g. the
> split/merged parent is brought online by the table procedure, or the
> split/merged region makes the whole table procedure roll back.
> I talked with [~Apache9] offline today; this kind of problem was solved in
> branch-2+ since there is a fence so that only one RTSP can run against a
> single region at a time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety
> fence in the split/merge procedure: if there is a table procedure going on
> against the same table, then abort the split/merge procedure. Aborting the
> split/merge procedure at the beginning of the execution is no big deal
> compared with the mess it would otherwise cause...
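
The fence described in the quoted text can be sketched as follows. This is a minimal illustration under stated assumptions: `SplitMergeFenceSketch` and its method names are hypothetical, tables are identified by plain strings, and the real patch presumably consults the master's procedure scheduler rather than a separate set.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of "abort split/merge if a table procedure is running".
class SplitMergeFenceSketch {
  // Tables that currently have a table-level procedure in progress.
  private final Set<String> tablesWithRunningProcedure = new HashSet<>();

  synchronized void tableProcedureStarted(String table) {
    tablesWithRunningProcedure.add(table);
  }

  synchronized void tableProcedureFinished(String table) {
    tablesWithRunningProcedure.remove(table);
  }

  // Called at the very beginning of a split/merge: returns false when the
  // procedure should abort because a table procedure holds the same table.
  synchronized boolean maySplitOrMerge(String table) {
    return !tablesWithRunningProcedure.contains(table);
  }
}
```

The sketch only covers the check at the start of a split/merge. It does not model a table procedure that begins after the check has passed, which is the case the branch-2+ per-region fence mentioned above is said to handle.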





[jira] [Updated] (HBASE-21378) [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21378:
--
Summary: [hbck2] checkHBCKSupport blocks assigning hbase:meta or 
hbase:namespace when master is not initialized  (was: checkHBCKSupport blocks 
assigning hbase:meta or hbase:namespace when master is not initialized)

> [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when 
> master is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools, hbck2
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: 
> 0001-HBASE-21378-checkHBCKSupport-blocks-assigning-hbase-.patch
>
>
> I encountered a scenario where hbase:namespace is not online.
> {code}
> 2018-10-24,14:38:16,910 WARN org.apache.hadoop.hbase.master.HMaster: 
> hbase:namespace,,1529933109115.7e0801c8232b2dc15face54532056076. is NOT 
> online; state={7e0801c8232b2dc15face54532056076 state=OPEN, ts=1540363033384, 
> server=c4-hadoop-tst-st30.bj,29100,1540348649479}; 
> ServerCrashProcedures=false. Master startup cannot progress, in 
> holding-pattern until region onlined.
> {code}
> Then I tried to assign it manually, but it threw a PleaseHoldException.
> {code}
> Wed Oct 24 15:26:52 CST 2018, 
> RpcRetryingCaller{globalStartTime=1540365754487, pause=200, maxAttempts=16}, 
> org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:144)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3133)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3125)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getClusterMetrics(HBaseAdmin.java:2161)
>   at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:98)
>   at org.apache.hbase.HBCK2.run(HBCK2.java:364)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hbase.HBCK2.main(HBCK2.java:447)
> Caused by: org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:361)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:349)
>   at 
> org.apache.hadoop.hbase.client.MasterCallable.call(MasterCallable.java:101)
>   at 
> 

[jira] [Updated] (HBASE-21378) [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21378:
--
Component/s: hbck2

> [hbck2] checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when 
> master is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools, hbck2
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: 
> 0001-HBASE-21378-checkHBCKSupport-blocks-assigning-hbase-.patch
>
>
> I encountered a scenario where hbase:namespace is not online.
> {code}
> 2018-10-24,14:38:16,910 WARN org.apache.hadoop.hbase.master.HMaster: 
> hbase:namespace,,1529933109115.7e0801c8232b2dc15face54532056076. is NOT 
> online; state={7e0801c8232b2dc15face54532056076 state=OPEN, ts=1540363033384, 
> server=c4-hadoop-tst-st30.bj,29100,1540348649479}; 
> ServerCrashProcedures=false. Master startup cannot progress, in 
> holding-pattern until region onlined.
> {code}
> Then I tried to assign it manually, but it threw a PleaseHoldException.
> {code}
> Wed Oct 24 15:26:52 CST 2018, 
> RpcRetryingCaller{globalStartTime=1540365754487, pause=200, maxAttempts=16}, 
> org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:144)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3133)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3125)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getClusterMetrics(HBaseAdmin.java:2161)
>   at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:98)
>   at org.apache.hbase.HBCK2.run(HBCK2.java:364)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hbase.HBCK2.main(HBCK2.java:447)
> Caused by: org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:361)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:349)
>   at 
> org.apache.hadoop.hbase.client.MasterCallable.call(MasterCallable.java:101)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
> {code}
> Then I checked the code and found it is because of checkHBCKSupport(); I assigned
> hbase:namespace successfully by skipping this check. Thus I think the tool

[jira] [Commented] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665313#comment-16665313
 ] 

Allan Yang commented on HBASE-21395:


[~stack], FYI, this can go to 2.1.2. If users don't modify tables as frequently
as ITBLL does, the chance of hitting the race condition is very small.

> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21395.branch-2.0.001.patch
>
>
> In my ITBLL runs, I often see that when a split/merge procedure and a table
> procedure (like ModifyTableProcedure) happen at the same time, race conditions
> between the two kinds of procedures cause serious problems, e.g. the
> split/merged parent is brought online by the table procedure, or the
> split/merged region makes the whole table procedure roll back.
> I talked with [~Apache9] offline today; this kind of problem was solved in
> branch-2+ since there is a fence so that only one RTSP can run against a
> single region at a time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety
> fence in the split/merge procedure: if there is a table procedure going on
> against the same table, then abort the split/merge procedure. Aborting the
> split/merge procedure at the beginning of the execution is no big deal
> compared with the mess it would otherwise cause...





[jira] [Commented] (HBASE-21378) checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665314#comment-16665314
 ] 

stack commented on HBASE-21378:
---

I think --skip may be useful. Maybe work on this as a low-priority item?
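
A skip option of the kind mentioned here could be wired in roughly as below. This is a hedged sketch, not the HBCK2 code: `SkipCheckSketch`, `mayRun`, and the `Supplier`-based check are hypothetical, and the only thing taken from the thread is the idea of a `--skip` flag that bypasses the support check while the master is still initializing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of bypassing a pre-flight check with a --skip flag.
class SkipCheckSketch {
  // Returns true if the requested command may run.
  // supportCheck stands in for a version/support check against the master.
  static boolean mayRun(String[] args, Supplier<Boolean> supportCheck) {
    List<String> argList = Arrays.asList(args);
    if (argList.contains("--skip")) {
      // The operator explicitly asked to bypass the check.
      return true;
    }
    try {
      return supportCheck.get();
    } catch (RuntimeException e) {
      // For example, the master rejected the call because it is initializing.
      return false;
    }
  }
}
```

Skipping the check trades safety for availability: the operator takes responsibility for running against a master whose support for the command was not verified.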

> checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master 
> is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: 
> 0001-HBASE-21378-checkHBCKSupport-blocks-assigning-hbase-.patch
>
>
> I encountered a scenario where hbase:namespace is not online.
> {code}
> 2018-10-24,14:38:16,910 WARN org.apache.hadoop.hbase.master.HMaster: 
> hbase:namespace,,1529933109115.7e0801c8232b2dc15face54532056076. is NOT 
> online; state={7e0801c8232b2dc15face54532056076 state=OPEN, ts=1540363033384, 
> server=c4-hadoop-tst-st30.bj,29100,1540348649479}; 
> ServerCrashProcedures=false. Master startup cannot progress, in 
> holding-pattern until region onlined.
> {code}
> Then I tried to assign it manually, but it threw a PleaseHoldException.
> {code}
> Wed Oct 24 15:26:52 CST 2018, 
> RpcRetryingCaller{globalStartTime=1540365754487, pause=200, maxAttempts=16}, 
> org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:144)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3133)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3125)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getClusterMetrics(HBaseAdmin.java:2161)
>   at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:98)
>   at org.apache.hbase.HBCK2.run(HBCK2.java:364)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hbase.HBCK2.main(HBCK2.java:447)
> Caused by: org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:361)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:349)
>   at 
> org.apache.hadoop.hbase.client.MasterCallable.call(MasterCallable.java:101)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
> {code}
> Then I check the code and found it is because of checkHBCKSupport(), I assign 
> 

[jira] [Updated] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21395:
---
Attachment: HBASE-21395.branch-2.0.001.patch

> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21395.branch-2.0.001.patch
>
>
> In my ITBLL runs, I often see that when a split/merge procedure and a table
> procedure (like ModifyTableProcedure) happen at the same time, race conditions
> between the two kinds of procedures cause serious problems, e.g. the
> split/merged parent is brought online by the table procedure, or the
> split/merged region makes the whole table procedure roll back.
> I talked with [~Apache9] offline today; this kind of problem was solved in
> branch-2+ since there is a fence so that only one RTSP can run against a
> single region at a time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety
> fence in the split/merge procedure: if there is a table procedure going on
> against the same table, then abort the split/merge procedure. Aborting the
> split/merge procedure at the beginning of the execution is no big deal
> compared with the mess it would otherwise cause...





[jira] [Updated] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21395:
---
Status: Patch Available  (was: Open)

> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.2, 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21395.branch-2.0.001.patch
>
>
> In my ITBLL runs, I often see that when a split/merge procedure and a table
> procedure (like ModifyTableProcedure) happen at the same time, race conditions
> between the two kinds of procedures cause serious problems, e.g. the
> split/merged parent is brought online by the table procedure, or the
> split/merged region makes the whole table procedure roll back.
> I talked with [~Apache9] offline today; this kind of problem was solved in
> branch-2+ since there is a fence so that only one RTSP can run against a
> single region at a time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety
> fence in the split/merge procedure: if there is a table procedure going on
> against the same table, then abort the split/merge procedure. Aborting the
> split/merge procedure at the beginning of the execution is no big deal
> compared with the mess it would otherwise cause...





[jira] [Created] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-26 Thread Allan Yang (JIRA)
Allan Yang created HBASE-21395:
--

 Summary: Abort split/merge procedure if there is a table procedure 
of the same table going on
 Key: HBASE-21395
 URL: https://issues.apache.org/jira/browse/HBASE-21395
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.0.2, 2.1.0
Reporter: Allan Yang
Assignee: Allan Yang


In my ITBLL runs, I often see that when a split/merge procedure and a table
procedure (like ModifyTableProcedure) happen at the same time, race conditions
between the two kinds of procedures cause serious problems, e.g. the
split/merged parent is brought online by the table procedure, or the
split/merged region makes the whole table procedure roll back.
I talked with [~Apache9] offline today; this kind of problem was solved in
branch-2+ since there is a fence so that only one RTSP can run against a
single region at a time.
To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety
fence in the split/merge procedure: if there is a table procedure going on
against the same table, then abort the split/merge procedure. Aborting the
split/merge procedure at the beginning of the execution is no big deal
compared with the mess it would otherwise cause...






[jira] [Commented] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665301#comment-16665301
 ] 

Hadoop QA commented on HBASE-21237:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  0m  0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 18s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 18s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 14s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 43s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  0s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 31s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 47s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 47s{color} | {color:red} hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 total (was 188) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 39s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  8m 42s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}185m  0s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  4s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}225m 47s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncTableGetMultiThreaded |
|   | hadoop.hbase.master.assignment.TestAssignmentManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21237 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945727/HBASE-21237-branch-2.1.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 063c3bab471c 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.1 / 127de9e637 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| javac | 

[jira] [Updated] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-26 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21322:
-
Attachment: HBASE-21322.master.005.patch

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.master.001.patch, 
> HBASE-21322.master.002.patch, HBASE-21322.master.003.patch, 
> HBASE-21322.master.004.patch, HBASE-21322.master.005.patch, Screenshot from 
> 2018-10-17 13-35-58.png, Screenshot from 2018-10-17 13-38-41.png, Screenshot 
> from 2018-10-17 13-47-06.png
>
>
> According to my test, if one RS goes down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the master 
> does not help. Thus we need to schedule a ServerCrashProcedure manually 
> to solve the problem. I plan to add a scheduleServerCrashProcedure() API to 
> HbckService, then add this API to HBCK2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20671) Merged region brought back to life causing RS to be killed by Master

2018-10-26 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665266#comment-16665266
 ] 

Josh Elser commented on HBASE-20671:


Sweet! I'll have to keep an eye out for it too with that info. Thanks Allan! 
Super helpful.

> Merged region brought back to life causing RS to be killed by Master
> 
>
> Key: HBASE-20671
> URL: https://issues.apache.org/jira/browse/HBASE-20671
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Affects Versions: 2.0.0
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Attachments: 0001-Test-for-HBASE-20671.patch, 
> hbase-hbase-master-ctr-e138-1518143905142-336066-01-03.hwx.site.log.zip, 
> hbase-hbase-regionserver-ctr-e138-1518143905142-336066-01-02.hwx.site.log.zip,
>  workaround.txt
>
>
> Another bug coming out of a master restart and replay of the pv2 logs.
> The master merged two regions into one successfully, was restarted, but then 
> ended up assigning the children region back out to the cluster. There is a 
> log message which appears to indicate that RegionStates acknowledges that it 
> doesn't know what this region is as it's replaying the pv2 WAL; however, it 
> incorrectly assumes that the region is just OFFLINE and needs to be assigned.
> {noformat}
> 2018-05-30 04:26:00,055 INFO  
> [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=2] master.HMaster: 
> Client=hrt_qa//172.27.85.11 Merge regions a7dd6606dcacc9daf085fc9fa2aecc0c 
> and 4017a3c778551d4d258c785d455f9c0b
> 2018-05-30 04:28:27,525 DEBUG 
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> procedure2.ProcedureExecutor: Completed pid=4368, state=SUCCESS; 
> MergeTableRegionsProcedure table=tabletwo_merge, 
> regions=[a7dd6606dcacc9daf085fc9fa2aecc0c, 4017a3c778551d4d258c785d455f9c0b], 
> forcibly=false
> {noformat}
> {noformat}
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: a7dd6606dcacc9daf085fc9fa2aecc0c 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,263 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=a7dd6606dcacc9daf085fc9fa2aecc0c
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.AssignmentManager: 4017a3c778551d4d258c785d455f9c0b 
> regionState=null; presuming OFFLINE
> 2018-05-30 04:29:20,266 INFO  
> [master/ctr-e138-1518143905142-336066-01-03:2] 
> assignment.RegionStates: Added to offline, CURRENTLY NEVER CLEARED!!! 
> rit=OFFLINE, location=null, table=tabletwo_merge, 
> region=4017a3c778551d4d258c785d455f9c0b
> {noformat}
> Eventually, the RS reports in its online regions, and the master tells it to 
> kill itself:
> {noformat}
> 2018-05-30 04:29:24,272 WARN  
> [RpcServer.default.FPBQ.Fifo.handler=26,queue=2,port=2] 
> assignment.AssignmentManager: Killing 
> ctr-e138-1518143905142-336066-01-02.hwx.site,16020,1527654546619: Not 
> online: tabletwo_merge,,1527652130538.a7dd6606dcacc9daf085fc9fa2aecc0c.
> {noformat}





[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665255#comment-16665255
 ] 

Hadoop QA commented on HBASE-21322:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
19s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
25s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
12m 18s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
38s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
23s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}141m 
26s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 7s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}206m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21322 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945724/HBASE-21322.master.004.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  cc  hbaseprotoc  |
| 

[jira] [Commented] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS

2018-10-26 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665223#comment-16665223
 ] 

Allan Yang commented on HBASE-21237:


+1 for it. I think branch-2.0 also needs it; [~stack] mentioned that assigning 
regions was too slow.

> Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS
> --
>
> Key: HBASE-21237
> URL: https://issues.apache.org/jira/browse/HBASE-21237
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Blocker
> Fix For: 2.1.1, 2.0.3
>
> Attachments: HBASE-21237-branch-2.1.patch, 
> HBASE-21237.branch-2.0.001.patch
>
>
> As discussed in HBASE-21217, in branch-2.0 and branch-2.1 we should use 
> CompatRemoteProcedureResolver instead of ExecuteProceduresRemoteCall to 
> dispatch region open/close requests to the RS, because 
> ExecuteProceduresRemoteCall groups all the open/close operations in one call 
> and executes them sequentially on the target RS. If one operation fails, all 
> the operations are marked as failed, even though some of them (like open 
> region) are already executing in the open region handler thread. The master 
> thinks these operations failed and reassigns the regions to another RS. When 
> the previous RS then reports to the master that the region is online, the 
> master kills that RS since it has already assigned the region to another RS.
> For branch-2.2+, HBASE-21217 will fix this issue.
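
The failure mode described above can be illustrated with a small, self-contained sketch. It is not HBase code: `RegionOp`, `dispatchBatched` and `dispatchIndividually` are hypothetical names chosen for this example. It only models why grouping operations in one call lets a single failure mark every operation as failed, while dispatching each operation on its own reports each result separately.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DispatchSketch {
  /** Hypothetical stand-in for one open/close region request. */
  public static final class RegionOp {
    final String region;
    final boolean fails;

    public RegionOp(String region, boolean fails) {
      this.region = region;
      this.fails = fails;
    }

    void run() {
      if (fails) {
        throw new IllegalStateException("operation failed for " + region);
      }
    }
  }

  /** Batched model: one call runs all ops in order; any failure fails the whole call. */
  public static Map<String, Boolean> dispatchBatched(List<RegionOp> ops) {
    boolean callSucceeded = true;
    try {
      for (RegionOp op : ops) {
        op.run();
      }
    } catch (RuntimeException e) {
      callSucceeded = false;
    }
    // The caller only sees the outcome of the call, so every op shares it.
    Map<String, Boolean> result = new LinkedHashMap<>();
    for (RegionOp op : ops) {
      result.put(op.region, callSucceeded);
    }
    return result;
  }

  /** Per-operation model: each op is dispatched and reported on its own. */
  public static Map<String, Boolean> dispatchIndividually(List<RegionOp> ops) {
    Map<String, Boolean> result = new LinkedHashMap<>();
    for (RegionOp op : ops) {
      try {
        op.run();
        result.put(op.region, true);
      } catch (RuntimeException e) {
        result.put(op.region, false);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<RegionOp> ops = new ArrayList<>();
    ops.add(new RegionOp("r1", false));
    ops.add(new RegionOp("r2", true));
    ops.add(new RegionOp("r3", false));
    System.out.println("batched:    " + dispatchBatched(ops));
    System.out.println("individual: " + dispatchIndividually(ops));
  }
}
```

In the batched model, `r1` ran successfully but is still reported as failed, which corresponds to the case where the master reassigns a region that the original RS later reports as online.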





[jira] [Commented] (HBASE-21378) checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665221#comment-16665221
 ] 

Jingyun Tian commented on HBASE-21378:
--

Yeah, your issue should fix the problem. My code at that time may not have 
been the latest. So should I still work on this? -skip may be useful in the 
future.

> checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master 
> is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: 
> 0001-HBASE-21378-checkHBCKSupport-blocks-assigning-hbase-.patch
>
>
> I encountered a scenario where hbase:namespace is not online:
> {code}
> 2018-10-24,14:38:16,910 WARN org.apache.hadoop.hbase.master.HMaster: 
> hbase:namespace,,1529933109115.7e0801c8232b2dc15face54532056076. is NOT 
> online; state={7e0801c8232b2dc15face54532056076 state=OPEN, ts=1540363033384, 
> server=c4-hadoop-tst-st30.bj,29100,1540348649479}; 
> ServerCrashProcedures=false. Master startup cannot progress, in 
> holding-pattern until region onlined.
> {code}
> Then I tried to assign it manually, but it threw a PleaseHoldException:
> {code}
> Wed Oct 24 15:26:52 CST 2018, 
> RpcRetryingCaller{globalStartTime=1540365754487, pause=200, maxAttempts=16}, 
> org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:144)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3133)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3125)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.getClusterMetrics(HBaseAdmin.java:2161)
>   at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:98)
>   at org.apache.hbase.HBCK2.run(HBCK2.java:364)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hbase.HBCK2.main(HBCK2.java:447)
> Caused by: org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:361)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.handleRemoteException(ProtobufUtil.java:349)
>   at 
> org.apache.hadoop.hbase.client.MasterCallable.call(MasterCallable.java:101)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
> {code}
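
The stack trace shows the support check failing because the version lookup itself is rejected while the master is initializing. A minimal sketch of how a skip flag could bypass that lookup follows. It is an illustration only, not the HBCK2 implementation: `VersionSource`, `checkSupport`, `unsupportedPrefix` and `skipCheck` are hypothetical names, and the version rule is a placeholder.

```java
import java.io.IOException;

public class HbckSupportSketch {
  /** Hypothetical stand-in for the RPC that reports the master version. */
  public interface VersionSource {
    String getVersion() throws IOException;
  }

  /**
   * Returns true if the version check ran and passed, false if it was skipped.
   * Throws IOException if the version lookup fails (for example while the
   * master is still initializing) and UnsupportedOperationException if the
   * reported version starts with the given (placeholder) unsupported prefix.
   */
  public static boolean checkSupport(VersionSource source, String unsupportedPrefix,
      boolean skipCheck) throws IOException {
    if (skipCheck) {
      // The operator explicitly asked to bypass the check; no RPC is made.
      return false;
    }
    String version = source.getVersion();
    if (version.startsWith(unsupportedPrefix)) {
      throw new UnsupportedOperationException("unsupported version " + version);
    }
    return true;
  }

  public static void main(String[] args) throws IOException {
    // Models a master that rejects the lookup because it is not initialized.
    VersionSource initializing = () -> {
      throw new IOException("Master is initializing");
    };
    try {
      checkSupport(initializing, "0.", false);
    } catch (IOException e) {
      System.out.println("check failed: " + e.getMessage());
    }
    System.out.println("check ran: " + checkSupport(initializing, "0.", true));
  }
}
```

Skipping the check trades safety for availability, so in a sketch like this the flag defaults to off and has to be passed explicitly.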

[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-26 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665209#comment-16665209
 ] 

Jingyun Tian commented on HBASE-21322:
--

[~stack] OK. I was not sure whether I should use ServerName, because I think 
a String makes the API look simpler. Let me modify the code.
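
To make the String versus typed-name trade-off concrete, here is a small sketch of parsing a `host,port,startcode` string (the form seen in master logs, for example `c4-hadoop-tst-st30.bj,29100,1540348649479`) into a typed value. `ParsedServerName` and `parse` are hypothetical names for this example and are not the HBase `ServerName` class; the point is only that a typed parameter rejects malformed input at the API boundary.

```java
public class ServerNameSketch {
  /** Hypothetical typed holder for the three parts of a server name. */
  public static final class ParsedServerName {
    public final String host;
    public final int port;
    public final long startCode;

    ParsedServerName(String host, int port, long startCode) {
      this.host = host;
      this.port = port;
      this.startCode = startCode;
    }

    @Override
    public String toString() {
      return host + "," + port + "," + startCode;
    }
  }

  /** Parses "host,port,startcode"; throws IllegalArgumentException on bad input. */
  public static ParsedServerName parse(String s) {
    String[] parts = s.split(",");
    if (parts.length != 3 || parts[0].isEmpty()) {
      throw new IllegalArgumentException("expected host,port,startcode but got: " + s);
    }
    // NumberFormatException is a subclass of IllegalArgumentException.
    int port = Integer.parseInt(parts[1]);
    long startCode = Long.parseLong(parts[2]);
    return new ParsedServerName(parts[0], port, startCode);
  }

  public static void main(String[] args) {
    ParsedServerName sn = parse("c4-hadoop-tst-st30.bj,29100,1540348649479");
    System.out.println(sn.host + " " + sn.port + " " + sn.startCode);
  }
}
```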

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.master.001.patch, 
> HBASE-21322.master.002.patch, HBASE-21322.master.003.patch, 
> HBASE-21322.master.004.patch, Screenshot from 2018-10-17 13-35-58.png, 
> Screenshot from 2018-10-17 13-38-41.png, Screenshot from 2018-10-17 
> 13-47-06.png
>
>
> According to my test, if one RS goes down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the master 
> does not help. Thus we need to schedule a ServerCrashProcedure manually 
> to solve the problem. I plan to add a scheduleServerCrashProcedure() API to 
> HbckService, then add this API to HBCK2.





[jira] [Commented] (HBASE-21237) Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS

2018-10-26 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665202#comment-16665202
 ] 

Duo Zhang commented on HBASE-21237:
---

Ping [~stack] and [~allan163].

> Use CompatRemoteProcedureResolver to dispatch open/close region requests to RS
> --
>
> Key: HBASE-21237
> URL: https://issues.apache.org/jira/browse/HBASE-21237
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Blocker
> Fix For: 2.1.1, 2.0.3
>
> Attachments: HBASE-21237-branch-2.1.patch, 
> HBASE-21237.branch-2.0.001.patch
>
>
> As discussed in HBASE-21217, in branch-2.0 and branch-2.1 we should use 
> CompatRemoteProcedureResolver instead of ExecuteProceduresRemoteCall to 
> dispatch region open/close requests to the RS, because 
> ExecuteProceduresRemoteCall groups all the open/close operations in one call 
> and executes them sequentially on the target RS. If one operation fails, all 
> the operations are marked as failed, even though some of them (like open 
> region) are already executing in the open region handler thread. The master 
> thinks these operations failed and reassigns the regions to another RS. When 
> the previous RS then reports to the master that the region is online, the 
> master kills that RS since it has already assigned the region to another RS.
> For branch-2.2+, HBASE-21217 will fix this issue.





[jira] [Updated] (HBASE-21391) RefreshPeerProcedure should also wait master initialized before executing

2018-10-26 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21391:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to branch-2.1+.

> RefreshPeerProcedure should also wait master initialized before executing
> -
>
> Key: HBASE-21391
> URL: https://issues.apache.org/jira/browse/HBASE-21391
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-21391.patch
>
>
> Missed this one when introducing the waitInitialized method in Procedure, and 
> found it when implementing HBASE-21389.





[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665198#comment-16665198
 ] 

Duo Zhang commented on HBASE-20973:
---

Yes, you can try it: without the modification in ProcedureStoreTracker it will 
fail with a wait timeout. And I've already reverted the previous patch on all 
branches.

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): 
> CODE-BUG: Uncaught runtime exception for pid=5973, 
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, 
> exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime 
> exception: pid=5974, ppid=5973, 
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTableRegionsProcedure 
> table=IntegrationTestBigLinkedList:java.lang.NullPointerException; 
> ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled 
> state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] 
> procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, and all the 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could proceed again. It is quite embarrassing that a bug saved 
> us... (this bug will be fixed in HBASE-20846)
> I tried to reproduce this one using the test case in HBASE-20921 but I just 
> can't reproduce it.
> An easy way to resolve this is to add a try/catch, making sure no 

[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665193#comment-16665193
 ] 

stack commented on HBASE-20973:
---

Looks good to me. +1. Nice test. It fails w/o the patch? Will wait on hadoopqa 
and then apply. It does not do the revert. I'll do that before applying this. 
[~Apache9].

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>
> Found this one while investigating HBASE-20921. After the root 
> procedure (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): 
> CODE-BUG: Uncaught runtime exception for pid=5973, 
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, 
> exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime 
> exception: pid=5974, ppid=5973, 
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTableRegionsProcedure 
> table=IntegrationTestBigLinkedList:java.lang.NullPointerException; 
> ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled 
> state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] 
> procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, and all the 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could proceed again. It is quite embarrassing that a bug saved 
> us... (this bug will be fixed in HBASE-20846)
> I tried to reproduce this one using the test case in HBASE-20921 but I just 
> can't reproduce it.
> An easy way to resolve this is to add a try/catch, making sure no matter 

[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665190#comment-16665190
 ] 

Allan Yang commented on HBASE-20973:


I think it's OK

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>
> Found this one while investigating HBASE-20921. After the root procedure 
> (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): 
> CODE-BUG: Uncaught runtime exception for pid=5973, 
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, 
> exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime 
> exception: pid=5974, ppid=5973, 
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTableRegionsProcedure 
> table=IntegrationTestBigLinkedList:java.lang.NullPointerException; 
> ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled state=MODIFY_TABLE_REOPEN_ALL_REGIONS
>   at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
>   at org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
>   at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
>   at org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
>   at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
>   at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
>   at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
>   at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
>   at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
>   at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
>   at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
>   at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, and all 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could then proceed. It is quite embarrassing that a bug saved us 
> (that bug will be fixed in HBASE-20846).
> I tried to reproduce this using the test case in HBASE-20921, but could not.
> An easy way to resolve this is to add a try/catch, making sure that no matter 
> what happens, the table's exclusive lock is always released.
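
The try/catch idea in the description above can be sketched roughly as
follows. This is only an illustration under assumptions, not the HBase code
and not the fix that was committed: the class name, the tableLock field and
the rollbackWithGuaranteedRelease helper are all hypothetical, and a plain
ReentrantLock stands in for the table's exclusive lock.

```java
import java.util.concurrent.locks.ReentrantLock;

public class RollbackLockSketch {
    // Hypothetical stand-in for the table's exclusive lock (not HBase's lock type).
    static final ReentrantLock tableLock = new ReentrantLock();

    // Runs a rollback step under the lock and releases the lock in finally,
    // so a runtime exception in the step cannot leave the lock held.
    static boolean rollbackWithGuaranteedRelease(Runnable rollbackStep) {
        tableLock.lock();
        try {
            rollbackStep.run();
            return true;
        } catch (RuntimeException e) {
            // For example, the ArrayIndexOutOfBoundsException seen in the log above.
            System.err.println("rollback step failed: " + e);
            return false;
        } finally {
            tableLock.unlock();
        }
    }

    public static void main(String[] args) {
        boolean ok = rollbackWithGuaranteedRelease(() -> {
            throw new ArrayIndexOutOfBoundsException(1);
        });
        System.out.println("rollback ok=" + ok + ", lock held=" + tableLock.isLocked());
    }
}
```

The point of the finally block is that the release no longer depends on the
rollback path completing normally, which is the property the description asks
for.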



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665188#comment-16665188
 ] 

Duo Zhang commented on HBASE-20973:
---

Add a warn log if the node does not exist. [~stack] [~allan163] What do you 
guys think of this fix? Thanks.
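
For readers without the patch at hand, the general shape of "warn instead of
throw when the node does not exist" might look like the sketch below. It is a
simplified model and an assumption on my part, not the attached patch: the
class TrackerDeleteSketch and its Set-based bookkeeping are hypothetical and
much simpler than the real ProcedureStoreTracker.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Logger;

public class TrackerDeleteSketch {
    private static final Logger LOG = Logger.getLogger("TrackerDeleteSketch");

    // Hypothetical simplification: the procedure ids this tracker knows about.
    private final Set<Long> tracked = new HashSet<>();

    void insert(long procId) {
        tracked.add(procId);
    }

    // Deletes a procedure id. If the id is unknown, logs a warning and
    // returns false instead of failing with an exception.
    boolean delete(long procId) {
        if (!tracked.remove(procId)) {
            LOG.warning("No tracked entry for pid=" + procId + ", ignoring delete");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        TrackerDeleteSketch tracker = new TrackerDeleteSketch();
        tracker.insert(5973L);
        System.out.println(tracker.delete(5973L)); // known id
        System.out.println(tracker.delete(5974L)); // unknown id, only warns
    }
}
```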

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>

[jira] [Updated] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20973:
--
Assignee: Duo Zhang  (was: Allan Yang)
  Status: Patch Available  (was: Reopened)

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.0.1, 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>

[jira] [Updated] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20973:
--
Attachment: HBASE-20973.patch

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>


[jira] [Updated] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-20973:
--
Attachment: (was: HBASE-20973-UT.patch)

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch, HBASE-20973.patch
>
>


[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665183#comment-16665183
 ] 

stack commented on HBASE-21322:
---

Oh, hang on. Why are you using String for ServerName? See HBase.proto, where 
we have a proto definition for ServerName. Use that?
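
To illustrate why a typed server name is easier to work with than a raw
String, here is a small sketch. It is an assumption for illustration only:
ServerId is a hypothetical value class, not HBase's ServerName and not the
generated protobuf message, and the "host,port,startcode" layout is taken
from server strings that appear in master logs.

```java
public class ServerIdSketch {
    // Hypothetical typed server identity with the three parts kept separate.
    static final class ServerId {
        final String host;
        final int port;
        final long startCode;

        ServerId(String host, int port, long startCode) {
            this.host = host;
            this.port = port;
            this.startCode = startCode;
        }

        // Parses "host,port,startcode" once, rejecting malformed input early.
        static ServerId parse(String s) {
            String[] parts = s.split(",");
            if (parts.length != 3) {
                throw new IllegalArgumentException("expected host,port,startcode: " + s);
            }
            return new ServerId(parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
        }
    }

    public static void main(String[] args) {
        ServerId id = ServerId.parse("c4-hadoop-tst-st30.bj,29100,1540348649479");
        System.out.println(id.host + " port=" + id.port + " startCode=" + id.startCode);
    }
}
```

With a structured type, a malformed value fails where it is parsed rather than
deep inside whatever code finally tries to interpret the string.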

> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.master.001.patch, 
> HBASE-21322.master.002.patch, HBASE-21322.master.003.patch, 
> HBASE-21322.master.004.patch, Screenshot from 2018-10-17 13-35-58.png, 
> Screenshot from 2018-10-17 13-38-41.png, Screenshot from 2018-10-17 
> 13-47-06.png
>
>
> According to my test, if one RS is down and all procedure logs are then 
> deleted, no ServerCrashProcedure is scheduled, and restarting the master does 
> not help. We therefore need to schedule a ServerCrashProcedure manually to 
> solve the problem. I plan to add a scheduleServerCrashProcedure() API to 
> HbckService, then add this API to HBCK2.





[jira] [Commented] (HBASE-21322) Add a scheduleServerCrashProcedure() API to HbckService

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665178#comment-16665178
 ] 

stack commented on HBASE-21322:
---

bq. Seems not related?

You are right.



> Add a scheduleServerCrashProcedure() API to HbckService
> ---
>
> Key: HBASE-21322
> URL: https://issues.apache.org/jira/browse/HBASE-21322
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21322.master.001.patch, 
> HBASE-21322.master.002.patch, HBASE-21322.master.003.patch, 
> HBASE-21322.master.004.patch, Screenshot from 2018-10-17 13-35-58.png, 
> Screenshot from 2018-10-17 13-38-41.png, Screenshot from 2018-10-17 
> 13-47-06.png
>
>


[jira] [Commented] (HBASE-21378) checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master is not initialized

2018-10-26 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665176#comment-16665176
 ] 

stack commented on HBASE-21378:
---

[~tianjingyun]

The below was supposed to address this case. Did you have it in the hbase you 
were playing with?

4ad63d77be HBASE-21345 [hbck2] Allow version check to proceed even though 
master is 'initializing'.

That said, this change looks like it could come in handy. I think -s/--skip 
should be a general option rather than a per-command option, since every 
command runs the hbck/version check before it executes.

Thanks for working on this.
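
A rough sketch of what "general option rather than per-command option" could
mean in practice is below. It is a hypothetical illustration, not HBCK2's
actual argument handling: the class, the run method and its return strings
are made up, and only the -s/--skip flag names come from the discussion above.

```java
public class GlobalSkipOptionSketch {
    // Parses leading global options, then dispatches on the command name.
    // The skip flag is handled once here, so no command needs its own copy.
    static String run(String[] args) {
        boolean skip = false;
        int i = 0;
        while (i < args.length && args[i].startsWith("-")) {
            if (args[i].equals("-s") || args[i].equals("--skip")) {
                skip = true;
            } else {
                throw new IllegalArgumentException("unknown option: " + args[i]);
            }
            i++;
        }
        if (i == args.length) {
            throw new IllegalArgumentException("missing command");
        }
        String command = args[i];
        // Stand-in for the shared version check that every command goes through.
        String check = skip ? "check skipped" : "check ran";
        return command + ": " + check;
    }

    public static void main(String[] args) {
        System.out.println(run(new String[] {"--skip", "assigns", "someRegion"}));
        System.out.println(run(new String[] {"assigns", "someRegion"}));
    }
}
```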

> checkHBCKSupport blocks assigning hbase:meta or hbase:namespace when master 
> is not initialized
> --
>
> Key: HBASE-21378
> URL: https://issues.apache.org/jira/browse/HBASE-21378
> Project: HBase
>  Issue Type: Bug
>  Components: hbase-operator-tools
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: 
> 0001-HBASE-21378-checkHBCKSupport-blocks-assigning-hbase-.patch
>
>
> I encountered a scenario in which hbase:namespace is not online:
> {code}
> 2018-10-24,14:38:16,910 WARN org.apache.hadoop.hbase.master.HMaster: 
> hbase:namespace,,1529933109115.7e0801c8232b2dc15face54532056076. is NOT 
> online; state={7e0801c8232b2dc15face54532056076 state=OPEN, ts=1540363033384, 
> server=c4-hadoop-tst-st30.bj,29100,1540348649479}; 
> ServerCrashProcedures=false. Master startup cannot progress, in 
> holding-pattern until region onlined.
> {code}
> Then I tried to assign it manually, but it threw a PleaseHoldException:
> {code}
> Wed Oct 24 15:26:52 CST 2018, 
> RpcRetryingCaller{globalStartTime=1540365754487, pause=200, maxAttempts=16}, 
> org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:144)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3133)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3125)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterMetrics(HBaseAdmin.java:2161)
>   at org.apache.hbase.HBCK2.checkHBCKSupport(HBCK2.java:98)
>   at org.apache.hbase.HBCK2.run(HBCK2.java:364)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>   at org.apache.hbase.HBCK2.main(HBCK2.java:447)
> Caused by: org.apache.hadoop.hbase.PleaseHoldException: 
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3064)
>   at org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:934)
>   at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:361)
>   at 
> 

[jira] [Commented] (HBASE-20973) ArrayIndexOutOfBoundsException when rolling back procedure

2018-10-26 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665173#comment-16665173
 ] 

Duo Zhang commented on HBASE-20973:
---

Yes, I think we can relax the assertion for now. The rollback processing still 
needs a bit of polishing; we can do that later. And [~stack] yes, let's hold 
up a bit and I will prepare a patch soon. If it is OK, you can just commit it, 
as we will probably be asleep by then...

> ArrayIndexOutOfBoundsException when rolling back procedure
> --
>
> Key: HBASE-20973
> URL: https://issues.apache.org/jira/browse/HBASE-20973
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-20973-UT.patch, HBASE-20973.branch-2.0.001.patch, 
> HBASE-20973.branch-2.0.002.patch
>
>
> Found this one while investigating HBASE-20921. After the root procedure 
> (ModifyTableProcedure in this case) rolled back, an 
> ArrayIndexOutOfBoundsException was thrown:
> {code}
> 2018-07-18 01:39:10,241 ERROR [PEWorker-8] procedure2.ProcedureExecutor(159): 
> CODE-BUG: Uncaught runtime exception for pid=5973, 
> state=FAILED:MODIFY_TABLE_REOPEN_ALL_REGIONS, 
> exception=java.lang.NullPointerException via CODE-BUG: Uncaught runtime 
> exception: pid=5974, ppid=5973, 
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED; 
> ReopenTableRegionsProcedure 
> table=IntegrationTestBigLinkedList:java.lang.NullPointerException; 
> ModifyTableProcedure table=IntegrationTestBigLinkedList
> java.lang.UnsupportedOperationException: unhandled 
> state=MODIFY_TABLE_REOPEN_ALL_REGIONS
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:147)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.rollbackState(ModifyTableProcedure.java:50)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.rollback(StateMachineProcedure.java:203)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doRollback(Procedure.java:864)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1353)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> 2018-07-18 01:39:10,243 WARN  [PEWorker-8] 
> procedure2.ProcedureExecutor(1756): Worker terminating UNNATURALLY null
> java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.updateState(ProcedureStoreTracker.java:405)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker$BitSetNode.delete(ProcedureStoreTracker.java:178)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:513)
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:505)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:741)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:691)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:603)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1387)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeRollback(ProcedureExecutor.java:1309)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1178)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:75)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1741)
> {code}
> This is a very serious condition. After this exception was thrown, the 
> exclusive lock held by ModifyTableProcedure was never released, so all 
> procedures against this table were blocked until the master restarted. Since 
> the lock info for the procedure is not restored on restart, the other 
> procedures could then proceed. It is quite embarrassing that a bug saved 
> us... (this bug will be fixed in HBASE-20846)
> I tried to reproduce this one using the test case 
