[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2024-01-14 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17806618#comment-17806618
 ] 

Hudson commented on HBASE-28031:


Results for branch branch-2.6
[build #31 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/31/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/31/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/31/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/31/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/31/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.8
>
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2024-01-13 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17806416#comment-17806416
 ] 

Bryan Beaudreault commented on HBASE-28031:
---

[~zhangduo] I see this Jira has fixVersion of 2.6.0, but I don't see a related 
commit in the branch. Should we cherry-pick it?

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.8
>
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy42.addBlock(Unknown Source) ~[?:?]
>   at 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-16 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17797489#comment-17797489
 ] 

Hudson commented on HBASE-28031:


Results for branch branch-3
[build #104 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/104/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/104/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/104/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/104/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.8
>
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-15 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17797361#comment-17797361
 ] 

Hudson commented on HBASE-28031:


Results for branch branch-2
[build #944 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/944/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/944/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/944/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/944/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/944/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.8
>
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-15 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17797353#comment-17797353
 ] 

Hudson commented on HBASE-28031:


Results for branch branch-2.4
[build #667 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/667/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/667/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/667/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/667/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/667/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.8
>
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-15 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17797315#comment-17797315
 ] 

Hudson commented on HBASE-28031:


Results for branch branch-2.5
[build #450 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/450/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/450/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/450/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/450/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/450/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.8
>
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-15 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17797300#comment-17797300
 ] 

Hudson commented on HBASE-28031:


Results for branch master
[build #964 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/964/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/964/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/964/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/964/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-1, 2.5.8
>
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-14 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796622#comment-17796622
 ] 

Duo Zhang commented on HBASE-28031:
---

Thanks [~bbeaudreault] for the useful input.

Let me add more logs here.

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy42.addBlock(Unknown Source) ~[?:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-01 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792113#comment-17792113
 ] 

Bryan Beaudreault commented on HBASE-28031:
---

Ok so that one, it looks like the test causing all that spam 
(testUserTableClusterScopeQuota) actually succeeds. The job times out during 
testUserNamespaceClusterScopeQuota. According to the thread dump at the end, 
it's while trying to refresh quota cache:
{code:java}
"Listener at localhost/42451" daemon prio=5 tid=18 runnable
java.lang.Thread.State: RUNNABLE
at javax.security.auth.Subject.getSubject(Subject.java:297)
at 
org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:577)
at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.(User.java:280)
at org.apache.hadoop.hbase.security.User.getCurrent(User.java:160)
at 
org.apache.hadoop.hbase.quotas.ThrottleQuotaTestUtil.triggerCacheRefresh(ThrottleQuotaTestUtil.java:156)
at 
org.apache.hadoop.hbase.quotas.ThrottleQuotaTestUtil.triggerUserCacheRefresh(ThrottleQuotaTestUtil.java:109)
at 
org.apache.hadoop.hbase.quotas.TestClusterScopeQuotaThrottle.testUserNamespaceClusterScopeQuota(TestClusterScopeQuotaThrottle.java:197)
 {code}
Looks like there's a [while loop in 
there|https://github.com/apache/hbase/blob/master/hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/ThrottleQuotaTestUtil.java#L150-L184]
 which doesn't have a timeout. We should probably update that to use Waiter 
with a timeout, so at least the test will fail more usefully.

The while loop triggers the QuotaRefresherChore. According to the thread dump, 
that chore is waiting on a future:
{code:java}
"regionserver/jenkins-hbase19:0.Chore.1" daemon prio=5 tid=366 in Object.wait()
java.lang.Thread.State: WAITING (on object monitor)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
at 
java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at 
java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at org.apache.hadoop.hbase.util.FutureUtils.get(FutureUtils.java:182)
at 
org.apache.hadoop.hbase.client.TableOverAsyncTable.get(TableOverAsyncTable.java:193)
at 
org.apache.hadoop.hbase.quotas.QuotaTableUtil.doGet(QuotaTableUtil.java:910)
at 
org.apache.hadoop.hbase.quotas.QuotaUtil.fetchGlobalQuotas(QuotaUtil.java:375)
at 
org.apache.hadoop.hbase.quotas.QuotaUtil.fetchNamespaceQuotas(QuotaUtil.java:342)
at 
org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore$1.fetchEntries(QuotaCache.java:274)
at 
org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.fetch(QuotaCache.java:365)
at 
org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.fetchNamespaceQuotaState(QuotaCache.java:266)
at 
org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.chore(QuotaCache.java:257)
 {code}
This is just a point-in-time, and I don't see any exceptions. I don't know if 
that Future is actually blocked, or maybe it returns quickly but not what the 
triggerCacheRefresh is looking for. We might need more logging. There's a 
LOG.debug("QuotaCache") dump _after_ the while loop. Maybe we need to move it 
to _inside_ the while loop.

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-01 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792086#comment-17792086
 ] 

Duo Zhang commented on HBASE-28031:
---

https://nightlies.apache.org/hbase/HBase-Flaky-Tests/master/14210/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.quotas.TestClusterScopeQuotaThrottle-output.txt

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy42.addBlock(Unknown Source) ~[?:?]
>   at 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-01 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792083#comment-17792083
 ] 

Bryan Beaudreault commented on HBASE-28031:
---

Is the nightlies browser really slow? I can load the main index at 
[https://nightlies.apache.org/hbase/HBase-Flaky-Tests/] but can't click into 
master, it just hangs forever. Can you link me a more recent failure?

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy42.addBlock(Unknown 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-12-01 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17792074#comment-17792074
 ] 

Duo Zhang commented on HBASE-28031:
---

Because we use a manual EnvironmentEdge for getting timestamp. I have fixed 
this problem in the last merge PR for this issue, where I limited the scope of 
the EnvironmentEdge to only take effect in quotas package.

You can see newer failure logs, which should not have the JVM pauses then.

Thanks.

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>  

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-11-30 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791809#comment-17791809
 ] 

Bryan Beaudreault commented on HBASE-28031:
---

I'm actually not sure that exception is the problem. If you look at the logs, 
we won't honor that 6m backoff:
{code:java}
2023-11-16T04:46:41,033 DEBUG [RPCClient-NioEventLoopGroup-5-2 {}] 
backoff.HBaseServerExceptionPauseManager(61): RpcThrottlingException suggested 
pause of 3600ns which would exceed the timeout. We should throw instead.
org.apache.hadoop.hbase.quotas.RpcThrottlingException: 
org.apache.hadoop.hbase.quotas.RpcThrottlingException: number of read requests 
exceeded - wait 6mins, 0ms {code}
Since the backoff is beyond the configured rpc timeout, it just gets thrown as 
an exception. The doGets method will catch and log the IOException:
{code:java}
2023-11-16T04:46:41,026 ERROR [Listener at localhost/37593 {}] 
quotas.ThrottleQuotaTestUtil(100): get failed after nRetries=10
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
attempts=1, exceptions:
2023-11-16T05:54:26.933Z, 
org.apache.hadoop.hbase.quotas.RpcThrottlingException: 
org.apache.hadoop.hbase.quotas.RpcThrottlingException: number of read requests 
exceeded - wait 6mins, 0ms {code}
Notice attempts=1, so on the first attempt it bails out for the above "We 
should throw instead" reason.

Anyway, the reason we are seeing 6 minutes here is by design for the test:
 * The throttle is set to 20 req/{*}hour{*}
 * ClusterThrottle works by dividing the total throttle by the number of 
regions.
 * A server with 1 region can do 10 req/hour, and they try sending 20 requests
 * So the 10 requests will succeed, but consume the entire throttle. With 10 
req/hour, it takes 60/10 = 6 minutes for another request to be allowed

Why are we seeing so many JVM pauses?
{code:java}
2023-11-16T04:46:41,361 WARN  [M:0;jenkins-hbase19:38849 {}] util.Sleeper(86): 
We slept 208100ms instead of 100ms, this is likely due to a long garbage 
collecting pause and it's usually bad, see 
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired {code}
That's just one of many

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-11-30 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791625#comment-17791625
 ] 

Bryan Beaudreault commented on HBASE-28031:
---

I will take a look. I don't use ClusterQuota, just RegionServerQuota. But the 
way the wait time is calculated is:
 * Let's say 1 byte/sec throttle
 * Get comes in and returns returns 361 bytes
 * That exceeds the quota by 360 bytes
 * So if another Get comes in, it will be asked to wait at least 6 minutes 
because 6 *60 * 1 would get us back to having more quota available

There's a little more to it, different requests when checking quota check for 
different minimum quotas available, so it'll probably be a little more than 6 
minutes in this example. But you get the idea.

Last time I looked at ClusterQuota, it's a simple summing of quota usages 
across all regionservers. Will need to look again.

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-11-29 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791393#comment-17791393
 ] 

Duo Zhang commented on HBASE-28031:
---

{noformat}
2023-11-29T08:45:46,778 DEBUG [RPCClient-NioEventLoopGroup-5-4 {}] 
client.AsyncRegionLocatorHelper(64): Try updating 
region=TestNs:TestTable,1,1701247545341.d0c6bfe41f40ef43702adbe0b5d09208., 
hostname=jenkins-hbase19.apache.org,37211,1701247539203, seqNum=-1 , the old 
value is 
region=TestNs:TestTable,1,1701247545341.d0c6bfe41f40ef43702adbe0b5d09208., 
hostname=jenkins-hbase19.apache.org,37211,1701247539203, seqNum=-1, 
error=org.apache.hadoop.hbase.quotas.RpcThrottlingException: 
org.apache.hadoop.hbase.quotas.RpcThrottlingException: number of read requests 
exceeded - wait 6mins, 0ms
at 
org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwThrottlingException(RpcThrottlingException.java:133)
at 
org.apache.hadoop.hbase.quotas.RpcThrottlingException.throwNumReadRequestsExceeded(RpcThrottlingException.java:99)
at 
org.apache.hadoop.hbase.quotas.TimeBasedLimiter.checkQuota(TimeBasedLimiter.java:172)
at 
org.apache.hadoop.hbase.quotas.DefaultOperationQuota.checkQuota(DefaultOperationQuota.java:82)
at 
org.apache.hadoop.hbase.quotas.RegionServerRpcQuotaManager.checkQuota(RegionServerRpcQuotaManager.java:220)
at 
org.apache.hadoop.hbase.quotas.RegionServerRpcQuotaManager.checkQuota(RegionServerRpcQuotaManager.java:168)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2456)
at 
org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:44992)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:437)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82)
{noformat}

[~bbeaudreault] Do you know the details of how to calculate the throttling wait 
time? The test is still a bit flaky, I think the problem is that we sometimes 
let the client wait for 6 minutes, so it is very easy to timeout...

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-11-29 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791337#comment-17791337
 ] 

Hudson commented on HBASE-28031:


Results for branch master
[build #953 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/953/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/953/General_20Nightly_20Build_20Report/]




(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/953/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/953/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-11-28 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790479#comment-17790479
 ] 

Duo Zhang commented on HBASE-28031:
---

Merge the new PR to master.

Let's see if the test could disappear on the flaky dashboard.

Thanks [~zghao] for reviewing!

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy42.addBlock(Unknown Source) ~[?:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-11-20 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788238#comment-17788238
 ] 

Duo Zhang commented on HBASE-28031:
---

OK it does not fail with the exception in the description now but it could 
still fail with timeout...

I think injecting a manual EnvironmentEdge is not a stable operation when you 
want to start a mini cluster...

Let me think if we can limit the scope of the EnvironmentEdge...

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>  ~[hadoop-common-3.2.4.jar:?]
> 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-11-20 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788149#comment-17788149
 ] 

Hudson commented on HBASE-28031:


Results for branch master
[build #948 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/948/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/948/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/948/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/948/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-11-19 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787782#comment-17787782
 ] 

Duo Zhang commented on HBASE-28031:
---

Merged the PR to master.

Let's see if this could fix the problem.

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_362]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy42.addBlock(Unknown Source) ~[?:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
> 

[jira] [Commented] (HBASE-28031) TestClusterScopeQuotaThrottle is still failing with broken WAL writer

2023-11-16 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786797#comment-17786797
 ] 

Duo Zhang commented on HBASE-28031:
---

I think I found the root cause here.

https://nightlies.apache.org/hbase/HBase-Flaky-Tests/master/13861/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.quotas.TestClusterScopeQuotaThrottle-output.txt

{noformat}
2023-11-16T04:46:44,553 INFO  [MiniHBaseClusterRegionServer-EventLoopGroup-4-2 
{}] monitor.StreamSlowMonitor(154): Slow datanode: 
DatanodeInfoWithStorage[127.0.0.1:40519,DS-38d4e95d-6266-4736-ad99-fbaa6d9a3d9e,DISK],
 data length=0, duration=178300ms, unfinishedReplicas=0, 
lastAckTimestamp=1700140678333, monitor name: defaultMonitorName
2023-11-16T04:46:44,553 INFO  [MiniHBaseClusterRegionServer-EventLoopGroup-4-2 
{}] monitor.ExcludeDatanodeManager(82): Added datanode: 
DatanodeInfoWithStorage[127.0.0.1:40519,DS-38d4e95d-6266-4736-ad99-fbaa6d9a3d9e,DISK]
 to exclude cache by [slow packet ack] success, current excludeDNsCache size=1
{noformat}

It is because that we inject EnvironmentEdge in this test to simulate time 
elapse but StreamSlowMonitor also use EnvironmentEdgeManager to detect slow DNs 
and the injected EnvironmentEdge messes things up...

I think when calculating time cost, we should use System.nanoTime.

> TestClusterScopeQuotaThrottle is still failing with broken WAL writer
> -
>
> Key: HBASE-28031
> URL: https://issues.apache.org/jira/browse/HBASE-28031
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Priority: Major
>
> {noformat}
> 2023-08-17T10:47:31,026 WARN  [regionserver/jenkins-hbase19:0.logRoller {}] 
> asyncfs.FanOutOneBlockAsyncDFSOutputHelper(515): create fan-out dfs output 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  failed, retry = 0
> org.apache.hadoop.ipc.RemoteException: File 
> /user/jenkins/test-data/bb8017fa-92f5-92c9-2f1d-aa9b90cf4b80/WALs/jenkins-hbase19.apache.org,43363,1692269230784/jenkins-hbase19.apache.org%2C43363%2C1692269230784.meta.1692433272886.meta
>  could only be written to 0 of the 1 minReplication nodes. There are 2 
> datanode(s) running and 2 node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2276)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:910)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:577)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1558) 
> ~[hadoop-common-3.2.4.jar:?]
>   at org.apache.hadoop.ipc.Client.call(Client.java:1455) 
> ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
>  ~[hadoop-common-3.2.4.jar:?]
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.4.jar:?]
>   at com.sun.proxy.$Proxy41.addBlock(Unknown Source) ~[?:?]
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520)
>  ~[hadoop-hdfs-client-3.2.4.jar:?]
>   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) ~[?:?]
>   at 
>