[ https://issues.apache.org/jira/browse/HBASE-25735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315990#comment-17315990 ]
Hudson commented on HBASE-25735:
--------------------------------

Results for branch master
[build #256 on builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/256/]: (x) *{color:red}-1 overall{color}*
----
details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/256/General_20Nightly_20Build_20Report/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/256/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/256/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(x) {color:red}-1 source release artifact{color}
-- See build output for details.

(x) {color:red}-1 client integration test{color}
-- Something went wrong with this stage, [check relevant console output|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/256//console].

> Add target Region to connection exceptions
> ------------------------------------------
>
>                 Key: HBASE-25735
>                 URL: https://issues.apache.org/jira/browse/HBASE-25735
>             Project: HBase
>          Issue Type: Bug
>          Components: rpc
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.3
>
>
> We spent a bit of time making it so exceptions included the remote host name.
> Looks like we can add the target Region name too with a bit of manipulation;
> this will help in figuring out hot-spotting or a problem Region on the server side.
> For example, here is what I was seeing recently on the client side when an RS
> was timing out requests:
> {code}
> 2021-04-06T02:18:23.533Z, RpcRetryingCaller{globalStartTime=1617675482894, pause=100, maxAttempts=4}, org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to ps0989.example.org/1.1.1.1:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=88369369,methodName=Get], waitTime=5006, rpcTimeout=5000
>     at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:145)
>     at org.apache.hadoop.hbase.client.HTable.get(HTable.java:383)
>     at org.apache.hadoop.hbase.client.HTable.get(HTable.java:357)
>     ...
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to ps0989.bot.parsec.apple.com/17.58.114.206:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=88369369,methodName=Get], waitTime=5006, rpcTimeout=5000
>     at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:209)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:378)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:89)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:409)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:405)
>     at org.apache.hadoop.hbase.ipc.Call.setTimeout(Call.java:110)
>     at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:136)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:672)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:747)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:472)
>     ... 1 more
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=88369369,methodName=Get], waitTime=5006, rpcTimeout=5000
>     at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:137)
>     ... 4 more
> {code}
> I wanted the Region it was hitting. I wanted to know if it was a server
> problem or a Region issue. If clients were only having issues with one Region,
> then I could focus on it.
> After the PR, the exception (from another context) looks like this:
> {code}
> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to address=127.0.0.1:12345, regionInfo=hbase:meta,,1.1588230740 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: error
> ....
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)