[
https://issues.apache.org/jira/browse/HBASE-27961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nihal Jain updated HBASE-27961:
-------------------------------
Release Note: Add support for batching in the following commands: assigns,
unassigns and bypass
> [HBCK2] Running assigns/unassigns command with large number of files/regions
> throws CallTimeoutException
> --------------------------------------------------------------------------------------------------------
>
> Key: HBASE-27961
> URL: https://issues.apache.org/jira/browse/HBASE-27961
> Project: HBase
> Issue Type: Bug
> Components: hbase-operator-tools, hbck2
> Reporter: Nihal Jain
> Assignee: Nihal Jain
> Priority: Major
> Fix For: hbase-operator-tools-1.3.0
>
>
> While trying to run the assigns command with a huge list of regions, it fails
> with a CallTimeoutException (CTE). Even after splitting the input into multiple
> files, it still fails, and the same command has to be blindly resubmitted again
> and again until no error occurs. The exception seen is as follows:
> {code:java}
> Exception in thread "main" java.io.IOException:
> org.apache.hbase.thirdparty.com.google.protobuf.ServiceException:
> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to
> address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception:
> org.apache.hadoop.hbase.ipc.CallTimeoutException:
> Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
> at org.apache.hadoop.hbase.client.HBaseHbck.assigns(HBaseHbck.java:146)
> at org.apache.hbase.HBCK2.assigns(HBCK2.java:454)
> at org.apache.hbase.HBCK2.doCommandLine(HBCK2.java:1070)
> at org.apache.hbase.HBCK2.run(HBCK2.java:1028)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hbase.HBCK2.main(HBCK2.java:1367)
> Caused by: org.apache.hbase.thirdparty.com.google.protobuf.ServiceException:
> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to
> address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception:
> org.apache.hadoop.hbase.ipc.CallTimeoutException:
> Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:340)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:92)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:594)
> at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$BlockingStub.assigns(MasterProtos.java)
> at org.apache.hadoop.hbase.client.HBaseHbck.assigns(HBaseHbck.java:141)
> ... 6 more
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to
> address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception:
> org.apache.hadoop.hbase.ipc.CallTimeoutException:
> Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
> at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:222)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:424)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:419)
> at org.apache.hadoop.hbase.ipc.Call.setTimeout(Call.java:107)
> at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:134)
> at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.run(HashedWheelTimer.java:715)
> at org.apache.hbase.thirdparty.io.netty.util.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:34)
> at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:703)
> at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:790)
> at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:503)
> at java.lang.Thread.run(Thread.java:750)
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException:
> Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
> at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:135)
> ... 6 more
> {code}
> The same issue likely applies to most other commands, such as unassigns and
> bypass.
> *Proposed fix*
> * This can be fixed by batching the list of regions passed via the
> command line or specified via the -i arg
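
The batching idea proposed above can be sketched roughly as follows. This is only an illustration, not the actual hbase-operator-tools change: the class name `BatchingSketch`, the helpers `partition` and `callInBatches`, and the batch size are all hypothetical. The `call` function stands in for whatever per-batch RPC the tool issues (for example, one assigns request per batch), so that no single call has to cover the whole region list within one RPC timeout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: names and structure are assumptions, not HBCK2 code.
public class BatchingSketch {

  /** Splits items into consecutive sublists of at most batchSize elements. */
  static <T> List<List<T>> partition(List<T> items, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < items.size(); start += batchSize) {
      int end = Math.min(start + batchSize, items.size());
      // Copy the view so each batch is independent of the input list.
      batches.add(new ArrayList<>(items.subList(start, end)));
    }
    return batches;
  }

  /**
   * Invokes call once per batch and concatenates the results in input order.
   * The call parameter is a stand-in for a single per-batch request.
   */
  static <T, R> List<R> callInBatches(List<T> items, int batchSize,
      Function<List<T>, List<R>> call) {
    List<R> results = new ArrayList<>();
    for (List<T> batch : partition(items, batchSize)) {
      results.addAll(call.apply(batch));
    }
    return results;
  }

  public static void main(String[] args) {
    // Placeholder region names, for illustration only.
    List<String> regions = Arrays.asList("r1", "r2", "r3", "r4", "r5");
    System.out.println(partition(regions, 2)); // prints [[r1, r2], [r3, r4], [r5]]
  }
}
```

With this shape, a timeout would affect only one batch rather than the whole input, so a failed batch could in principle be retried on its own. How failures should actually be handled is left open here.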