[ 
https://issues.apache.org/jira/browse/HBASE-27961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nihal Jain updated HBASE-27961:
-------------------------------
    Component/s: hbase-operator-tools

> [HBCK2] Running assigns/unassigns command with large number of files/regions 
> throws CallTimeoutException
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-27961
>                 URL: https://issues.apache.org/jira/browse/HBASE-27961
>             Project: HBase
>          Issue Type: Bug
>          Components: hbase-operator-tools, hbck2
>            Reporter: Nihal Jain
>            Assignee: Nihal Jain
>            Priority: Major
>
> Running the assigns command with a very large list of regions fails with a 
> CallTimeoutException (CTE). Splitting the input across multiple files does 
> not help: the command still fails, and the only workaround is to blindly 
> resubmit the same command until it completes without error.
> The exception is as follows: 
> {code:java}
> Exception in thread "main" java.io.IOException: 
> org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: 
> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to 
> address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception: 
> org.apache.hadoop.hbase.ipc.CallTimeoutException: 
> Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>       at org.apache.hadoop.hbase.client.HBaseHbck.assigns(HBaseHbck.java:146)
>       at org.apache.hbase.HBCK2.assigns(HBCK2.java:454)
>       at org.apache.hbase.HBCK2.doCommandLine(HBCK2.java:1070)
>       at org.apache.hbase.HBCK2.run(HBCK2.java:1028)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>       at org.apache.hbase.HBCK2.main(HBCK2.java:1367)
> Caused by: org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: 
> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to 
> address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception: 
> org.apache.hadoop.hbase.ipc.CallTimeoutException: 
> Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:340)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:92)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:594)
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$BlockingStub.assigns(MasterProtos.java)
>       at org.apache.hadoop.hbase.client.HBaseHbck.assigns(HBaseHbck.java:141)
>       ... 6 more
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to 
> address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception: 
> org.apache.hadoop.hbase.ipc.CallTimeoutException: 
> Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>       at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:222)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:424)
>       at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:419)
>       at org.apache.hadoop.hbase.ipc.Call.setTimeout(Call.java:107)
>       at 
> org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:134)
>       at 
> org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.run(HashedWheelTimer.java:715)
>       at 
> org.apache.hbase.thirdparty.io.netty.util.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:34)
>       at 
> org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:703)
>       at 
> org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:790)
>       at 
> org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:503)
>       at java.lang.Thread.run(Thread.java:750)
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: 
> Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>       at 
> org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:135)
>       ... 6 more
> {code}
> The same issue likely affects most other commands that take a list of 
> inputs, such as unassigns and reportMissingRegionsInMeta.
> *Proposed fix*
>  * When a list of regions is passed on the command line, split it into 
> batches and issue one call per batch.
>  * When input files are specified via the -i arg, treat each individual 
> file as a batch.
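
A minimal sketch of the batching idea proposed above, not the actual HBCK2 change. The class name BatchedAssignsSketch, the helpers partition and runInBatches, and the batch size are all hypothetical. The rpc argument stands in for a call such as Hbck.assigns, which takes a list of encoded region names and returns a list of procedure ids; here it is replaced by a fake so the example runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of hbase-operator-tools.
final class BatchedAssignsSketch {

  /** Splits items into consecutive batches of at most batchSize elements. */
  public static <T> List<List<T>> partition(List<T> items, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < items.size(); start += batchSize) {
      int end = Math.min(start + batchSize, items.size());
      // Copy the view so each batch is independent of the input list.
      batches.add(new ArrayList<>(items.subList(start, end)));
    }
    return batches;
  }

  /**
   * Invokes rpc once per batch and concatenates the results in order.
   * Each call carries a bounded number of items, so no single RPC has to
   * process the whole input within one rpcTimeout window.
   */
  public static <T, R> List<R> runInBatches(
      List<T> items, int batchSize, Function<List<T>, List<R>> rpc) {
    List<R> results = new ArrayList<>();
    for (List<T> batch : partition(items, batchSize)) {
      results.addAll(rpc.apply(batch));
    }
    return results;
  }

  public static void main(String[] args) {
    // Placeholder region names; real input would be encoded region names.
    List<String> regions = Arrays.asList("r1", "r2", "r3", "r4", "r5");

    // Fake RPC (assumption): returns one dummy procedure id per region.
    Function<List<String>, List<Long>> fakeAssigns = batch -> {
      List<Long> pids = new ArrayList<>();
      for (int i = 0; i < batch.size(); i++) {
        pids.add(-1L);
      }
      return pids;
    };

    List<Long> pids = runInBatches(regions, 2, fakeAssigns);
    System.out.println("submitted " + pids.size() + " regions in batches of 2");
  }
}
```

For the -i case, the same loop could be driven by one list per input file instead of fixed-size slices. A suitable batch size would depend on how long the master takes per region relative to the configured RPC timeout, which this sketch does not attempt to measure.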


