[ https://issues.apache.org/jira/browse/HBASE-27961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nihal Jain updated HBASE-27961:
-------------------------------
    Component/s: hbase-operator-tools

> [HBCK2] Running assigns/unassigns command with large number of files/regions throws CallTimeoutException
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-27961
>                 URL: https://issues.apache.org/jira/browse/HBASE-27961
>             Project: HBase
>          Issue Type: Bug
>          Components: hbase-operator-tools, hbck2
>            Reporter: Nihal Jain
>            Assignee: Nihal Jain
>            Priority: Major
>
> Running the assigns command with a very large list of regions fails with a
> CallTimeoutException (CTE). Even when the input is split across multiple files,
> it still fails, and the same command has to be resubmitted blindly until it
> completes without error.
> The exception seen is as follows:
> {code:java}
> Exception in thread "main" java.io.IOException: org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>     at org.apache.hadoop.hbase.client.HBaseHbck.assigns(HBaseHbck.java:146)
>     at org.apache.hbase.HBCK2.assigns(HBCK2.java:454)
>     at org.apache.hbase.HBCK2.doCommandLine(HBCK2.java:1070)
>     at org.apache.hbase.HBCK2.run(HBCK2.java:1028)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>     at org.apache.hbase.HBCK2.main(HBCK2.java:1367)
> Caused by: org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:340)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:92)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:594)
>     at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$BlockingStub.assigns(MasterProtos.java)
>     at org.apache.hadoop.hbase.client.HBaseHbck.assigns(HBaseHbck.java:141)
>     ... 6 more
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>     at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:222)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:424)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:419)
>     at org.apache.hadoop.hbase.ipc.Call.setTimeout(Call.java:107)
>     at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:134)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.run(HashedWheelTimer.java:715)
>     at org.apache.hbase.thirdparty.io.netty.util.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:34)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:703)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:790)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:503)
>     at java.lang.Thread.run(Thread.java:750)
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>     at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:135)
>     ... 6 more
> {code}
> The same issue likely applies to most other commands that accept a list of
> inputs, such as unassigns and reportMissingRegionsInMeta.
> *Proposed fix*
>  * Introduce batching when a list of regions is passed on the command line,
> so that no single RPC has to complete the whole list within the RPC timeout.
>  * When input files are specified via the -i argument, treat each individual
> file as a batch.
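
The batching idea in the proposed fix could look roughly like the sketch below. It is only an illustration, not the patch for this issue: the class BatchedAssigns, the methods partition and assignInBatches, and the batchSize parameter are all hypothetical names. The assignCall function stands in for the real RPC (the HBaseHbck.assigns call visible in the stack trace) and is assumed to take a list of encoded region names and return one procedure id per region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper illustrating client-side batching; not part of HBCK2. */
class BatchedAssigns {

  /** Splits items into consecutive sublists of at most batchSize elements. */
  static <T> List<List<T>> partition(List<T> items, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < items.size(); start += batchSize) {
      int end = Math.min(start + batchSize, items.size());
      // Copy the view so each batch is independent of the source list.
      batches.add(new ArrayList<>(items.subList(start, end)));
    }
    return batches;
  }

  /**
   * Issues one call per batch so that each RPC carries a bounded number of
   * regions. assignCall is a stand-in for the real RPC (an assumption); it is
   * expected to return the procedure ids for the regions in the batch.
   */
  static List<Long> assignInBatches(
      List<String> encodedRegionNames,
      int batchSize,
      Function<List<String>, List<Long>> assignCall) {
    List<Long> pids = new ArrayList<>();
    for (List<String> batch : partition(encodedRegionNames, batchSize)) {
      pids.addAll(assignCall.apply(batch));
    }
    return pids;
  }
}
```

With this shape, a timeout affects only the batch in flight, so a caller could retry or report that one batch instead of resubmitting the entire list. For the -i case, each input file's contents would simply be passed as one batch. A suitable batch size depends on the cluster and the configured RPC timeout and is not suggested here.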