[
https://issues.apache.org/jira/browse/HBASE-27961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739963#comment-17739963
]
Nihal Jain edited comment on HBASE-27961 at 7/4/23 5:01 PM:
------------------------------------------------------------
Hit this while testing the patch for HBASE-27724.
As a workaround, I had to submit the assigns one file at a time, with each
file containing 1000 regions, as shown below:
{code:java}
hbase hbck -j $HBCK_JAR assigns -i test/user_regions.txt.0
hbase hbck -j $HBCK_JAR assigns -i test/user_regions.txt.1
hbase hbck -j $HBCK_JAR assigns -i test/user_regions.txt.2
hbase hbck -j $HBCK_JAR assigns -i test/user_regions.txt.3
hbase hbck -j $HBCK_JAR assigns -i test/user_regions.txt.4
hbase hbck -j $HBCK_JAR assigns -i test/user_regions.txt.5
hbase hbck -j $HBCK_JAR assigns -i test/user_regions.txt.6
hbase hbck -j $HBCK_JAR assigns -i test/user_regions.txt.7
{code}
Instead of the single invocation below, which was failing:
{code:java}
hbase hbck -j $HBCK_JAR assigns -i test/user_regions.txt.0 \
  test/user_regions.txt.1 test/user_regions.txt.2 test/user_regions.txt.3 \
  test/user_regions.txt.4 test/user_regions.txt.5 test/user_regions.txt.6 \
  test/user_regions.txt.7
{code}
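The per-file workaround above could be scripted roughly as follows. This is only a sketch: the function name assign_per_file and the HBASE_CMD override are hypothetical (added so the loop can be dry-run), and it assumes $HBCK_JAR is set as in the commands above. It stops at the first failing file rather than retrying.

{code:bash}
#!/bin/sh
# Sketch only: run one `assigns` invocation per region file.
# HBASE_CMD is a hypothetical override (not part of HBCK2) so the loop
# can be dry-run, e.g. HBASE_CMD=echo.
assign_per_file() {
  for f in "$@"; do
    echo "Submitting assigns for $f" >&2
    "${HBASE_CMD:-hbase}" hbck -j "$HBCK_JAR" assigns -i "$f" || {
      echo "assigns failed for $f, stopping" >&2
      return 1
    }
  done
}

# Usage: assign_per_file test/user_regions.txt.*
{code}

Stopping on the first failure keeps it obvious which file still needs to be resubmitted.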
was (Author: nihaljain.cs):
Hit this while testing patch for HBASE-27724
> [HBCK2] Running assigns/unassigns command with large number of files/regions
> throws CallTimeoutException
> --------------------------------------------------------------------------------------------------------
>
> Key: HBASE-27961
> URL: https://issues.apache.org/jira/browse/HBASE-27961
> Project: HBase
> Issue Type: Bug
> Components: hbase-operator-tools, hbck2
> Reporter: Nihal Jain
> Assignee: Nihal Jain
> Priority: Major
>
> While trying to run the assigns command with a huge list of regions, it fails
> with a CallTimeoutException (CTE). Even when the input is broken into multiple
> files, it still fails, and the same command has to be blindly resubmitted
> again and again until no error occurs.
> The exception seen is as follows:
> {code:java}
> Exception in thread "main" java.io.IOException: org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>     at org.apache.hadoop.hbase.client.HBaseHbck.assigns(HBaseHbck.java:146)
>     at org.apache.hbase.HBCK2.assigns(HBCK2.java:454)
>     at org.apache.hbase.HBCK2.doCommandLine(HBCK2.java:1070)
>     at org.apache.hbase.HBCK2.run(HBCK2.java:1028)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>     at org.apache.hbase.HBCK2.main(HBCK2.java:1367)
> Caused by: org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:340)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:92)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:594)
>     at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$BlockingStub.assigns(MasterProtos.java)
>     at org.apache.hadoop.hbase.client.HBaseHbck.assigns(HBaseHbck.java:141)
>     ... 6 more
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to address=SOME_HOST_NAME:SOME_PORT_NUMBER failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>     at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:222)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:92)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:424)
>     at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:419)
>     at org.apache.hadoop.hbase.ipc.Call.setTimeout(Call.java:107)
>     at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:134)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.run(HashedWheelTimer.java:715)
>     at org.apache.hbase.thirdparty.io.netty.util.concurrent.ImmediateExecutor.execute(ImmediateExecutor.java:34)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:703)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:790)
>     at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:503)
>     at java.lang.Thread.run(Thread.java:750)
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call[id=0,methodName=Assigns], waitTime=90142ms, rpcTimeout=90000ms
>     at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:135)
>     ... 6 more
> {code}
> The same issue should apply to most other commands, such as unassigns,
> reportMissingRegionsInMeta, etc.
> *Proposed fix*
> * This can be fixed by introducing batching when a list of regions is passed
> via the command line.
> * Also, when files are specified via the -i arg, we could treat each
> individual file as a batch.
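Until batching exists in HBCK2 itself, the first bullet can be approximated on the client side. The sketch below is not the proposed fix, only an illustration of the idea: it uses xargs to submit region names passed on the command line in fixed-size groups. The function name assign_in_batches and the HBASE_CMD override are hypothetical, and it assumes encoded region names contain no whitespace or quotes.

{code:bash}
#!/bin/sh
# Sketch only: client-side batching of encoded region names, not the
# HBCK2 fix itself. Assumes names contain no whitespace or quotes, so
# xargs can split them safely.
# HBASE_CMD is a hypothetical override for dry runs (e.g. HBASE_CMD=echo).
assign_in_batches() {
  batch_size=$1
  shift
  # Nothing to submit if no region names were given.
  [ "$#" -gt 0 ] || return 0
  # xargs -n starts one `assigns` call per batch_size names.
  printf '%s\n' "$@" |
    xargs -n "$batch_size" "${HBASE_CMD:-hbase}" hbck -j "$HBCK_JAR" assigns
}

# Usage: assign_in_batches 1000 $(cat test/user_regions.txt)
{code}

Each batch becomes its own RPC, so a smaller batch size trades more invocations for a lower chance of exceeding rpcTimeout on any single call.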
--
This message was sent by Atlassian Jira
(v8.20.10#820010)