[
https://issues.apache.org/jira/browse/HBASE-24595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yu Wang updated HBASE-24595:
----------------------------
Description:
environment:
jdk: 1.8.0_181
hadoop: 3.1.1
hbase: 2.1.6
In a Kerberos environment, creating a namespace from the hbase shell blocks
after all DataNodes have been restarted; without Kerberos, the namespace is
created successfully.
The HMaster log shows:
2020-06-19 23:47:48,241 WARN [PEWorker-15] procedure.CreateNamespaceProcedure: Retriable error trying to create namespace=abcd2 (in state=CREATE_NAMESPACE_INSERT_INTO_NS_TABLE)
java.net.SocketTimeoutException: callTimeout=1200000, callDuration=1220061: Call to hadoop-hbnn0005.com/172.20.101.36:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=116, waitTime=10763, rpcTimeout=10759 row 'abcd2' on table 'hbase:namespace' at region=hbase:namespace,,1592548148073.f5c7e71fb5e5cab3b27e52600996f7fd., hostname=hadoop-hbnn0005.com,16020,1592580274989, seqNum=162
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:159)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:542)
    at org.apache.hadoop.hbase.master.TableNamespaceManager.insertIntoNSTable(TableNamespaceManager.java:167)
    at org.apache.hadoop.hbase.master.procedure.CreateNamespaceProcedure.insertIntoNSTable(CreateNamespaceProcedure.java:240)
    at org.apache.hadoop.hbase.master.procedure.CreateNamespaceProcedure.executeFromState(CreateNamespaceProcedure.java:85)
    at org.apache.hadoop.hbase.master.procedure.CreateNamespaceProcedure.executeFromState(CreateNamespaceProcedure.java:39)
    at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:189)
    at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:965)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1723)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1462)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1200(ProcedureExecutor.java:78)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2039)
Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to hadoop-hbnn0005.com/172.20.101.36:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=116, waitTime=10763, rpcTimeout=10759
    at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:205)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)
    at org.apache.hadoop.hbase.ipc.Call.setTimeout(Call.java:96)
    at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:199)
    at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:682)
    at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:757)
    at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:485)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=116, waitTime=10763, rpcTimeout=10759
    at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:200)
    ... 4 more
2020-06-19 23:47:49,218 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 1.262sec
2020-06-19 23:47:54,220 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 6.263sec
2020-06-19 23:47:59,220 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 11.264sec
2020-06-19 23:48:04,220 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 16.264sec
2020-06-19 23:48:09,221 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 21.265sec
2020-06-19 23:48:14,221 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 26.265sec
2020-06-19 23:48:19,221 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 31.265sec
2020-06-19 23:48:24,222 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 36.266sec
2020-06-19 23:48:29,222 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 41.266sec
2020-06-19 23:48:34,223 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 46.267sec
2020-06-19 23:48:39,223 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck PEWorker-15(pid=171), run time 20mins, 51.267sec
was:
environment:
jdk: 1.8.0_181
hadoop: 3.1.1
hbase: 2.1.6
In a Kerberos environment, creating a namespace from the hbase shell blocks
after all DataNodes have been restarted; without Kerberos, the namespace is
created successfully.
> hbase create namespace blocked when all datanodes has restarted
> ---------------------------------------------------------------
>
> Key: HBASE-24595
> URL: https://issues.apache.org/jira/browse/HBASE-24595
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.6
> Reporter: Yu Wang
> Priority: Critical
> Attachments: create_namespace_1.png, create_namespace_2.png,
> hmaster.log, hmaster.png, hmaster_4569.jstack, hregionserver.log,
> hregionserver_25649.jstack, procedure.png
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)