[ https://issues.apache.org/jira/browse/HBASE-26699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480466#comment-17480466 ]

Duo Zhang edited comment on HBASE-26699 at 1/22/22, 3:28 PM:
-------------------------------------------------------------

Please check the state of your HDFS cluster.

{noformat}
2022-01-21 16:30:43,214 ERROR [split-log-closeStream-pool-0] wal.RecoveredEditsOutputSink: Could not close recovered edits at hdfs://nokiainfra-altiplano-hdfs-namenode:8020/hbase/data/hbase/namespace/9984ea6f7b1b85da667568ba16f61bda/recovered.edits/0000000000000000004-nokiainfra-altiplano-hbase-regionserver-0.nokiainfra-altiplano-hbase-regionserver.default.svc.cluster.local%2C16020%2C1642772796664.1642772804029.temp
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/data/hbase/namespace/9984ea6f7b1b85da667568ba16f61bda/recovered.edits/0000000000000000004-nokiainfra-altiplano-hbase-regionserver-0.nokiainfra-altiplano-hbase-regionserver.default.svc.cluster.local%2C16020%2C1642772796664.1642772804029.temp could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1620)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3135)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3059)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:725)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:493)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1540)
        at org.apache.hadoop.ipc.Client.call(Client.java:1486)
        at org.apache.hadoop.ipc.Client.call(Client.java:1385)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
        at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:448)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
        at com.sun.proxy.$Proxy21.addBlock(Unknown Source)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:376)
        at com.sun.proxy.$Proxy22.addBlock(Unknown Source)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:376)
        at com.sun.proxy.$Proxy22.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1846)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1645)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:710)
{noformat}

The NameNode is telling us it cannot find enough DataNodes to write to when allocating a new block. Are the DataNode disks full?
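
To verify, run {{hdfs dfsadmin -report}} and look at the remaining DFS capacity on each DataNode, or query the same information through the client API. Below is a minimal Java sketch, not part of this issue; the NameNode URI is a placeholder for your own cluster address. It casts the {{FileSystem}} to {{DistributedFileSystem}} and prints per-DataNode capacity:

{noformat}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class DataNodeCapacityCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder NameNode URI; point this at your own cluster.
    URI nn = URI.create("hdfs://namenode:8020");
    try (FileSystem fs = FileSystem.get(nn, new Configuration())) {
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      // getDataNodeStats() asks the NameNode for one entry per live DataNode.
      for (DatanodeInfo dn : dfs.getDataNodeStats()) {
        System.out.printf("%s: %d of %d bytes remaining (%.1f%% DFS used)%n",
            dn.getHostName(), dn.getRemaining(), dn.getCapacity(),
            dn.getDfsUsedPercent());
      }
    }
  }
}
{noformat}

If every DataNode reports little or no remaining space (keep {{dfs.datanode.du.reserved}} in mind, the usable space can be smaller than the raw disk), the NameNode has nowhere to place a new block replica, which matches the error above even though all three DataNodes are alive.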


> hbase regionserver keep on restarting in azure setup
> ----------------------------------------------------
>
>                 Key: HBASE-26699
>                 URL: https://issues.apache.org/jira/browse/HBASE-26699
>             Project: HBase
>          Issue Type: Bug
>         Environment: Azure cloud
>            Reporter: Kaushik Mandal
>            Priority: Major
>             Fix For: 2.4.8
>
>         Attachments: trace_master.txt, trace_region.txt
>
>
> HBase version 2.4.8
> Hadoop HDFS version 2.7.7
>  
> HBase master and regionserver keep restarting due to
> procedure.SplitWALProcedure: Failed to split wal
>  
> Logs from the master and the regionserver are attached.


