[
https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016330#comment-16016330
]
Ravi Prakash commented on HDFS-11817:
-------------------------------------
Thanks for your investigation, Kihwal! I am seeing something similar on 2.7.3. A
block is holding up decommissioning because its recovery failed. (The stack
trace below is from when the cluster was still running 2.7.2.) DN2 and DN3 are
no longer part of the cluster; DN1 is the node held up for decommissioning. I
checked that the block and meta file are indeed in the finalized directory.
{code}
2016-09-19 09:02:25,837 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
recoverBlocks FAILED: RecoveringBlock{BP-<someid>:blk_1094097355_20357090;
getBlockSize()=0; corrupt=false; offset=-1;
locs=[DatanodeInfoWithStorage[<DN1>:50010,null,null],
DatanodeInfoWithStorage[<DN2>:50010,null,null],
DatanodeInfoWithStorage[<DN3>:50010,null,null]]}
org.apache.hadoop.ipc.RemoteException(java.lang.IllegalStateException): Failed
to finalize INodeFile <filename> since blocks[0] is non-complete, where
blocks=[blk_1094097355_20552508{UCState=COMMITTED, truncateBlock=null,
primaryNodeIndex=0,
replicas=[ReplicaUC[[DISK]DS-03bed13e-5cdd-4207-91b6-abd83f9eb7d3:NORMAL:<DN1>:50010|RBW]]}].
        at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.assertAllBlocksComplete(INodeFile.java:222)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.toCompleteFile(INodeFile.java:209)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.finalizeINodeFileUnderConstruction(FSNamesystem.java:4218)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.closeFileCommitBlocks(FSNamesystem.java:4457)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4419)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:837)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:291)
        at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28768)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
        at org.apache.hadoop.ipc.Client.call(Client.java:1475)
        at org.apache.hadoop.ipc.Client.call(Client.java:1412)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy16.commitBlockSynchronization(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolClientSideTranslatorPB.java:312)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:2780)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2642)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.access$400(DataNode.java:243)
        at org.apache.hadoop.hdfs.server.datanode.DataNode$5.run(DataNode.java:2519)
        at java.lang.Thread.run(Thread.java:744)
{code}
I am not sure what purpose failing {{commitBlockSynchronization()}} serves in
this case, so I would be amenable to your proposed solution (rough sketch after
the quote):
bq. We can have commitBlockSynchronization() check for valid storage ID before
updating data structures. Even if no valid storage ID is found, we can't fail
the operation
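To make the intent concrete, here is a minimal, purely illustrative sketch of
that check. The names ({{StorageLookup}}, {{isKnownStorage()}},
{{validTargets()}}) are assumptions for the example and not the actual
FSNamesystem code; the point is only that unknown storages are skipped rather
than treated as a fatal error, so the block can still be completed and the file
closed.
{code}
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative-only sketch (not the real FSNamesystem code) of the quoted
 * proposal: when commitBlockSynchronization() is told about new targets, only
 * record replicas whose storage ID is actually known to the NameNode, and
 * never fail the whole operation just because none of them are known yet.
 */
public class CommitBlockSyncSketch {

  /** Hypothetical stand-in for the NameNode's view of DataNode storages. */
  interface StorageLookup {
    boolean isKnownStorage(String datanodeUuid, String storageId);
  }

  static List<String> validTargets(StorageLookup lookup,
                                   List<String> newTargetUuids,
                                   List<String> newTargetStorageIds) {
    List<String> valid = new ArrayList<>();
    for (int i = 0; i < newTargetUuids.size(); i++) {
      // Only keep targets whose storage has already been reported (via an IBR
      // or full block report). Unknown storages are simply skipped, not
      // treated as an error.
      if (lookup.isKnownStorage(newTargetUuids.get(i),
                                newTargetStorageIds.get(i))) {
        valid.add(newTargetUuids.get(i));
      }
    }
    // Even if 'valid' is empty the caller proceeds normally; replication
    // monitoring can restore the replica count once the DataNodes report in.
    return valid;
  }
}
{code}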
> A faulty node can cause a lease leak and NPE on accessing data
> --------------------------------------------------------------
>
> Key: HDFS-11817
> URL: https://issues.apache.org/jira/browse/HDFS-11817
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.8.0
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
> Attachments: hdfs-11817_supplement.txt
>
>
> When the namenode performs a lease recovery for a failed write,
> {{commitBlockSynchronization()}} will fail if none of the new targets has
> sent a received-IBR. At this point, the data is inaccessible, as the
> namenode will throw a {{NullPointerException}} upon {{getBlockLocations()}}.
> The namenode will retry the lease recovery in about an hour. If the nodes
> are faulty (usually when there is only one new target), they may not have
> block reported by then. If this happens, lease recovery throws an
> {{AlreadyBeingCreatedException}}, which causes the LeaseManager to simply
> remove the lease without finalizing the inode.
> This results in an inconsistent lease state. The inode stays
> under-construction, but no further lease recovery is attempted, and a manual
> lease recovery is also not allowed.
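For context, the "manual lease recovery" mentioned in the description is what
an operator would normally trigger through the public
{{DistributedFileSystem#recoverLease()}} API (or {{hdfs debug recoverLease
-path <file>}} on 2.7+). The snippet below is only a minimal sketch of that
call, with a placeholder path taken from argv; in the broken state described
above it is not allowed to make progress.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class ManualLeaseRecovery {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS points at the HDFS cluster; the file path is argv[0].
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf)) {
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      // recoverLease() asks the NameNode to start lease recovery and returns
      // true once the file has been closed. For an inode stuck in the state
      // described above, this is the call that is "not allowed".
      boolean closed = dfs.recoverLease(new Path(args[0]));
      System.out.println("lease recovery " + (closed ? "completed" : "in progress"));
    }
  }
}
{code}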