Lukas Majercak created HDFS-11499:
-------------------------------------
Summary: Decommissioning stuck because of failing recovery
Key: HDFS-11499
URL: https://issues.apache.org/jira/browse/HDFS-11499
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs, namenode
Affects Versions: 3.0.0-alpha2, 2.7.3, 2.7.2, 2.7.1
Reporter: Lukas Majercak
Assignee: Lukas Majercak
Block recovery fails to finalize the file if the nodes holding the last,
incomplete block are being decommissioned. Conversely, decommissioning is
stuck waiting for the last block to be completed.
{code:xml}
org.apache.hadoop.ipc.RemoteException(java.lang.IllegalStateException): Failed
to finalize INodeFile testRecoveryFile since blocks[255] is non-complete, where
blocks=[blk_1073741825_1001, blk_1073741826_1002...
{code}
The fix is to count replicas on decommissioning nodes when completing the last
block in BlockManager.commitOrCompleteLastBlock, since the DecommissionManager
will not decommission a node that has under-construction (UC) blocks.
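
The idea behind the fix can be sketched roughly as follows. This is a simplified, self-contained illustration and not the actual BlockManager code: the class LastBlockCompletionSketch, the ReplicaState enum, and both methods are hypothetical names invented for this example, and minReplication stands in for the configured minimum replication.

```java
import java.util.List;

// Hypothetical sketch only: none of these names exist in Hadoop.
final class LastBlockCompletionSketch {

    // Simplified state of the datanode holding a replica of the last block.
    enum ReplicaState { LIVE, DECOMMISSIONING, DECOMMISSIONED }

    // Counts replicas that may be used to satisfy minimum replication.
    // With includeDecommissioning == false this models the behaviour
    // described in the report; with true it models the proposed fix.
    static int countUsable(List<ReplicaState> replicas,
                           boolean includeDecommissioning) {
        int usable = 0;
        for (ReplicaState state : replicas) {
            if (state == ReplicaState.LIVE) {
                usable++;
            } else if (includeDecommissioning
                    && state == ReplicaState.DECOMMISSIONING) {
                usable++;
            }
        }
        return usable;
    }

    // The last block can be completed once enough usable replicas exist.
    static boolean canCompleteLastBlock(List<ReplicaState> replicas,
                                        int minReplication,
                                        boolean includeDecommissioning) {
        return countUsable(replicas, includeDecommissioning) >= minReplication;
    }
}
```

With every replica on a decommissioning node and minReplication = 1, canCompleteLastBlock returns false when decommissioning replicas are ignored, which corresponds to the deadlock described above, and true once they are counted.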