liuguanghua created HDFS-17048:
----------------------------------
Summary: FSNamesystem.delete() may cause data residue when the
active NameNode crashes or shuts down
Key: HDFS-17048
URL: https://issues.apache.org/jira/browse/HDFS-17048
Project: Hadoop HDFS
Issue Type: Bug
Environment:
Reporter: liuguanghua
Consider the following scenario:
(1) A user deletes an HDFS directory containing many blocks.
(2) The active NameNode then crashes, shuts down, or is failed over to the
standby NameNode by an administrator.
(3) This may leave residual block data on the DataNodes.
FSNamesystem.delete() proceeds as follows (see the sketch after this list):
(1) It removes the directory from the namespace first.
(2) It adds toRemovedBlocks to markedDeleteQueue.
(3) The MarkedDeleteBlockScrubber thread consumes markedDeleteQueue and
deletes the blocks.
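A minimal sketch of this pattern, not the actual FSNamesystem or MarkedDeleteBlockScrubber code: the namespace removal is durable via the edit log, while the queued block invalidations live only in NameNode memory, which is why a crash between step (2) and step (3) strands the blocks. All class and method names below are illustrative stand-ins.

{code:java}
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Simplified illustration of the delete path described above.
public class AsyncDeleteSketch {
    // Stand-in for markedDeleteQueue: batches of block IDs awaiting deletion.
    private final ConcurrentLinkedQueue<List<Long>> markedDeleteQueue =
            new ConcurrentLinkedQueue<>();

    // Steps (1) and (2): remove the directory from the namespace, then enqueue
    // the collected blocks. Only the namespace change is persisted in the edit
    // log; the queue itself is in-memory state and does not survive a crash.
    public void delete(String dir, List<Long> toRemovedBlocks) {
        removeFromNamespaceAndLogEdit(dir);      // durable (edit log)
        markedDeleteQueue.add(toRemovedBlocks);  // lost if the NameNode dies here
    }

    // Step (3): background scrubber thread drains the queue and asks the
    // block management layer to invalidate the blocks on the DataNodes.
    private final Thread scrubber = new Thread(() -> {
        while (!Thread.currentThread().isInterrupted()) {
            List<Long> batch = markedDeleteQueue.poll();
            if (batch != null) {
                batch.forEach(AsyncDeleteSketch::invalidateBlock);
            }
        }
    }, "MarkedDeleteBlockScrubber");

    private void removeFromNamespaceAndLogEdit(String dir) { /* omitted */ }
    private static void invalidateBlock(long blockId) { /* omitted */ }
}
{code}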
If the active NameNode crashes, the blocks still sitting in markedDeleteQueue
are lost and will never be deleted. Such a block can no longer be found via the
hdfs fsck command, but it is still alive on the DataNode disks.
Thus,
SummaryA = hdfs dfs -du -s /
SummaryB = sum(dfsUsed reported by each DataNode)
SummaryA < SummaryB
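For reference, a small sketch that compares the two numbers with the standard Hadoop client API: the namespace-consumed space (equivalent to the second column of hdfs dfs -du -s /) versus the sum of dfsUsed over all DataNodes (the same values shown by hdfs dfsadmin -report). A persistent, growing gap after large deletes followed by a NameNode crash is the symptom described here; note that dfsUsed also includes temporary and under-construction data, so a small gap is normal.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class SpaceGapCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // SummaryA: space consumed according to the namespace (replication included).
        ContentSummary summary = fs.getContentSummary(new Path("/"));
        long namespaceConsumed = summary.getSpaceConsumed();

        // SummaryB: sum of dfsUsed reported by every DataNode.
        long datanodeUsed = 0;
        if (fs instanceof DistributedFileSystem) {
            for (DatanodeInfo dn : ((DistributedFileSystem) fs).getDataNodeStats()) {
                datanodeUsed += dn.getDfsUsed();
            }
        }

        System.out.printf("SummaryA (namespace) = %d bytes%n", namespaceConsumed);
        System.out.printf("SummaryB (datanodes) = %d bytes%n", datanodeUsed);
        System.out.printf("gap = %d bytes%n", datanodeUsed - namespaceConsumed);
    }
}
{code}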
This may be unavoidable. But is there any tool to find the blocks that
should have been deleted and clean them up?