[
https://issues.apache.org/jira/browse/HDFS-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
liuguanghua updated HDFS-17048:
-------------------------------
Component/s: hdfs
> FSNamesystem.delete() may cause data residue when the active NameNode crashes
> or shuts down
> ---------------------------------------------------------------------------------------
>
> Key: HDFS-17048
> URL: https://issues.apache.org/jira/browse/HDFS-17048
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Environment:
>
> Reporter: liuguanghua
> Priority: Major
>
> Consider the following scenario:
> (1) A user deletes an HDFS directory with many blocks.
> (2) The active NameNode then crashes, shuts down, or is failed over to the
> standby NameNode by an administrator.
> (3) This may result in residual data on the DataNodes.
>
> FSNamesystem.delete() will:
> (1) delete the directory first,
> (2) add toRemovedBlocks into markedDeleteQueue,
> (3) let the MarkedDeleteBlockScrubber thread consume markedDeleteQueue and
> delete the blocks (see the sketch below).
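>
> To make the failure mode concrete, here is a minimal, hypothetical Java
> sketch of the deferred-delete pattern described above (the names mirror this
> report; this is not the actual Hadoop source):
> {code:java}
> import java.util.List;
> import java.util.concurrent.LinkedBlockingQueue;
>
> // Simplified sketch: the queue lives only in memory, so any batches still
> // queued when the process dies are lost.
> public class MarkedDeleteSketch {
>     private final LinkedBlockingQueue<List<Long>> markedDeleteQueue =
>             new LinkedBlockingQueue<>();
>
>     // Steps (1)+(2): remove the directory from the namespace, then enqueue
>     // the ids of its blocks. Only the namespace change reaches the edit log.
>     public void delete(List<Long> toRemovedBlocks) {
>         // ... inode removal and edit-log entry happen here ...
>         markedDeleteQueue.add(toRemovedBlocks); // in-memory only
>     }
>
>     // Step (3): a background scrubber drains the queue and invalidates the
>     // blocks. A crash between delete() and this drain leaves blocks on
>     // DataNode disks with no namespace reference -- the residue described.
>     public void startScrubber() {
>         Thread t = new Thread(() -> {
>             try {
>                 while (true) {
>                     List<Long> batch = markedDeleteQueue.take();
>                     batch.forEach(id -> {/* send invalidation to DataNodes */});
>                 }
>             } catch (InterruptedException e) {
>                 Thread.currentThread().interrupt();
>             }
>         }, "MarkedDeleteBlockScrubber");
>         t.setDaemon(true);
>         t.start();
>     }
> }
> {code}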
> If the active NameNode crashes, the blocks still sitting in markedDeleteQueue
> are lost and will never be deleted. Such a block can no longer be found via
> the hdfs fsck command, but it remains alive on the DataNode disks.
>
> Thus:
> SummaryA = hdfs dfs -du -s /
> SummaryB = sum(dfsUsed reported by each DataNode)
> SummaryA < SummaryB
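>
> The comparison can be scripted with the public FileSystem API; a minimal,
> hypothetical check (ignoring other legitimate sources of skew, such as
> in-flight replication or trash) might look like:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> // Compares namespace-accounted bytes (with replication) against the
> // aggregate dfsUsed reported by the DataNodes.
> public class ResidueCheck {
>     public static void main(String[] args) throws Exception {
>         FileSystem fs = FileSystem.get(new Configuration());
>         // SummaryA: equivalent of 'hdfs dfs -du -s /', including replication
>         long summaryA = fs.getContentSummary(new Path("/")).getSpaceConsumed();
>         // SummaryB: cluster-wide used space as reported by the DataNodes
>         long summaryB = fs.getStatus().getUsed();
>         System.out.printf("namespace=%d datanodes=%d gap=%d%n",
>                 summaryA, summaryB, summaryB - summaryA);
>     }
> }
> {code}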
>
> This may be unavoidable, but is there any way to find the blocks that
> should have been deleted and clean them up?
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]