[
https://issues.apache.org/jira/browse/HDFS-7707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303364#comment-14303364
]
Yongjun Zhang commented on HDFS-7707:
-------------------------------------
Hi [~brahmareddy],
Sorry for the late clarification about HDFS-7414, in response to your earlier
comment:
https://issues.apache.org/jira/browse/HDFS-7707?focusedCommentId=14298588&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14298588
My comments in HDFS-7414 earlier mentioned both HDFS-6527 and HDFS-6825. The
code you pasted there indicates that the release you are running has HDFS-6527
but not HDFS-6825.
HDFS-7707 is meant to remedy the HDFS-6825 fix for the special case where a new
dir is created with the same name as the previously deleted dir. It's possible
that HDFS-6825 alone solves your issue (you can try it now, since the HDFS-6825
fix is already committed); otherwise you will need to wait for the HDFS-7707
fix and apply it together with the HDFS-6825 fix.
Thanks.
> Edit log corruption due to delayed block removal again
> ------------------------------------------------------
>
> Key: HDFS-7707
> URL: https://issues.apache.org/jira/browse/HDFS-7707
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.0
> Reporter: Yongjun Zhang
> Assignee: Yongjun Zhang
> Attachments: HDFS-7707.001.patch, HDFS-7707.002.patch,
> reproduceHDFS-7707.patch
>
>
> Edit log corruption is seen again, even with the fix of HDFS-6825.
> Prior to HDFS-6825 fix, if dirX is deleted recursively, an OP_CLOSE can get
> into edit log for the fileY under dirX, thus corrupting the edit log
> (restarting NN with the edit log would fail).
> HDFS-6825 fixes this by detecting whether fileY has already been deleted: it
> checks the ancestor dirs on its path, and if any of them doesn't exist, fileY
> is treated as already deleted and no OP_CLOSE is put into the edit log for
> the file.
> For this new edit log corruption, what I found was that the client first
> deleted dirX recursively, then created another dir with exactly the same name
> as dirX right away. Because HDFS-6825 counts on the namespace check (whether
> dirX exists in its parent dir) to decide whether a file has been deleted, the
> newly created dirX defeats this check, so OP_CLOSE for the already
> deleted file gets into the edit log, due to delayed block removal.
> What we need to do is to have a more robust way to detect whether a file has
> been deleted.
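
To illustrate the scenario described above, here is a minimal toy sketch. It is
not Hadoop code: Node, deleteRecursively, isDeletedByName and
isDeletedByIdentity are hypothetical names made up for this example. It assumes
a recursive delete only detaches dirX from its parent, so fileY still holds a
stale parent link. A name-based ancestor check is fooled once dirX is
recreated, while comparing object identity along the parent chain is one
possible way to still detect the deletion.

```java
import java.util.HashMap;
import java.util.Map;

class DeletedFileCheck {
    /** Toy inode (hypothetical): a name, a parent link, and children by name. */
    static class Node {
        final String name;
        final Node parent; // stays set even after the subtree is deleted
        final Map<String, Node> children = new HashMap<>();

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
            if (parent != null) {
                parent.children.put(name, this); // register under the parent
            }
        }
    }

    /** Detach dir from its parent only; descendants keep their stale links. */
    static void deleteRecursively(Node dir) {
        dir.parent.children.remove(dir.name);
    }

    /** Name-based check: deleted if some ancestor's name is missing in its parent. */
    static boolean isDeletedByName(Node file) {
        for (Node n = file.parent; n != null && n.parent != null; n = n.parent) {
            if (!n.parent.children.containsKey(n.name)) {
                return true;
            }
        }
        return false;
    }

    /** Identity-based check: deleted if a parent no longer maps the name to this very node. */
    static boolean isDeletedByIdentity(Node file) {
        for (Node child = file; child.parent != null; child = child.parent) {
            if (child.parent.children.get(child.name) != child) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node root = new Node("", null);
        Node dirX = new Node("dirX", root);
        Node fileY = new Node("fileY", dirX);

        deleteRecursively(dirX); // client deletes dirX recursively
        new Node("dirX", root);  // and recreates a dir with the same name

        System.out.println("name check says deleted:     " + isDeletedByName(fileY));
        System.out.println("identity check says deleted: " + isDeletedByIdentity(fileY));
    }
}
```

In this toy model the name-based check reports fileY as not deleted after dirX
is recreated, while the identity-based check still reports it as deleted.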