[ https://issues.apache.org/jira/browse/HDFS-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202644#comment-13202644 ]
Bikas Saha commented on HDFS-2909:
----------------------------------
Repro steps
1) Start two NNs in active/standby mode.
2) Remove write permissions from the shared edits dir.
3) When the standby triggers a log roll, the active gets an error while
finalizing the edit logs.
4) The exception is caught far up the stack, and the error is not reported
against the bad shared edits dir.
This happens because error reporting takes place when FSImage.rollEditLogs()
calls storage.writeTransactionIdFileToStorage(), which runs after
FSEdit.rollEditLogs(). The error in FSEdit.rollEditLogs() raises an exception
that FSImage.rollEditLogs() does not handle, so
storage.writeTransactionIdFileToStorage() is never called and no error is
reported. The bad directory therefore remains in FSImage.storage.
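
The control flow described above can be sketched in isolation. This is only
an illustration, not the HDFS code: RollSketch, its methods, and the directory
names are hypothetical stand-ins, and the reportOnRollFailure flag exists only
to contrast the two outcomes; it is not the patch for this issue.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the real HDFS classes.
class RollSketch {
    // Simulates finalizing the edit logs: fails on the unwritable directory.
    static void rollEditLog(List<String> dirs, String unwritableDir) throws IOException {
        for (String dir : dirs) {
            if (dir.equals(unwritableDir)) {
                throw new IOException("cannot finalize edits in " + dir);
            }
        }
    }

    // Simulates the later step that reports the failed directory and drops it.
    static void writeTransactionIdFile(List<String> dirs, String unwritableDir) {
        dirs.remove(unwritableDir);
    }

    // Returns the storage dirs still registered after one roll attempt.
    static List<String> storageDirsAfterRoll(boolean reportOnRollFailure) {
        List<String> dirs = new ArrayList<>(Arrays.asList("local-edits", "shared-edits"));
        String bad = "shared-edits";
        try {
            try {
                rollEditLog(dirs, bad);
            } catch (IOException e) {
                if (!reportOnRollFailure) {
                    throw e; // behavior described in this comment: not handled here
                }
                // contrast case: handle locally so the reporting step still runs
            }
            writeTransactionIdFile(dirs, bad);
        } catch (IOException caughtFarUpTheStack) {
            // The reporting step was skipped, so the bad dir stays registered.
        }
        return dirs;
    }

    public static void main(String[] args) {
        System.out.println("unhandled: " + storageDirsAfterRoll(false));
        System.out.println("handled:   " + storageDirsAfterRoll(true));
    }
}
```

With reportOnRollFailure set to false, the exception skips the reporting step
and the unwritable directory stays in the list, which mirrors the behavior
seen in the repro.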
> HA: Inaccessible shared edits dir not getting removed from FSImage storage
> dirs upon error
> ------------------------------------------------------------------------------------------
>
> Key: HDFS-2909
> URL: https://issues.apache.org/jira/browse/HDFS-2909
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ha, name-node
> Affects Versions: HA branch (HDFS-1623)
> Reporter: Bikas Saha
> Assignee: Bikas Saha
>