[
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jing Zhao updated HDFS-5504:
----------------------------
Resolution: Fixed
Fix Version/s: 2.3.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
I've committed this to trunk and branch-2. Thanks Vinay!
> In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold,
> leads to NN safemode.
> ------------------------------------------------------------------------------------------------
>
> Key: HDFS-5504
> URL: https://issues.apache.org/jira/browse/HDFS-5504
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 3.0.0, 2.2.0
> Reporter: Vinay
> Assignee: Vinay
> Fix For: 2.3.0
>
> Attachments: HDFS-5504.patch, HDFS-5504.patch
>
>
> 1. HA installation; the standby NN is down.
> 2. Delete snapshot is called. It deletes the blocks from the blocksmap and
> from all datanodes, and the log sync also happens.
> 3. The NN crashes before the next log roll.
> 4. When the namenode restarts, it loads the fsimage and the finalized edits
> from shared storage and sets the safemode threshold, which still includes the
> blocks of the deleted snapshot (the delete edit has not been read yet,
> because the namenode restarted before the last edits segment was finalized).
> 5. When it becomes active, it finalizes the edits and reads the delete
> snapshot edit op, but at this point it does not reduce the safemode count,
> so it stays in safemode.
> 6. On the next restart, the edits are already finalized, so the NN reads
> them during startup and sets the safemode threshold correctly.
> So one more restart is needed to bring the NN out of safemode.
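
The sequence above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the HDFS code or the committed patch: the class `SafeModeModel`, its methods, and the block counts (100 blocks in total, 20 of them belonging to the deleted snapshot) are all hypothetical, and the threshold of 0.999 is just an assumed value. It shows why replaying the delete-snapshot op without lowering the expected block total leaves the NN in safemode.

```java
/** Toy model of a safemode block counter. Hypothetical; not the real HDFS classes. */
class SafeModeModel {
    private final double thresholdPct; // fraction of blocks that must be reported
    private long blockTotal;           // blocks the NN expects, set when the image is loaded
    private long blockSafe;            // blocks reported by datanodes so far

    SafeModeModel(long blockTotal, double thresholdPct) {
        this.blockTotal = blockTotal;
        this.thresholdPct = thresholdPct;
    }

    /** Datanodes report n blocks; deleted blocks are never reported. */
    void reportBlocks(long n) {
        blockSafe += n;
    }

    /**
     * Replays a delete-snapshot edit that removed {@code deleted} blocks.
     * With adjustTotal == false the expected total is left untouched,
     * which models the behaviour described in step 5.
     */
    void replayDeleteSnapshot(long deleted, boolean adjustTotal) {
        if (adjustTotal) {
            blockTotal -= deleted;
        }
    }

    /** The NN stays in safemode until enough blocks have been reported. */
    boolean inSafeMode() {
        long needed = (long) Math.ceil(blockTotal * thresholdPct);
        return blockSafe < needed;
    }

    public static void main(String[] args) {
        // Image and finalized edits still count the 20 snapshot blocks.
        SafeModeModel withoutAdjust = new SafeModeModel(100, 0.999);
        withoutAdjust.reportBlocks(80);               // datanodes only hold 80 blocks
        withoutAdjust.replayDeleteSnapshot(20, false);
        System.out.println("without adjustment, in safemode: " + withoutAdjust.inSafeMode());

        SafeModeModel withAdjust = new SafeModeModel(100, 0.999);
        withAdjust.reportBlocks(80);
        withAdjust.replayDeleteSnapshot(20, true);    // expected total drops to 80
        System.out.println("with adjustment, in safemode: " + withAdjust.inSafeMode());
    }
}
```

In this model the first NN can never leave safemode because the 20 deleted blocks will not be reported by any datanode, while the second leaves it as soon as the expected total is reduced while the op is replayed.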
--
This message was sent by Atlassian JIRA
(v6.1#6144)