[
https://issues.apache.org/jira/browse/HDFS-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066002#comment-14066002
]
Vinayakumar B commented on HDFS-4120:
-------------------------------------
Hi [~rakeshr],
Thanks for the updated patch.
Basically, the "-skipSharedEditsCheck" option was added to work around the scenario
mentioned in HDFS-3752,
i.e. bootstrapping immediately after saveNamespace. If there are some txns after
saveNamespace, then there is no need for this option.
So covering this exact scenario is very much required. The following changes can be
added to the test; they do the following:
1. Enter safemode, run saveNamespace, and leave safemode afterwards
2. Check bootstrap without -skipSharedEditsCheck
3. Check bootstrap with -skipSharedEditsCheck
{code}
// shutdown nn1 and delete its edit log files
cluster.shutdownNameNode(1);
deleteEditLogIfExists(confNN1);
cluster.getNameNodeRpc(0).setSafeMode(SafeModeAction.SAFEMODE_ENTER, true);
cluster.getNameNodeRpc(0).saveNamespace();
cluster.getNameNodeRpc(0).setSafeMode(SafeModeAction.SAFEMODE_LEAVE, true);
// check without -skipSharedEditsCheck, Bootstrap should fail for BKJM
// immediately after saveNamespace
int rc = BootstrapStandby.run(new String[] { "-force", "-nonInteractive" },
    confNN1);
Assert.assertEquals("Mismatches return code", 6, rc);
// check with -skipSharedEditsCheck
rc = BootstrapStandby.run(new String[] { "-force", "-nonInteractive",
    "-skipSharedEditsCheck" }, confNN1);
Assert.assertEquals("Mismatches return code", 0, rc);
{code}
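For reviewers who want the expected outcomes at a glance, here is a rough, self-contained sketch of the behaviour the test asserts. It is only a toy model, not the actual BootstrapStandby code: the class name, the method, and the rule "the shared edits check passes only when there are txns after saveNamespace" are assumptions drawn from the description above, and the return codes 0 and 6 are the values asserted in the snippet.

```java
// Toy model only: NOT the real BootstrapStandby implementation.
// All names here are hypothetical and exist purely for illustration.
final class BootstrapCheckSketch {
    // Return codes taken from the assertions in the test snippet.
    static final int RC_OK = 0;
    static final int RC_SHARED_EDITS_CHECK_FAILED = 6;

    /**
     * @param txnsAfterSaveNamespace number of txns written after saveNamespace
     * @param skipSharedEditsCheck   whether -skipSharedEditsCheck was passed
     * @return the modelled bootstrap return code
     */
    static int bootstrap(long txnsAfterSaveNamespace, boolean skipSharedEditsCheck) {
        if (skipSharedEditsCheck) {
            // The option bypasses the shared edits check entirely.
            return RC_OK;
        }
        // Assumption: with no txns after saveNamespace the check fails
        // (the HDFS-3752 scenario); otherwise it passes.
        return txnsAfterSaveNamespace > 0 ? RC_OK : RC_SHARED_EDITS_CHECK_FAILED;
    }

    public static void main(String[] args) {
        System.out.println(bootstrap(0, false)); // immediately after saveNamespace, no option
        System.out.println(bootstrap(0, true));  // same state, with -skipSharedEditsCheck
        System.out.println(bootstrap(5, false)); // some txns after saveNamespace
    }
}
```

The first two calls correspond to steps 2 and 3 of the list above; the third reflects the remark that the option is not needed once there are txns after saveNamespace.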
> Add a new "-skipSharedEditsCheck" option for BootstrapStandby
> -------------------------------------------------------------
>
> Key: HDFS-4120
> URL: https://issues.apache.org/jira/browse/HDFS-4120
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ha, namenode
> Affects Versions: 3.0.0, 2.0.2-alpha
> Reporter: Liang Xie
> Assignee: Liang Xie
> Priority: Minor
> Attachments: HDFS-4120.patch, HDFS-4120.patch, HDFS-4120.txt
>
>
> Per
> https://issues.apache.org/jira/browse/HDFS-3752?focusedCommentId=13449466&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13449466
> , let's introduce a new option. It should be very safe, and really useful
> for some corner cases, e.g. when the SNN loses its local storage and we need to
> reset the SNN; in current trunk, it'll always get a FATAL msg and can never
> succeed. Another workaround for this case is to full-sync the "current"
> directory from the ANN, but that will cost more disk space & network bandwidth, IMHO.