[ https://issues.apache.org/jira/browse/HDFS-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169947#comment-13169947 ]
Eli Collins commented on HDFS-1972:
-----------------------------------

Patch looks like a good implementation of the approach. Here are my initial comments.

For those following along and wondering why the patch doesn't have the DNs ignore commands from the standby: that part of the fencing was already done in HDFS-2627.

I'd modify the comment by blockContentsTrusted to something like "DN may have some pending deletions issued by a prior NN that this NN is unaware of. Therefore we don't perform actions based on the contents of this DN until after we receive a BR followed by a heartbeat confirming the DN thought we were active, which means this NN is now up to date with respect to this DN".

Maybe reverse the polarity and rename it blockContentsStale, since we're really tracking whether the block contents are up to date?

Update the javadoc for NumberReplicas. It would be good to define "untrusted": if a DN is considered untrusted then all of its replicas are considered untrusted.

Not your change, but in BlockManager rename "count" to "decommissioned" and update the javadoc.

In processMisReplicatedBlock, add a comment to the effect of (but better worded than) "countNodes counts all blocks from an untrusted DN as untrusted (and all DNs start out untrusted until their next heartbeat), however we only act on this mistrust if the block is over-replicated".

The comment "If we have at least one" in invalidateBlock can be moved down to the 2nd if.

I think it's OK to assume postponedMisreplicatedBlocks is always small. I suppose even if we're re-commissioning a rack and immediately fail over, this should be sufficient.
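To make the trust rule discussed above concrete, here is a minimal sketch of how it could look. It is not taken from the patch: the class, method, and enum names (StaleReplicaSketch, DatanodeState, decide, Action) are hypothetical, and the policy shown (postpone any apparently over-replicated block that has a replica on a stale DN) is an assumption based on the wording of the comments above.

```java
import java.util.List;

/** Illustrative only: all names here are hypothetical, not from the HDFS-1972 patch. */
public class StaleReplicaSketch {

    /** Per-datanode state as seen by a NameNode that has just become active. */
    static class DatanodeState {
        // Stale until this NN has seen a block report followed by a heartbeat
        // in which the DN acknowledges this NN as active.
        private boolean blockContentsStale = true;
        private boolean blockReportReceived = false;

        void onBlockReport() {
            blockReportReceived = true;
        }

        /** dnSeesUsAsActive: assumed flag saying the DN treats this NN as active. */
        void onHeartbeat(boolean dnSeesUsAsActive) {
            if (blockReportReceived && dnSeesUsAsActive) {
                blockContentsStale = false;
            }
        }

        boolean isStale() {
            return blockContentsStale;
        }
    }

    enum Action { NONE, POSTPONE, INVALIDATE_EXCESS }

    /** Counts how many of a block's replicas sit on stale datanodes. */
    static int countStale(List<DatanodeState> holders) {
        int stale = 0;
        for (DatanodeState dn : holders) {
            if (dn.isStale()) {
                stale++;
            }
        }
        return stale;
    }

    /** Acts on staleness only when the block looks over-replicated. */
    static Action decide(List<DatanodeState> holders, int expectedReplication) {
        if (holders.size() <= expectedReplication) {
            return Action.NONE; // not over-replicated, staleness is irrelevant
        }
        // A stale DN may already have deleted its replica on a prior NN's orders,
        // so removing another replica now could lose data. Postpone instead.
        return countStale(holders) > 0 ? Action.POSTPONE : Action.INVALIDATE_EXCESS;
    }
}
```

In this sketch a heartbeat that arrives before the block report does not clear the flag, which matches the "BR followed by a heartbeat" ordering in the suggested comment text.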
> HA: Datanode fencing mechanism
> ------------------------------
>
> Key: HDFS-1972
> URL: https://issues.apache.org/jira/browse/HDFS-1972
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: data-node, ha, name-node
> Reporter: Suresh Srinivas
> Assignee: Todd Lipcon
> Attachments: hdfs-1972-v1.txt, hdfs-1972.txt
>
>
> In high availability setup, with an active and standby namenode, there is a
> possibility of two namenodes sending commands to the datanode. The datanode
> must honor commands from only the active namenode and reject the commands
> from standby, to prevent corruption. This invariant must be complied with
> during fail over and other states such as split brain. This jira addresses
> issues related to this, design of the solution and implementation.