[
https://issues.apache.org/jira/browse/HDFS-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon updated HDFS-1972:
------------------------------
Attachment: hdfs-1972.txt
OK. I have an implementation of "Solution 2" from above which I believe works
correctly in all the cases. A few minor differences from what's described above:
- It turns out that the DN side of the block report didn't need to be modified.
Handling of the DNA_INVALIDATE command already calls {{FSDataset.invalidate}},
which removes the block from the DN-side volumeMap structure as soon as the
block is enqueued for asynchronous deletion. Since the block report is generated
from volumeMap, the block stops being reported right away.
One possible race here that is worth investigating: the DirectoryScanner could
find the block file right before it's deleted and re-add it to the volume map.
I'd like to address this as a follow-up, since the race is rare and writing a
decent test case for it will take a while.
- Rather than modifying the BR call to include a flag to acknowledge active
state, I changed the NN side to carry an additional flag which is set on the
first heartbeat after failover. This has the same effect, and I think it is a
little simpler than making a protocol change.
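To illustrate the first point above, here is a minimal sketch of the DN-side behavior: the block leaves the in-memory map synchronously, while the file deletion happens later on a background thread. The class and method names ({{VolumeMapSketch}}, {{addBlock}}, {{blockReport}}) are made up for illustration and are not the actual FSDataset code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for the DN-side block map; not the real FSDataset. */
class VolumeMapSketch {
    // Block ID -> path of the block file (illustrative representation).
    private final Map<Long, String> volumeMap = new ConcurrentHashMap<>();

    // Single daemon thread that performs the slow file deletions.
    private final ExecutorService deleter = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "async-block-deleter");
        t.setDaemon(true);
        return t;
    });

    void addBlock(long blockId, String path) {
        volumeMap.put(blockId, path);
    }

    /** Removes the block from the map right away and deletes the file later. */
    void invalidate(long blockId) {
        String path = volumeMap.remove(blockId);
        if (path != null) {
            deleter.execute(() -> deleteFile(path));
        }
    }

    /** The block report is built only from what is still in the map. */
    List<Long> blockReport() {
        List<Long> ids = new ArrayList<>(volumeMap.keySet());
        Collections.sort(ids);
        return ids;
    }

    private static void deleteFile(String path) {
        try {
            Files.deleteIfExists(Paths.get(path));
        } catch (IOException e) {
            // A real datanode would log this; ignored in the sketch.
        }
    }
}
```

In this sketch a block report taken immediately after {{invalidate}} no longer lists the block, whether or not the file is gone yet. It does not model the DirectoryScanner race mentioned above.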
I've also added a metric and metasave output for the list of postponed block
deletions.
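As a rough sketch of the NN-side idea, the following assumes that the newly active NN treats a DN's block contents as stale until that DN has heartbeated after the failover and then sent a block report, and that deletions touching a stale DN are postponed and counted. All names here ({{FencingSketch}}, {{DatanodeState}}, {{requestDeletion}}, and so on) are hypothetical and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical NN-side sketch; none of these names come from the patch. */
class FencingSketch {
    /** What the NN remembers about one datanode (illustrative). */
    static class DatanodeState {
        boolean heartbeatedSinceFailover = true;
        boolean blockContentsStale = false;

        void onFailover() {
            heartbeatedSinceFailover = false;
            blockContentsStale = true;
        }

        void onHeartbeat() {
            heartbeatedSinceFailover = true;
        }

        // Assumption: a report only counts once the DN has heartbeated
        // to this NN after the failover.
        void onBlockReport() {
            if (heartbeatedSinceFailover) {
                blockContentsStale = false;
            }
        }
    }

    private final Map<String, DatanodeState> datanodes = new HashMap<>();
    private final Set<Long> postponed = new LinkedHashSet<>();

    void register(String dn) { datanodes.put(dn, new DatanodeState()); }
    void becomeActive() { datanodes.values().forEach(DatanodeState::onFailover); }
    void heartbeat(String dn) { datanodes.get(dn).onHeartbeat(); }
    void blockReport(String dn) { datanodes.get(dn).onBlockReport(); }

    /** Returns true if the deletion may go ahead now; otherwise postpones it. */
    boolean requestDeletion(long blockId, List<String> replicaHolders) {
        for (String dn : replicaHolders) {
            if (datanodes.get(dn).blockContentsStale) {
                postponed.add(blockId);
                return false;
            }
        }
        postponed.remove(blockId);
        return true;
    }

    /** Value a metric or metasave-style dump could expose. */
    int postponedCount() { return postponed.size(); }
}
```

The point of the sketch is only that no protocol change is needed: the NN can infer from the order of heartbeat and block report that the DN has seen it as active.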
This patch applies on top of HDFS-2602 from
https://issues.apache.org/jira/secure/attachment/12506819/HDFS-2602.patch but
will probably need to be re-generated after that's committed.
> HA: Datanode fencing mechanism
> ------------------------------
>
> Key: HDFS-1972
> URL: https://issues.apache.org/jira/browse/HDFS-1972
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: data-node, name-node
> Reporter: Suresh Srinivas
> Assignee: Todd Lipcon
> Attachments: hdfs-1972-v1.txt, hdfs-1972.txt
>
>
> In a high-availability setup with an active and a standby namenode, both
> namenodes could send commands to the datanode. The datanode
> must honor commands from only the active namenode and reject commands
> from the standby, to prevent corruption. This invariant must hold
> during failover and in other states such as split brain. This jira addresses
> the related issues, the design of the solution, and its implementation.