[ 
https://issues.apache.org/jira/browse/HDFS-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1972:
------------------------------

    Attachment: hdfs-1972.txt

OK. I have an implementation of "Solution 2" from above which I believe works 
correctly in all cases. It differs from the description above in a few minor ways:

- It turns out that the DN side of the block report didn't need to be modified. 
Handling a DNA_INVALIDATE command already calls {{FSDataset.invalidate}}, which 
removes the block from the DN-side volumeMap structure as soon as the block is 
enqueued for asynchronous deletion. Since the block report is generated from 
volumeMap, the invalidated block no longer shows up in it.

One possible race here is worth investigating: the DirectoryScanner could 
find the block file right before it's deleted, and re-add it to the volume map. 
I'd like to address this as a follow-up since it's a rare race and it will take 
a while to write a decent test case for it.

- Rather than modifying the block report (BR) call to include a flag that 
acknowledges the active state, I changed the NN side to carry an additional 
flag which is set on the first heartbeat after failover. This has the same 
effect, and I think it was a little simpler than making a protocol change.
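
To illustrate the first point above, here is a minimal sketch of why the block 
report needs no DN-side change. It is not the real FSDataset code: the class 
{{VolumeMapSketch}}, its method names, and the in-memory queue standing in for 
asynchronous deletion are all hypothetical, chosen only to show the ordering.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the DN-side dataset; NOT the real FSDataset API.
class VolumeMapSketch {
    // Block ID -> block file path; block reports are generated from this map.
    private final Map<Long, String> volumeMap = new HashMap<>();
    // Files waiting to be deleted; models the asynchronous deletion queue.
    private final Queue<String> pendingDeletes = new ArrayDeque<>();

    synchronized void addBlock(long blockId, String path) {
        volumeMap.put(blockId, path);
    }

    // Drops the block from the map immediately; the file is only queued here.
    synchronized void invalidate(long blockId) {
        String path = volumeMap.remove(blockId);
        if (path != null) {
            pendingDeletes.add(path);
        }
    }

    // The report reflects the map, not what may still be on disk.
    synchronized Set<Long> blockReport() {
        return new TreeSet<>(volumeMap.keySet());
    }

    synchronized int pendingDeleteCount() {
        return pendingDeletes.size();
    }
}
```

In this model, the DirectoryScanner race mentioned above would correspond to 
something calling {{addBlock}} again while the file is still sitting in the 
pending queue.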

I've also added a metric and metasave output for the list of postponed block 
deletions.
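
As a rough sketch of the NN-side idea (a per-datanode flag set on the first 
heartbeat after failover, plus a countable list of postponed deletions), 
something like the following could work. Everything here is an assumption for 
illustration: the class {{FailoverDeletionSketch}} and its methods are 
hypothetical, and this message does not spell out the exact condition under 
which the patch releases postponed deletions. The sketch releases them on the 
first heartbeat purely to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical NN-side bookkeeping; not the classes used in the patch.
class FailoverDeletionSketch {
    // Datanodes that have heartbeated since this NN became active.
    private final Set<String> heartbeatedSinceFailover = new HashSet<>();
    // Datanode -> block IDs whose deletion is being held back.
    private final Map<String, List<Long>> postponed = new HashMap<>();

    // On failover, no datanode has acknowledged the new active NN yet.
    synchronized void becomeActive() {
        heartbeatedSinceFailover.clear();
    }

    // Returns true if the deletion may be sent now, false if it was postponed.
    synchronized boolean requestInvalidate(String datanode, long blockId) {
        if (heartbeatedSinceFailover.contains(datanode)) {
            return true;
        }
        postponed.computeIfAbsent(datanode, k -> new ArrayList<>()).add(blockId);
        return false;
    }

    // Sets the flag and hands back any deletions that were waiting on it.
    synchronized List<Long> onHeartbeat(String datanode) {
        heartbeatedSinceFailover.add(datanode);
        List<Long> released = postponed.remove(datanode);
        return released == null ? new ArrayList<>() : released;
    }

    // Value that a metric or a metasave-style dump could expose.
    synchronized int getPostponedCount() {
        int total = 0;
        for (List<Long> blocks : postponed.values()) {
            total += blocks.size();
        }
        return total;
    }
}
```

Keeping the flag on the NN means the datanode protocol stays unchanged, which 
is the simplification described above.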


This patch applies on top of HDFS-2602 from 
https://issues.apache.org/jira/secure/attachment/12506819/HDFS-2602.patch but 
will probably need to be re-generated after that's committed.
                
> HA: Datanode fencing mechanism
> ------------------------------
>
>                 Key: HDFS-1972
>                 URL: https://issues.apache.org/jira/browse/HDFS-1972
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node, name-node
>            Reporter: Suresh Srinivas
>            Assignee: Todd Lipcon
>         Attachments: hdfs-1972-v1.txt, hdfs-1972.txt
>
>
> In a high availability setup, with an active and a standby namenode, there is 
> a possibility of two namenodes sending commands to the datanode. The datanode 
> must honor commands from only the active namenode and reject commands from 
> the standby, to prevent corruption. This invariant must hold during failover 
> and in other states such as split brain. This jira addresses the related 
> issues, the design of the solution, and its implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
