[
https://issues.apache.org/jira/browse/HDFS-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon updated HDFS-2179:
------------------------------
Attachment: hdfs-2179.txt
Here's a preliminary version of this. I've included the basic framework code,
as well as two fencing implementations:
1) shell-command based fencing
2) ssh-based fencing that uses jsch to ssh into the target node and runs
{{fuser}} there to kill whatever process is holding the target port
None of this is integrated into the NN yet, since it's not clear what the
hook points will be. But if this looks like the right path, I'd like to
commit it to the HA branch, and we can adapt it to its integration points
(e.g. the failover controller) later.
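
To make the two approaches above concrete, here is a minimal sketch of what a
shell-command fencer could look like. It is not taken from hdfs-2179.txt: the
class name, method names, and timeout parameter are all hypothetical, and it
assumes a POSIX {{sh}} is available. It treats a zero exit status as "fencing
succeeded" and anything else (non-zero exit, timeout, failure to launch) as
"fencing failed". The {{fuser -k -n tcp <port>}} invocation is one plausible
form of the command the ssh-based method would run on the remote node; the
exact flags in the patch may differ.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; names are illustrative, not from the attached patch. */
final class ShellFenceSketch {

    private ShellFenceSketch() {}

    /**
     * Builds a fuser command that kills whatever process holds the given
     * TCP port. Assumed form; the ssh-based fencer would run this remotely.
     */
    static String fuserKillCommand(int port) {
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("bad port: " + port);
        }
        return "fuser -k -n tcp " + port;
    }

    /**
     * Runs the command through "sh -c" on the local machine.
     * Returns true only if the command exits with status 0 within the timeout.
     */
    static boolean tryFence(String command, long timeoutSeconds) {
        ProcessBuilder pb = new ProcessBuilder("sh", "-c", command);
        pb.inheritIO(); // pass output through so the child cannot block on a full pipe
        Process p;
        try {
            p = pb.start();
        } catch (IOException e) {
            return false; // could not launch the command at all
        }
        try {
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                p.destroyForcibly(); // timed out: fencing is treated as failed
                return false;
            }
            return p.exitValue() == 0;
        } catch (InterruptedException e) {
            p.destroyForcibly();
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

For example, {{ShellFenceSketch.tryFence("exit 0", 5)}} returns true and
{{ShellFenceSketch.tryFence("exit 1", 5)}} returns false. An ssh-based variant
would send the string from {{fuserKillCommand(port)}} over an ssh session to
the target node instead of running it locally; the port is whatever the old
active NN is listening on.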
> HA: namenode fencing mechanism
> ------------------------------
>
> Key: HDFS-2179
> URL: https://issues.apache.org/jira/browse/HDFS-2179
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: name-node
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hdfs-2179.txt
>
>
> In an HA cluster, when there are two NNs, the invariant that only one NN is
> active at a time has to be preserved in order to prevent "split brain
> syndrome." Thus, when a standby NN is transitioned to "active" state during a
> failover, it needs to somehow _fence_ the formerly active NN to ensure that
> it can no longer perform edits. This JIRA is to discuss and implement NN
> fencing.