[
https://issues.apache.org/jira/browse/HDFS-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079155#comment-13079155
]
Aaron T. Myers commented on HDFS-2179:
--------------------------------------
Patch looks pretty good, Todd. A few comments:
# Please add some comments to the {{FenceMethod}} interface
# I think {{FenceMethod}} should be public. It's entirely possible (if not
likely) that end users will want to implement their own {{FenceMethod}}s, and
they shouldn't need to put them in {{o.a.h.hdfs.server.namenode.ha}}.
# Please add some class comments to {{NodeFencer}}.
# Seems to me like {{NodeFencer.fence}} should be catching {{Exception}} thrown
by the individual methods. No reason not to try the other ones if some
exception other than {{BadFencingConfigurationException}} is thrown.
# In {{SshFenceByTcpPort.getNNPort}}, won't this return the port of the NN
that initiates the SSH connection, not necessarily the port of the NN being
SSHed into? This points to what may be a larger problem: I believe it's
presently impossible to configure the addresses of multiple NNs in a single
configuration.
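
To illustrate the {{NodeFencer.fence}} suggestion, here is a rough sketch of the loop I have in mind. The class names come from the patch, but the {{tryFence}} signature, the {{String}} target, and the logging are hypothetical stand-ins rather than the actual code in hdfs-2179.txt. The point is only that any exception from one method is logged and the next method still gets a chance.

```java
import java.util.ArrayList;
import java.util.List;

public class FencingSketch {

  /** Thrown when a fence method is misconfigured (name taken from the patch). */
  public static class BadFencingConfigurationException extends Exception {
    public BadFencingConfigurationException(String msg) {
      super(msg);
    }
  }

  /**
   * Hypothetical shape of a public FenceMethod interface.
   * The method name and parameter are assumptions for illustration.
   */
  public interface FenceMethod {
    /**
     * Attempt to fence the given target node.
     *
     * @return true if the node was fenced, false if the next method should be tried
     * @throws BadFencingConfigurationException if this method is misconfigured
     */
    boolean tryFence(String target) throws BadFencingConfigurationException;
  }

  /** Tries each configured fence method in order until one succeeds. */
  public static class NodeFencer {
    private final List<FenceMethod> methods;

    public NodeFencer(List<FenceMethod> methods) {
      this.methods = new ArrayList<>(methods);
    }

    public boolean fence(String target) {
      for (FenceMethod method : methods) {
        try {
          if (method.tryFence(target)) {
            return true;
          }
        } catch (BadFencingConfigurationException e) {
          System.err.println("Fence method misconfigured: " + e.getMessage());
        } catch (Exception e) {
          // Any other failure is logged, and we fall through to the next method.
          System.err.println("Fence method failed unexpectedly: " + e);
        }
      }
      // Every method either declined or failed.
      return false;
    }
  }

  public static void main(String[] args) {
    List<FenceMethod> methods = new ArrayList<>();
    methods.add(target -> { throw new IllegalStateException("ssh failed"); });
    methods.add(target -> true);
    System.out.println("fenced: " + new NodeFencer(methods).fence("nn1"));
  }
}
```

With this shape, a runtime failure in the first method no longer aborts the whole fencing attempt; {{fence}} only returns false once every configured method has been tried.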
> HA: namenode fencing mechanism
> ------------------------------
>
> Key: HDFS-2179
> URL: https://issues.apache.org/jira/browse/HDFS-2179
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: name-node
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hdfs-2179.txt
>
>
> In an HA cluster, when there are two NNs, the invariant that only one NN is
> active at a time has to be preserved in order to prevent "split brain
> syndrome." Thus, when a standby NN is transitioned to "active" state during a
> failover, it needs to somehow _fence_ the formerly active NN to ensure that
> it can no longer perform edits. This JIRA is to discuss and implement NN
> fencing.