[
https://issues.apache.org/jira/browse/HDFS-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085977#comment-13085977
]
Suresh Srinivas commented on HDFS-2231:
---------------------------------------
I had used VIP address to mean failover address, which seems to have caused the
confusion. Here is the second part rewritten:
For a discussion of the existing configuration, see the first part of:
https://issues.apache.org/jira/browse/HDFS-2231?focusedCommentId=13080279&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13080279
h2. Configuration requirements for HA
*Terminology:*
# NNAddress1, NNAddress2 - address of individual NNs. They could be logical
addresses.
# NNActiveAddress - Address where the active is running. This is one of
NNAddress1 or NNAddress2.
# NNStandbyAddress - Where the standby is running. This is one of NNAddress1 or
NNAddress2.
# NNFailoverAddress - The address of the active, used by HA setups that rely on
an IP failover mechanism.
*Requirements:*
# Backward compatibility: Existing deployments must be able to use the existing
configuration without any change.
# Datanodes and clients need to know both namenodes through configuration.
# As much as possible, the configuration for all nodes must be the same. The
special configuration required for different node types (namenodes, datanodes,
gateways) should be minimal.
h3. HA solution uses IP failover
# System needs to be configured with three sets of addresses,
NNFailoverAddress, NNAddress1 and NNAddress2.
# To get to the active namenode, clients use NNFailoverAddress.
# To discover NNStandbyAddress, clients and datanodes may use ZooKeeper or try
NNAddress1 and NNAddress2.
h3. Active and Standby namenode addresses without IP failover
# This setup does not require NNFailoverAddress.
# To discover NNActiveAddress and NNStandbyAddress, clients and datanodes may
try NNAddress1 and NNAddress2 or use ZooKeeper.
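A minimal sketch of the "try NNAddress1 and NNAddress2" discovery, written in Python for brevity. The {{probe}} callable is hypothetical: this comment does not define how a node reports whether it is active or standby, so the sketch takes that as an input (it could equally be backed by a ZooKeeper lookup).

```python
def discover_roles(addresses, probe):
    """Return which configured address is active and which is standby.

    addresses: the configured namenode addresses (NNAddress1, NNAddress2).
    probe: hypothetical callable that takes an address and returns
           "active", "standby", or None when the node is unreachable.
    """
    roles = {"active": None, "standby": None}
    for address in addresses:
        state = probe(address)
        # Record the first address seen in each state; ignore unknown states.
        if state in roles and roles[state] is None:
            roles[state] = address
    return roles
```

If neither address answers, both entries stay {{None}} and the caller decides whether to retry.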
h2. Proposal
h3. For solutions using IP Failover
# NNFailoverAddress-related settings go into the configuration (Set 1
above). I propose using the existing keys: DFS_NAMENODE_RPC_ADDRESS_KEY,
DFS_NAMENODE_SERVICE_RPC_ADDRESS_KEY, DFS_NAMENODE_HTTP_ADDRESS_KEY,
DFS_NAMENODE_HTTPS_ADDRESS_KEY
h3. Generic part common to both VIP and non-VIP based solutions
*How do we add both namenodes into a common configuration?*
Datanodes need to know both namenode addresses. I propose adding
DFS_NAMENODE_IDS (dfs.namenode.ids), whose value is a comma-separated list of
ids (any appropriate string). Add (Set 2) suffixed with "." + <NamenodeID>.
Clients and datanodes can read DFS_NAMENODE_IDS and use the suffix to get the
corresponding parameters to use.
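To make the suffix lookup concrete, here is a rough sketch in Python. It assumes the configuration has already been loaded into a plain dict of key/value strings; the function name is made up for illustration and only the rpc-address key is shown.

```python
def namenode_rpc_addresses(conf):
    """Map each NamenodeID in dfs.namenode.ids to its suffixed rpc-address."""
    raw_ids = conf.get("dfs.namenode.ids", "")
    # Tolerate whitespace around the commas, as in "nn1, nn2".
    ids = [i.strip() for i in raw_ids.split(",") if i.strip()]
    addresses = {}
    for nn_id in ids:
        key = "dfs.namenode.rpc-address." + nn_id
        if key in conf:
            addresses[nn_id] = conf[key]
    return addresses
```

The same pattern would apply to the other suffixed keys (service rpc, http, https).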
*How does namenode know its NamenodeID and what configuration parameters to
load?*
A namenode discovers its own configuration from the parameter DFS_NAMENODE_ID
(dfs.namenode.id). On namenodes, an xml include points to a file that sets
DFS_NAMENODE_ID to the corresponding NamenodeID. On other nodes, such as
datanodes and client gateway machines, the xml include points to an empty file.
I like Todd's proposal, where a namenode that sees an empty or unconfigured
DFS_NAMENODE_ID could try binding to each rpc address; when a bind succeeds, it
discovers its NamenodeID from the suffix of that config param. (We could drop
DFS_NAMENODE_ID altogether).
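A rough sketch of the bind-based discovery, in Python and assuming the configuration is a plain dict. The function names are hypothetical and this is not how the namenode is actually implemented; it only illustrates the idea that the first rpc address this host can bind to identifies the NamenodeID.

```python
import socket


def can_bind(host, port):
    """Return True if this machine can bind a TCP socket to host:port."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind((host, port))
        return True
    except OSError:
        # Covers non-local addresses, unresolvable hosts, and ports in use.
        return False


def discover_namenode_id(conf, try_bind=can_bind):
    """Use dfs.namenode.id if set; otherwise infer the id by binding."""
    explicit = conf.get("dfs.namenode.id", "").strip()
    if explicit:
        return explicit
    for nn_id in conf.get("dfs.namenode.ids", "").split(","):
        nn_id = nn_id.strip()
        address = conf.get("dfs.namenode.rpc-address." + nn_id) if nn_id else None
        if address is None:
            continue
        host, _, port = address.rpartition(":")
        if try_bind(host, int(port)):
            return nn_id
    return None  # not a namenode, e.g. a datanode or gateway
```

One caveat with this sketch: if the namenode addresses are logical addresses that can move between hosts, a successful bind may not identify the node uniquely.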
Example for deployments without IP failover:
NNAddress1 = host1:port
NNAddress2 = host2:port
{noformat}
<property>
<name>dfs.namenode.ids</name>
<value>nn1, nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn1</name>
<value>host1:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn2</name>
<value>host2:port</value>
</property>
{noformat}
Example for deployments with IP failover:
NNFailoverAddress = failoverAddress:port
NNAddress1 = host1:port
NNAddress2 = host2:port
{noformat}
<property>
<name>dfs.namenode.ids</name>
<value>nn1, nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address</name>
<value>failoverAddress:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn1</name>
<value>host1:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn2</name>
<value>host2:port</value>
</property>
{noformat}
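To show how one client code path could serve both examples, here is a small Python sketch, again assuming the configuration is a plain dict and using a made-up function name. If the unsuffixed key is set (IP failover, or an existing single-namenode deployment), it is used as is; otherwise the per-ID addresses are the candidates for the active.

```python
def active_candidates(conf):
    """Addresses a client could try, in order, to reach the active namenode."""
    failover = conf.get("dfs.namenode.rpc-address")
    if failover:
        # NNFailoverAddress (or a legacy config): no per-ID lookup needed.
        return [failover]
    ids = [i.strip() for i in conf.get("dfs.namenode.ids", "").split(",") if i.strip()]
    keys = ["dfs.namenode.rpc-address." + i for i in ids]
    return [conf[k] for k in keys if k in conf]


without_failover = {
    "dfs.namenode.ids": "nn1, nn2",
    "dfs.namenode.rpc-address.nn1": "host1:port",
    "dfs.namenode.rpc-address.nn2": "host2:port",
}
with_failover = dict(without_failover)
with_failover["dfs.namenode.rpc-address"] = "failoverAddress:port"

print(active_candidates(without_failover))  # ['host1:port', 'host2:port']
print(active_candidates(with_failover))     # ['failoverAddress:port']
```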
> Configuration changes for HA namenode
> -------------------------------------
>
> Key: HDFS-2231
> URL: https://issues.apache.org/jira/browse/HDFS-2231
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Suresh Srinivas
> Assignee: Suresh Srinivas
> Fix For: HA branch (HDFS-1623)
>
>
> This jira tracks the changes required for configuring HA setup for namenodes.