[jira] [Commented] (HDFS-2231) Configuration changes for HA namenode

Suresh Srinivas (JIRA) Tue, 16 Aug 2011 14:16:52 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085977#comment-13085977
 ]


Suresh Srinivas commented on HDFS-2231:
---------------------------------------

I had used "VIP address" to mean the failover address, which seems to have 
caused the confusion. Here is the second part, rewritten:

For a discussion of the existing configuration, see the first part of: 
https://issues.apache.org/jira/browse/HDFS-2231?focusedCommentId=13080279&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13080279

h2. Configuration requirements for HA
*Terminology:*
# NNAddress1, NNAddress2 - addresses of the individual NNs. They could be 
logical addresses.
# NNActiveAddress - address where the active NN is running. This is one of 
NNAddress1 or NNAddress2.
# NNStandbyAddress - address where the standby NN is running. This is one of 
NNAddress1 or NNAddress2.
# NNFailoverAddress - address of the active NN, used by HA setups that rely on 
an IP failover mechanism.

*Requirements:*
# Backward compatibility: Existing deployments must be able to use the existing 
configuration without any change.
# Datanodes and clients need to know about both namenodes through 
configuration.
# As much as possible, the configuration for all the nodes must be the same. 
The special configuration required for different node types (namenodes, 
datanodes, gateways) should be minimal.

h3. HA solution uses IP failover
# The system needs to be configured with three sets of addresses: 
NNFailoverAddress, NNAddress1 and NNAddress2.
# To get to the active namenode, clients use NNFailoverAddress.
# To discover NNStandbyAddress, clients and datanodes may use ZooKeeper or try 
NNAddress1 and NNAddress2.

h3. Active and Standby namenode addresses without IP failover
# This setup does not require NNFailoverAddress.
# To discover NNActiveAddress and NNStandbyAddress, clients and datanodes may 
try NNAddress1 and NNAddress2 or use ZooKeeper.
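
The "try NNAddress1 and NNAddress2" discovery in the two setups above could be sketched roughly as follows. This is only an illustration: {{probe_state}} is a hypothetical callable standing in for whatever RPC or ZooKeeper lookup ends up reporting a namenode's state, and the "active"/"standby" strings are assumed labels, not part of this proposal.

```python
from typing import Callable, Dict, Iterable, Optional

ACTIVE = "active"
STANDBY = "standby"


def discover_roles(
    addresses: Iterable[str],
    probe_state: Callable[[str], Optional[str]],
) -> Dict[str, Optional[str]]:
    """Try each configured namenode address and record who is active/standby.

    probe_state is a hypothetical stand-in: it returns "active", "standby",
    or None when the address is unreachable or the state is unknown.
    """
    roles: Dict[str, Optional[str]] = {ACTIVE: None, STANDBY: None}
    for address in addresses:
        state = probe_state(address)
        # Keep the first address seen for each role; ignore unknown states.
        if state in roles and roles[state] is None:
            roles[state] = address
    return roles
```

With IP failover only the standby entry would be needed, since clients already reach the active namenode through NNFailoverAddress.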

h2. Proposal
h3. For solutions using IP Failover
# Configuration related to NNFailoverAddress goes into the configuration (Set 1 
above). I propose using the existing keys: DFS_NAMENODE_RPC_ADDRESS_KEY, 
DFS_NAMENODE_SERVICE_RPC_ADDRESS_KEY, DFS_NAMENODE_HTTP_ADDRESS_KEY, 
DFS_NAMENODE_HTTPS_ADDRESS_KEY.

h3. Generic part common to both VIP and non-VIP based solutions
*How do we add both namenodes into a common configuration?*
Datanodes need to know both the namenode addresses. I propose adding 
DFS_NAMENODE_IDS (dfs.namenode.ids), a comma-separated list of ids (any 
appropriate strings), and adding the (Set 2) keys suffixed with 
"." + <NamenodeID>. Clients and datanodes can read DFS_NAMENODE_IDS and use 
each id as the suffix to look up the corresponding parameters.
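
A minimal sketch of that suffix lookup, assuming the configuration has already been loaded into a plain dictionary. The key names come from the examples in this comment; the function names are made up for illustration and are not existing Hadoop APIs.

```python
from typing import Dict, List

NAMENODE_IDS_KEY = "dfs.namenode.ids"
RPC_ADDRESS_KEY = "dfs.namenode.rpc-address"


def namenode_ids(conf: Dict[str, str]) -> List[str]:
    """Split the comma-separated id list, tolerating spaces as in "nn1, nn2"."""
    raw = conf.get(NAMENODE_IDS_KEY, "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def per_namenode_values(conf: Dict[str, str], base_key: str) -> Dict[str, str]:
    """Map each NamenodeID to the value of base_key + "." + NamenodeID.

    Ids that have no suffixed key in the configuration are skipped.
    """
    resolved: Dict[str, str] = {}
    for nn_id in namenode_ids(conf):
        value = conf.get(base_key + "." + nn_id)
        if value is not None:
            resolved[nn_id] = value
    return resolved
```

The same helper would work for any of the (Set 2) keys by passing a different {{base_key}}.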

*How does a namenode know its NamenodeID and what configuration parameters to 
load?*
The namenode discovers its own configuration from the parameter 
DFS_NAMENODE_ID (dfs.namenode.id). On namenodes, an xml include points to a 
file that sets DFS_NAMENODE_ID to the corresponding NamenodeID. On other nodes, 
such as datanodes and client gateway machines, the xml include points to an 
empty file. I like Todd's proposal, where a namenode that sees an empty or 
unconfigured DFS_NAMENODE_ID could try binding to each configured rpc address; 
when a bind succeeds, it discovers its NamenodeID from the suffix of that 
config param. (We could then drop DFS_NAMENODE_ID altogether.)
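
The bind-based discovery could look roughly like the sketch below. It assumes the configuration is a plain dictionary and that addresses are "host:port" strings with a numeric port; {{can_bind}} and {{discover_namenode_id}} are illustrative names, not existing Hadoop code.

```python
import socket
from typing import Callable, Dict, Optional


def can_bind(address: str) -> bool:
    """Return True if this machine can bind a TCP socket to host:port."""
    host, _, port = address.rpartition(":")
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, int(port)))
        return True
    except (OSError, ValueError, OverflowError):
        # Not a local address, port in use, or a malformed address.
        return False


def discover_namenode_id(
    conf: Dict[str, str],
    bind_check: Callable[[str], bool] = can_bind,
) -> Optional[str]:
    """Return the explicit dfs.namenode.id if set, else the first id we can bind."""
    explicit = conf.get("dfs.namenode.id", "").strip()
    if explicit:
        return explicit
    raw_ids = conf.get("dfs.namenode.ids", "")
    for nn_id in (part.strip() for part in raw_ids.split(",")):
        if not nn_id:
            continue
        address = conf.get("dfs.namenode.rpc-address." + nn_id)
        if address and bind_check(address):
            return nn_id
    return None  # e.g. a datanode or gateway: none of the addresses is local
```

A successful bind only shows that the address is local and the port is free, so this sketch would be ambiguous if both namenodes ran on the same host.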

Example for deployments without IP failover:
NNAddress1 = host1:port
NNAddress2 = host2:port

{noformat}
<property>
<name>dfs.namenode.ids</name>
<value>nn1, nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn1</name>
<value>host1:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn2</name>
<value>host2:port</value>
</property>
{noformat}
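
For illustration, the fragment above can be loaded into a name/value map with a few lines of Python. The sketch assumes the properties sit inside a {{<configuration>}} root element, as in Hadoop site files; it does not handle xml includes, and {{load_properties}} is a made-up helper name.

```python
import xml.etree.ElementTree as ET
from typing import Dict

EXAMPLE = """
<configuration>
<property>
<name>dfs.namenode.ids</name>
<value>nn1, nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn1</name>
<value>host1:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn2</name>
<value>host2:port</value>
</property>
</configuration>
"""


def load_properties(xml_text: str) -> Dict[str, str]:
    """Collect every <property> element's <name>/<value> pair into a dict."""
    root = ET.fromstring(xml_text)
    props: Dict[str, str] = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props
```

Calling {{load_properties(EXAMPLE)}} yields the three keys shown in the fragment, ready for the per-NamenodeID lookup described earlier.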

Example for deployments with IP failover:
NNFailoverAddress = failoverAddress:port
NNAddress1 = host1:port
NNAddress2 = host2:port

{noformat}
<property>
<name>dfs.namenode.ids</name>
<value>nn1, nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address</name>
<value>failoverAddress:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn1</name>
<value>host1:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn2</name>
<value>host2:port</value>
</property>
{noformat}
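
Putting the two examples together, a client's choice of which address to contact might be sketched as below. This is one possible reading of the proposal, not a settled design: when the unsuffixed key is present (IP failover, or an existing non-HA deployment) it is used as-is, otherwise the per-NamenodeID addresses are returned as candidates to try. {{client_targets}} is a hypothetical name.

```python
from typing import Dict, List

RPC_ADDRESS_KEY = "dfs.namenode.rpc-address"
NAMENODE_IDS_KEY = "dfs.namenode.ids"


def client_targets(conf: Dict[str, str]) -> List[str]:
    """Return the namenode RPC addresses a client would try, in order."""
    failover = conf.get(RPC_ADDRESS_KEY, "").strip()
    if failover:
        # NNFailoverAddress, or the single address of an existing deployment.
        return [failover]
    targets: List[str] = []
    for nn_id in conf.get(NAMENODE_IDS_KEY, "").split(","):
        nn_id = nn_id.strip()
        address = conf.get(RPC_ADDRESS_KEY + "." + nn_id) if nn_id else None
        if address:
            targets.append(address)
    return targets
```

Because an unsuffixed key short-circuits the lookup, a configuration that predates dfs.namenode.ids keeps working unchanged, which is what the backward compatibility requirement asks for.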



> Configuration changes for HA namenode
> -------------------------------------
>
>                 Key: HDFS-2231
>                 URL: https://issues.apache.org/jira/browse/HDFS-2231
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: HA branch (HDFS-1623)
>
>
> This jira tracks the changes required for configuring HA setup for namenodes.

