Hi Liang Yanbo ☺

Thanks for your reply. I got the federated name node to work – with some changes, though. Posting for the benefit of the group.
Step 4) I created an ssh key on my secondary (federated) name node as well (using ssh-keygen) and copied it to all other data nodes and the existing name node using ssh-copy-id.

Step 5) I added the following entry to my hdfs-site.xml (please note that the default port is 50020, as opposed to what I noted earlier, and the format of the value is "IP address:port"):

    <configuration>
      <!-- Other properties ... -->
      <property>
        <name>dfs.datanode.ipc.address</name>
        <value>0.0.0.0:50020</value>
        <description>Data node IPC port</description>
      </property>
      <!-- Other properties ... -->
    </configuration>

Step 6) I restarted my data nodes:

    hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode
    hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode

Step 7) I refreshed my data nodes:

    hdfs dfsadmin -refreshNamenodes <my ip>:50020

And presto – I am all set.

Yogesh Devi, Architect, Dell Cloud Clinical Archive
Dell | Land phone: +91 80 28413000, extension 2781 | Hand phone: +91 99014 71082

From: Yanbo Liang [mailto:[email protected]]
Sent: Saturday, August 16, 2014 3:20 PM
To: [email protected]
Subject: Re: Problems with the federated name node configuration

- Do you see anything wrong in the above configuration?

Looks all right.

- Where am I supposed to run this (on name nodes, data nodes, or on every node)?

Run it on all DataNodes – refresh all DataNodes so they pick up the newly added NameNode.

- I suppose the default data node RPC port is "8020" – and I should be able to set it by a property in hdfs-site.xml (dfs.datanode.ipc.address) – is that correct?

Yes.

- Regarding SSH configuration – I have created an ssh key only on my primary node (using ssh-keygen) and copied it to all other data nodes and the new name node using ssh-copy-id. Would it be necessary to create a key for the new name node as well?

Yes.
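Steps 6 and 7 have to be repeated on every DataNode. As a dry-run sketch only, the snippet below prints the command sequence for the two DataNode hostnames used in this thread (sles-hdfs2 and sles-hdfs5, with the 50020 IPC port from Step 5 – substitute your own hosts and port) rather than executing anything:

```shell
# Dry-run sketch: print the restart + refresh commands for each DataNode.
# Hostnames and the 50020 port are the ones from this thread; adjust as needed.
refresh_cmds() {
  for dn in sles-hdfs2 sles-hdfs5; do
    printf '%s\n' \
      "ssh $dn hadoop-daemon.sh --config \$HADOOP_CONF_DIR --script hdfs stop datanode" \
      "ssh $dn hadoop-daemon.sh --config \$HADOOP_CONF_DIR --script hdfs start datanode" \
      "hdfs dfsadmin -refreshNamenodes $dn:50020"
  done
}

refresh_cmds
```

Replacing the printf with direct execution of those commands gives the actual procedure, assuming passwordless ssh is set up from the node you run it on (as in Step 4).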
2014-08-16 13:35 GMT+08:00 <[email protected]>:

Hello,

I am an HDFS newbie. I am using Hadoop version 2.4.1, following the instructions for cluster set-up from http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/ClusterSetup.html and for namenode federation from http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/Federation.html

I have set up an HDFS cluster with one name node and two data nodes successfully (with ease ☺). I am, however, having challenges setting up a federated name node. All my machines are SUSE Linux SLES 11.

Here are the steps that I followed for adding a federated name node to my working cluster:

Step 1: I set up a new SLES 11 VM and installed HDFS on it.

Step 2: I changed the config in my hdfs-site.xml as follows and deployed it on all machines:

    <configuration>
      <property>
        <name>dfs.nameservices</name>
        <value>ns1,ns2</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hduser/mydata/hdfs/namenode</value>
        <description>Space for name node to persist stuff</description>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.ns1</name>
        <value>sles-hdfs1:9000</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.ns1</name>
        <value>sles-hdfs1:50070</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.ns2</name>
        <value>sles-hdfs4:9000</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.ns2</name>
        <value>sles-hdfs2:50070</value>
      </property>
      <property>
        <name>dfs.namenode.hosts</name>
        <value>sles-hdfs2,sles-hdfs5</value>
        <description>List of allowed data nodes</description>
      </property>
    </configuration>

Step 3: I formatted my new name node with the same cluster ID that I used for my first (working) name node:

    hdfs namenode -format -clusterId CID-085f6f5f-784f-4b00-b3bf-937f2dd7808a

Step 4: I start the new name node and it starts successfully:

    hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode

Hereafter, the instructions are somewhat unclear. Next,
I am supposed to run the command:

    hdfs dfsadmin -refreshNamenodes <datanode_host_name>:<datanode_rpc_port>

The questions that I have are:

- Do you see anything wrong in the above configuration?
- Where am I supposed to run this (on name nodes, data nodes, or on every node)?
- I suppose the default data node RPC port is "8020" – and I should be able to set it by a property in hdfs-site.xml (dfs.datanode.ipc.address) – is that correct?
- Regarding SSH configuration – I have created an ssh key only on my primary node (using ssh-keygen) and copied it to all other data nodes and the new name node using ssh-copy-id. Would it be necessary to create a key for the new name node as well?

Just FYI – the server names of my nodes:

    sles-hdfs1 – primary name node
    sles-hdfs2 – one data node
    sles-hdfs5 – another data node
    sles-hdfs4 – new federated name node
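Putting the thread together for the benefit of the group: combining the federation properties from Step 2 with the DataNode IPC address from the reply above, the essential hdfs-site.xml pieces look roughly like the fragment below. This is a sketch, not a verbatim copy of the final file – hostnames and ports are the ones used in this thread, and other properties (name directories, HTTP addresses) are elided:

```xml
<configuration>
  <!-- Federation: two name services -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>sles-hdfs1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>sles-hdfs4:9000</value>
  </property>
  <!-- DataNode IPC address: the port targeted by
       hdfs dfsadmin -refreshNamenodes <datanode>:50020 -->
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:50020</value>
  </property>
</configuration>
```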
