- Do you see anything wrong in the above configuration? It looks all right to me.
- Where am I supposed to run this (on name nodes, data nodes or on every node)? Run it against every DataNode: each DataNode has to be refreshed so that it picks up the newly added NameNode. There is a small sketch of the commands just below these answers.
- I suppose the default data node rpc port is “8020” – and I should be able to set it by a property in hdfs-site.xml (dfs.datanode.ipc.address) – is that correct? Yes, dfs.datanode.ipc.address is the right property. Note, though, that its stock default is 0.0.0.0:50020; 8020 is the usual NameNode RPC port, not the DataNode one.
- Regarding SSH configuration – I have created a ssh cert only on my primary node (using ssh-keygen) and copied it on all other data and the new name nodes using ssh-copy-id. Would it be necessary to create cert for the new name node as well? Yes (a short sketch for that is below as well).
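For the refresh step, something along these lines is what I would run. This is only a sketch: the host names are the DataNodes from your mail, and I am assuming the DataNode IPC port is left at its default of 50020. Also note the command is "hdfs dfsadmin", and the option is spelled -refreshNamenodes (the "dfadmin -refreshNameNode" in your mail looks like a typo).

  # Run once per DataNode, from any machine that has the cluster config.
  # <host>:<port> is the DataNode's IPC address, not a NameNode address.
  hdfs dfsadmin -refreshNamenodes sles-hdfs2:50020
  hdfs dfsadmin -refreshNamenodes sles-hdfs5:50020

If you want to pin the DataNode IPC port explicitly, the property goes into hdfs-site.xml on the DataNodes, roughly like this:

  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:50020</value>
    <description>DataNode IPC server address and port</description>
  </property>

And for SSH: a key pair on the new NameNode is what lets you run the start/stop scripts from sles-hdfs4 as well. Assuming the hduser account from your config, roughly:

  # On sles-hdfs4 (the new NameNode)
  ssh-keygen -t rsa
  ssh-copy-id hduser@sles-hdfs1
  ssh-copy-id hduser@sles-hdfs2
  ssh-copy-id hduser@sles-hdfs5
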
2014-08-16 13:35 GMT+08:00 <[email protected]>:

> Hello,
>
> I am a HDFS newbie.
> I am using Hadoop version 2.4.1
> and following instructions for cluster set-up from
> http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/ClusterSetup.html
> and for namenode federation from
> http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/Federation.html
>
> I have set-up a HDFS cluster with one name-node and two data-nodes
> successfully (with ease :))
>
> I however am having challenges setting up a federated name-node.
>
> All my machines are Suse Linux SLES 11.
>
> Here are the steps that I followed for adding a federated name node to my
> working cluster:
>
> Step 1: I set up a new SLES 11 VM and installed HDFS on that.
>
> Step 2: Changed config in my hdfs-site.xml as follows and deployed on all
> machines:
>
> <configuration>
> <property>
> <property>
> <name>dfs.nameservices</name>
> <value>ns1,ns2</value>
> </property>
> <name>dfs.namenode.name.dir</name>
> <value>file:/home/hduser/mydata/hdfs/namenode</value>
> <description>Space for name node to persist stuff</description>
> </property>
> <property>
> <name>dfs.namenode.rpc-address.ns1</name>
> <value>sles-hdfs1:9000</value>
> </property>
> <property>
> <name>dfs.namenode.http-address.ns1</name>
> <value>sles-hdfs1:50070</value>
> </property>
> <property>
> <name>dfs.namenode.rpc-address.ns2</name>
> <value>sles-hdfs4:9000</value>
> </property>
> <property>
> <name>dfs.namenode.http-address.ns2</name>
> <value>sles-hdfs2:50070</value>
> </property>
> <property>
> <name>dfs.namenode.hosts</name>
> <value>sles-hdfs2,sles-hdfs5</value>
> <description>List of allowed data nodes</description>
> </property>
> </configuration>
>
> Step 3: I formatted my new name node with the same cluster id that I used for my
> first (working) name node:
> hdfs namenode -format -clusterId CID-085f6f5f-784f-4b00-b3bf-937f2dd7808a
>
> Step 4: I start the new name node and it starts successfully:
> hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
>
> Hereafter – the instructions are somewhat unclear –
>
> Next I am supposed to run the command
>
> $ hdfs dfadmin -refreshNameNode <datanode_host_name>:<datanode_rpc_port>
>
> Questions that I have are:
>
> - Do you see anything wrong in above configuration ?
> - Where am I supposed to run this ( on name nodes, data nodes or on every node) ?
> - I suppose the default data node rpc port is “8020” – and I should be able to
>   set it by a property in hdfs-site.xml ( dfs.datanode.ipc.address ) – is that correct ?
> - Regarding SSH configuration – I have created a ssh cert only on my primary node
>   ( using ssh-keygen) and copied it on all other data and the new name nodes using
>   ssh-copy-id. Would it be necessary to create cert for the new name node as well ?
>
> Just FYI – server names of my nodes:
> sles-hdfs1 - primary name node
> sles-hdfs2 - One Data node
> sles-hdfs5 - Another Data node
> sles-hdfs4 - new federated name node
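
PS: once the refresh has run, one way to check that the DataNodes really registered with both NameNodes is to ask each NameNode for a report. A sketch, using the RPC addresses from your hdfs-site.xml (-fs is the generic client option for pointing at a specific NameNode):

  hdfs dfsadmin -fs hdfs://sles-hdfs1:9000 -report
  hdfs dfsadmin -fs hdfs://sles-hdfs4:9000 -report

Both should list sles-hdfs2 and sles-hdfs5 as live DataNodes.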
