Sumadhur, (Inline)
On Tue, May 1, 2012 at 8:28 AM, sumadhur <sumadhur_i...@yahoo.com> wrote:
>
> I am on hadoop 0.20.
>
> To add a data node to a cluster, if we do not use the include/exclude/slaves
> files, do we need to do anything other than configuring the hdfs-site.xml to
> point to name node and the mapred-site.xml to point to job tracker?
>
> For example, should the job tracker and name node be restarted always?

Just booting up the DN service with the right config and a configured
network for proper communication should suffice. In case you're using
rack-awareness, ensure you update the rack-awareness script for your
new node and refresh the NN before you start your DN. A restart of the
NN or JT isn't required for adding new nodes to the cluster.

> On a related note, if we restart a data node(that has some blocks on it) and
> the data node now has new IP address, Should we restart namenode/job tracker
> for hdfs and map-reduce to function correctly?
> Would the blocks on the restarted data node be detected or would hdfs think
> that these blocks were lost and start replicating them?

Stopping the DN cleanly, changing its IP/hostname, and starting it back
up should not cause any block movement.

--
Harsh J
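
P.S. In case it helps with the rack-awareness step, here is a minimal sketch of what such a script could look like. It assumes the usual contract (the script set via topology.script.file.name is called with one or more IPs/hostnames as arguments and prints one rack path per argument). The addresses, rack names and the resolve_racks helper are made up for illustration; adjust them to your cluster.

```python
#!/usr/bin/env python
import sys

# Hypothetical host-to-rack table; replace with your own nodes.
RACKS = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",  # entry added for the new DataNode
}

# Rack reported for any host that is not listed above.
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts, racks=RACKS, default=DEFAULT_RACK):
    """Return one rack path per host, in the same order as the input."""
    return [racks.get(host, default) for host in hosts]


if __name__ == "__main__":
    # One rack per argument, separated by whitespace.
    print(" ".join(resolve_racks(sys.argv[1:])))
```

Adding the new node's entry to the table before starting the DN is the "update the script" part; anything missing from the table simply falls back to the default rack.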