Re: adding or restarting a data node in a hadoop cluster

Harsh J Mon, 30 Apr 2012 20:20:16 -0700

Sumadhur,

(Inline)

On Tue, May 1, 2012 at 8:28 AM, sumadhur <sumadhur_i...@yahoo.com> wrote:
>
> I am on hadoop 0.20.
>
> To add a data node to a cluster, if we do not use the include/exclude/slaves
> files, do we need to do anything other than configuring the hdfs-site.xml to
> point to the name node and the mapred-site.xml to point to the job tracker?
>
> For example, should the job tracker and name node be restarted always?

Just starting the DN service with the right config, on a network where
it can communicate properly with the NN, should suffice.
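
If you want to confirm the network part before starting the DN, a
quick reachability check run from the new node can help. This is only
a sketch: the host names and ports in the usage note are hypothetical
placeholders, so substitute whatever your own config files point to.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def unreachable(endpoints, timeout=3.0):
    """Return the names of the endpoints that could not be reached.

    endpoints maps a label to a (host, port) tuple.
    """
    return [name for name, (host, port) in endpoints.items()
            if not can_reach(host, port, timeout)]
```

For example, unreachable({"NN": ("namenode.example.com", 8020), "JT":
("jobtracker.example.com", 8021)}) returns the labels that failed; an
empty list means both master ports accepted a connection.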

In case you're using rack-awareness, ensure you update the
rack-awareness script for your new node and refresh the NN before you
start your DN.
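
For illustration, a rack-awareness script can be as small as the
sketch below. Hadoop calls the script named by
topology.script.file.name with one or more node addresses as
arguments and expects one rack path per argument on stdout. The
addresses and rack names here are made up, so treat the table as an
assumption and fill in your own.

```python
#!/usr/bin/env python
import sys

# Hypothetical address-to-rack table; replace with your own nodes.
RACKS = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",  # entry added for the new DN
}

# Rack reported for any address missing from the table.
DEFAULT_RACK = "/default-rack"


def resolve_racks(nodes):
    """Map each address to its rack path, preserving argument order."""
    return [RACKS.get(node, DEFAULT_RACK) for node in nodes]


if __name__ == "__main__":
    # Print one rack per argument, separated by spaces.
    print(" ".join(resolve_racks(sys.argv[1:])))
```

Adding the new node is then a one-line change to the table, made
before the DN is started.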

A restart of the NN or JT isn't required for adding new nodes to the cluster.

> On a related note, if we restart a data node (that has some blocks on it) and
> the data node now has a new IP address, should we restart the namenode/job
> tracker for hdfs and map-reduce to function correctly?
> Would the blocks on the restarted data node be detected, or would hdfs think
> that these blocks were lost and start replicating them?

Stopping the DN, cleanly changing its IP/hostname, and starting it
back up should not cause any block movement.

-- 
Harsh J
