You can have a new TaskTracker or DataNode join the cluster by just starting 
that daemon on the slave (e.g. bin/hadoop-daemon.sh start tasktracker) and 
making sure it is configured to connect to the right JobTracker or NameNode 
(through the mapred.job.tracker and fs.default.name properties in the config 
files). The slaves file is only used by the bin/start-* and bin/stop-* 
scripts; Hadoop doesn't look at it at runtime. There are other files it can 
consult, such as an exclude list (dfs.hosts.exclude), but I think that in the 
default configuration you can just launch the daemon and it will join.
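For concreteness, here's a minimal sketch of the configuration and commands 
involved, assuming the 0.20.x config file layout (the host names and ports 
below are made-up examples, not defaults you can rely on):

  <!-- conf/core-site.xml on the new slave -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>

  <!-- conf/mapred-site.xml on the new slave -->
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:9001</value>
  </property>

Then start the daemons on the slave itself:

  bin/hadoop-daemon.sh start datanode
  bin/hadoop-daemon.sh start tasktracker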

Note that if you add a new DataNode, Hadoop won't automatically move existing 
data onto it (to spread the data out across the cluster) unless you run the 
HDFS rebalancer, at least as far as I know.
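If you do want to spread existing blocks onto the new node, the balancer 
ships with the standard scripts. A rough sketch (the -threshold value, the 
allowed percent deviation in disk usage across nodes, is just an example):

  # run from any cluster node; iterates until balanced or stopped
  bin/start-balancer.sh -threshold 10

  # stop it early if it's using too much bandwidth
  bin/stop-balancer.sh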

Matei

On Jun 30, 2011, at 8:56 PM, Paul Rimba wrote:

> Hey there,
> 
> I am trying to add a new datanode/tasktracker to a currently running cluster.
> 
> Is this feasible? And if yes, how do I change the masters, slaves, and 
> dfs.replication (in hdfs-site.xml) configuration?
> 
> Can I add the new slave to the slaves configuration file while the cluster is 
> running?
> 
> I found this ./bin/hadoop dfs -setrep -w 4 /path/to/file command to change 
> dfs.replication on the fly.
> 
> Is there a better way to do it?
> 
> 
> 
> Thank you for your kind attention.
> 
> 
> Kind Regards,
> Paul
