Hey Matei,

What if you do bin/hadoop-daemon.sh start tasktracker and bin/hadoop-daemon.sh start datanode?
Does it move the old data to the new slave? I ran that scenario a couple of times and then ran start-balancer.sh. It always says that the cluster is balanced. Does that mean that the data has been spread out?

Thanks,
Paul

On Fri, Jul 1, 2011 at 2:05 PM, Matei Zaharia <ma...@eecs.berkeley.edu> wrote:

> You can have a new TaskTracker or DataNode join the cluster by just
> starting that daemon on the slave (e.g. bin/hadoop-daemon.sh start
> tasktracker) and making sure it is configured to connect to the right
> JobTracker or NameNode (through the mapred.job.tracker and
> fs.default.name properties in the config files). The slaves file is only
> used for the bin/start-* and bin/stop-* scripts; Hadoop doesn't look at
> it at runtime. There may be other similar files that it can look at,
> though, such as a blacklist, but I think that in the default
> configuration you can just launch the daemon and it will work.
>
> Note that if you add a new DataNode, Hadoop won't automatically move old
> data to it (to spread out the data across the cluster) unless you run
> the HDFS rebalancer, at least as far as I know.
>
> Matei
>
> On Jun 30, 2011, at 8:56 PM, Paul Rimba wrote:
>
> Hey there,
>
> I am trying to add a new datanode/tasktracker to a currently running
> cluster.
>
> Is this feasible? And if yes, how do I change the masters, slaves and
> dfs.replication (in hdfs-site.xml) configuration?
>
> Can I add the new slave to the slaves configuration file while the
> cluster is running?
>
> I found this ./bin/hadoop dfs -setrep -w 4 /path/to/file command to
> change the dfs.replication on the fly.
>
> Is there a better way to do it?
>
> Thank you for your kind attention.
>
> Kind Regards,
> Paul
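
[Editor's note: the steps discussed in this thread can be sketched as the following command sequence. This is a sketch, not a tested recipe; it assumes the new slave already has Hadoop installed under the same path as the rest of the cluster, and that its config files point at the existing master. The -threshold value and the example path /path/to/file are illustrative.]

```shell
# On the new slave, after setting mapred.job.tracker (mapred-site.xml)
# and fs.default.name (core-site.xml) to point at the existing master:
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker

# Existing blocks are NOT moved to the new DataNode automatically;
# run the balancer to redistribute them. -threshold is the allowed
# deviation (in percent) of each node's utilization from the cluster
# average; the default is 10, so a lightly loaded cluster may report
# "balanced" immediately.
bin/start-balancer.sh -threshold 5

# Optionally change the replication factor of existing files on the fly
# (-w waits until the target replication is reached). New files still
# follow dfs.replication from hdfs-site.xml.
bin/hadoop dfs -setrep -w 4 /path/to/file
```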