Hey Matei,

What if you do bin/hadoop-daemon.sh start tasktracker and bin/hadoop-daemon.sh start datanode?
Does it move the old data to the new slave? I ran that scenario a couple of times and then ran start-balancer.sh. It always says that the cluster is balanced. Does that mean that the data has been spread out?

Thanks,
Paul

On Fri, Jul 1, 2011 at 2:05 PM, Matei Zaharia <ma...@eecs.berkeley.edu> wrote:

> You can have a new TaskTracker or DataNode join the cluster by just
> starting that daemon on the slave (e.g. bin/hadoop-daemon.sh start
> tasktracker) and making sure it is configured to connect to the right
> JobTracker or NameNode (through the mapred.job.tracker and
> fs.default.name properties in the config files). The slaves file is only
> used for the bin/start-* and bin/stop-* scripts; Hadoop doesn't look at
> it at runtime. There may be other similar files that it can look at,
> though, such as a blacklist, but I think that in the default
> configuration you can just launch the daemon and it will work.
>
> Note that if you add a new DataNode, Hadoop won't automatically move old
> data to it (to spread out the data across the cluster) unless you run
> the HDFS rebalancer, at least as far as I know.
>
> Matei
>
> On Jun 30, 2011, at 8:56 PM, Paul Rimba wrote:
>
> Hey there,
>
> I am trying to add a new datanode/tasktracker to a currently running
> cluster.
>
> Is this feasible? And if yes, how do I change the masters, slaves and
> dfs.replication (in hdfs-site.xml) configuration?
>
> Can I add the new slave to the slaves configuration file while the
> cluster is running?
>
> I found this ./bin/hadoop dfs -setrep -w 4 /path/to/file command to
> change the dfs.replication on the fly.
>
> Is there a better way to do it?
>
> Thank you for your kind attention.
>
> Kind Regards,
> Paul
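
[Editor's note: the steps discussed in this thread can be sketched as the following command sequence. This is a sketch, not a tested recipe; it assumes the new slave already has Hadoop installed under the same path as the rest of the cluster, and that its config files point at the existing master. The -threshold value and the example path /path/to/file are illustrative.]

```shell
# On the new slave, after setting mapred.job.tracker (mapred-site.xml)
# and fs.default.name (core-site.xml) to point at the existing master:
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker

# Existing blocks are NOT moved to the new DataNode automatically;
# run the balancer to redistribute them. -threshold is the allowed
# deviation (in percent) of each node's utilization from the cluster
# average; the default is 10, so a lightly loaded cluster may report
# "balanced" immediately.
bin/start-balancer.sh -threshold 5

# Optionally change the replication factor of existing files on the fly
# (-w waits until the target replication is reached). New files still
# follow dfs.replication from hdfs-site.xml.
bin/hadoop dfs -setrep -w 4 /path/to/file
```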