Paul,

You can inspect the data held by your new nodes after the balancer operation runs. "hadoop dfsadmin -report" should give you detailed stats about each of the DNs, or look at the fsck output ("hadoop fsck /").
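A minimal sketch of those two inspection commands, wrapped so the script no-ops cleanly on a machine where the `hadoop` CLI is not installed (the guard is my addition, not part of the original advice):

```shell
#!/bin/sh
# Sketch: inspect per-DataNode usage after a balancer run.
# Assumes the `hadoop` CLI is on PATH; skip gracefully if it is not.
if command -v hadoop >/dev/null 2>&1; then
  # Per-DataNode capacity, DFS used, and remaining space:
  hadoop dfsadmin -report

  # Block-level health check of the whole namespace:
  hadoop fsck /
else
  echo "hadoop CLI not found; skipping inspection commands" >&2
fi
```

The `-report` output shows each DataNode's used/remaining space, which is the quickest way to confirm that blocks actually landed on the new node.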
(Note: by default, the balancer operation is bandwidth-limited for performance reasons and may take a while to complete -- although this is configurable.)

On Fri, Jul 1, 2011 at 10:42 AM, Paul Rimba <paul.ri...@gmail.com> wrote:
> Hey Matei,
>
> What if you do bin/hadoop-daemon.sh start tasktracker and
> bin/hadoop-daemon.sh start datanode? Does it move the old data to the new
> slave?
>
> I ran that scenario a couple of times and then ran start-balancer.sh. It
> always says that the cluster is balanced. Does that mean the data has been
> spread out?
>
> Thanks,
> Paul
>
> On Fri, Jul 1, 2011 at 2:05 PM, Matei Zaharia <ma...@eecs.berkeley.edu> wrote:
>>
>> You can have a new TaskTracker or DataNode join the cluster by just
>> starting that daemon on the slave (e.g. bin/hadoop-daemon.sh start
>> tasktracker) and making sure it is configured to connect to the right
>> JobTracker or NameNode (through the mapred.job.tracker and fs.default.name
>> properties in the config files). The slaves file is only used for the
>> bin/start-* and bin/stop-* scripts; Hadoop doesn't look at it at runtime.
>> There may be other similar files that it can look at, though, such as a
>> blacklist, but I think that in the default configuration you can just
>> launch the daemon and it will work.
>>
>> Note that if you add a new DataNode, Hadoop won't automatically move old
>> data to it (to spread the data out across the cluster) unless you run the
>> HDFS rebalancer, at least as far as I know.
>>
>> Matei
>>
>> On Jun 30, 2011, at 8:56 PM, Paul Rimba wrote:
>>
>> Hey there,
>>
>> I am trying to add a new datanode/tasktracker to a currently running
>> cluster. Is this feasible? And if yes, how do I change the masters, slaves,
>> and dfs.replication (in hdfs-site.xml) configuration? Can I add the new
>> slave to the slaves configuration file while the cluster is running?
>>
>> I found this "./bin/hadoop dfs -setrep -w 4 /path/to/file" command to
>> change the dfs.replication on the fly.
>> Is there a better way to do it?
>>
>> Thank you for your kind attention.
>>
>> Kind Regards,
>> Paul

--
Harsh J
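Pulling the thread together, the whole flow can be sketched as a short script. This is an illustrative outline, not a definitive procedure: it assumes a Hadoop-0.20-era layout run from the install directory, `/path/to/file` is a placeholder from the thread, and the guard lets it no-op where Hadoop is absent.

```shell
#!/bin/sh
# Sketch of the steps discussed for adding a slave to a live cluster.
# Guard: only run if the Hadoop scripts are actually present here.
if [ -x bin/hadoop-daemon.sh ]; then
  # 1. On the new slave, start the daemons; they join the cluster as long
  #    as fs.default.name and mapred.job.tracker point at the right
  #    masters. The slaves file is only read by the start-*/stop-* scripts.
  bin/hadoop-daemon.sh start datanode
  bin/hadoop-daemon.sh start tasktracker

  # 2. Spread existing blocks onto the new node. The balancer's transfer
  #    rate is capped (configurable via dfs.balance.bandwidthPerSec in
  #    hdfs-site.xml in this Hadoop generation), so it can take a while.
  bin/start-balancer.sh

  # 3. Optionally change the replication of an existing file on the fly;
  #    -w waits until the new replicas are in place.
  bin/hadoop dfs -setrep -w 4 /path/to/file
else
  echo "Hadoop scripts not found in ./bin; skipping" >&2
fi
```

Afterwards, "hadoop dfsadmin -report" shows whether the new DataNode is holding a fair share of the blocks, as Harsh notes at the top of the thread.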