Hi Tamir,
Thanks for the info, makes sense now :).
Cheers,
Usman
Hi,
The balancer works from the average utilization of all the nodes in the
cluster, which in your case is about 13.7%. Only nodes whose utilization is
more than 10 percentage points above or below that average are rebalanced.
Node 4 doesn't count as under-utilized because 13.7 - 10 = 3.7%, which is
below its 4.13%. You can use a different threshold than the default 10%
(hadoop balancer -threshold 5). Read more here:
http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Rebalancer
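
To make the threshold arithmetic concrete, here is a rough sketch in Python. It is a simplified model of the check described above, not the balancer's actual code. The `classify` helper and its labels are made up for illustration, and the percentages are the "% used" values from the report quoted below. Averaging the percentages directly is only valid because all four nodes have the same capacity.

```python
# Simplified model of the balancer's threshold check (not Hadoop's real code).
# Values are the "% used" figures from the datanode report in this thread.
used_pct = {"1": 16.39, "2": 13.86, "3": 20.28, "4": 4.13}


def classify(used_pct, threshold):
    """Label each node relative to the band [average - threshold, average + threshold]."""
    # Plain mean of percentages: an assumption that holds here because
    # every node reports the same total capacity.
    average = sum(used_pct.values()) / len(used_pct)
    labels = {}
    for name, pct in used_pct.items():
        if pct > average + threshold:
            labels[name] = "over"
        elif pct < average - threshold:
            labels[name] = "under"
        else:
            labels[name] = "ok"
    return average, labels


for threshold in (10, 5):
    average, labels = classify(used_pct, threshold)
    print(f"threshold={threshold}: average={average:.2f}% {labels}")
```

With the default threshold of 10 every node falls inside the band, which is why the balancer reports the cluster as balanced. With a threshold of 5, node 4 drops below the band and node 3 rises above it, so both would be candidates for rebalancing.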
Tamir
On Mon, Apr 27, 2009 at 11:36 AM, Usman Waheed <usm...@opera.com> wrote:
Hi,
I had sent out an email yesterday asking about how to balance the cluster
after setting the replication level to 2. I have 4 datanodes and one
namenode in my setup.
Using the -R switch with -setrep did the trick, but one of my nodes became
under-utilized. I then ran hadoop balancer and it helped, but only up to a
certain extent.
Datanode 4, noted below, is now up to almost 5%, but when I try to balance
the datanode again using the "hadoop balancer" command it says that the
cluster is already balanced, which it isn't.
I wonder if there is an alternative way, or whether over time Datanode 4
will pick up more blocks?
Any clues?
Thanks,
Usman
Name: 1
State : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 222235858599(206.97 GB)
Used raw bytes: 48140136448 (44.83 GB)
% used: 16.39%
Last contact: Mon Apr 27 08:34:46 UTC 2009
Name: 2
State : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 231235100994(215.35 GB)
Used raw bytes: 40704245760 (37.91 GB)
% used: 13.86%
Last contact: Mon Apr 27 08:34:45 UTC 2009
Name: 3
State : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 211936026161(197.38 GB)
Used raw bytes: 59591700480 (55.5 GB)
% used: 20.28%
Last contact: Mon Apr 27 08:34:45 UTC 2009
*Name: 4*
State : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 258876991693(241.1 GB)
Used raw bytes: 12142653440 (11.31 GB)
% used: 4.13%
Last contact: Mon Apr 27 08:34:46 UTC 2009
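
The "% used" figures in the report, and the cluster-wide average the balancer compares against, can be reproduced from the raw byte counts. The following is a minimal Python sketch under the assumption that utilization is simply used bytes divided by total bytes. The `percent_used` helper is hypothetical and the numbers are copied from the report above.

```python
# (total raw bytes, used raw bytes) per datanode, copied from the report.
nodes = {
    "1": (293778976768, 48140136448),
    "2": (293778976768, 40704245760),
    "3": (293778976768, 59591700480),
    "4": (293778976768, 12142653440),
}


def percent_used(total, used):
    """Utilization as a percentage of raw capacity."""
    return 100.0 * used / total


# Per-node utilization, matching the "% used" lines in the report.
per_node = {name: percent_used(t, u) for name, (t, u) in nodes.items()}

# Cluster-wide utilization: total used bytes over total capacity.
cluster_total = sum(t for t, _ in nodes.values())
cluster_used = sum(u for _, u in nodes.values())
cluster_pct = percent_used(cluster_total, cluster_used)

for name, pct in per_node.items():
    print(f"datanode {name}: {pct:.2f}% used")
print(f"cluster: {cluster_pct:.2f}% used")
```

Summing bytes across nodes rather than averaging the percentages keeps the calculation correct even if the datanodes had different capacities.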