Hi all,

I looked through the FAQ but still have questions. The datanodes in my production cluster are running low on space in one of the dfs.data.dir directories:

  /dev/sda5   355G  322G   33G  91%  /hadoop1   <----
  /dev/sdb1   484G  324G  161G  67%  /hadoop2
  /dev/sdc1   484G  318G  167G  66%  /hadoop3

/hadoop1 has had less space from the very beginning because its drive is shared with the operating system.

I found this FAQ entry on the wiki page:

  "3.12. On an individual data node, how do you balance the blocks on the disk?

  Hadoop currently does not have a method by which to do this automatically. To do this manually:

  1 Take down the HDFS
  2 Use the UNIX mv command to move the individual blocks and meta pairs from one directory to another on each host
  3 Restart the HDFS"

Question on step 1, "Take down the HDFS": does that mean the whole cluster, or just the datanode process on a single datanode/tasktracker host?

Questions on step 2:

2.1 "moving blk and meta pair": do the blk and meta pairs refer to files like these?

  $ cd /hadoop1/data/current
  $ ls -al *8816473533602921489*
  -rw-rw-r-- 1 apps apps 1734467 Aug 27 21:03 blk_-8816473533602921489
  -rw-rw-r-- 1 apps apps      63 Aug 27 21:03 blk_-8816473533602921489_78445781.meta

2.2 "from one directory to another on each host": does a blk (and its meta) from "current" have to land in the "current" directory of another dfs.data.dir, like

  mv /hadoop1/data/current/*8816473533602921489* /hadoop2/data/current/

or can the destination directory have a different name?

2.3 What about subdirXX? Under /hadoop1/data/current/ I see:

  ....
  ....
  55G subdir36
  49G subdir37
  .....
  .....

It is very tempting to move subdir36 and subdir37 because they are huge. Should it look like this?

  mv /hadoop1/data/current/subdir36/* /hadoop2/data/current/subdir36/

However, /hadoop2/data/current/subdir36/ also contains a bunch of blk (and meta) files and a bunch of subdirectories of its own, which means that if I do the move, some names might collide?

Thanks in advance.
-P
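
To make the collision concern in 2.3 concrete, here is a minimal dry-run sketch in Python. It only lists entry names that exist in both directories and moves nothing. The function name find_collisions is hypothetical, the paths are the ones from the example above, and the sketch says nothing about whether such a move is safe for HDFS, which is the open question.

```python
import os


def find_collisions(src_dir, dst_dir):
    """Return the sorted entry names present in both src_dir and dst_dir.

    Only the immediate entries are compared (blk_* files, *.meta files,
    subdir* directories). Nothing is moved or modified.
    """
    src_names = set(os.listdir(src_dir))
    dst_names = set(os.listdir(dst_dir))
    return sorted(src_names & dst_names)


if __name__ == "__main__":
    # Paths taken from the example in the post; adjust per host.
    src = "/hadoop1/data/current/subdir36"
    dst = "/hadoop2/data/current/subdir36"
    if os.path.isdir(src) and os.path.isdir(dst):
        for name in find_collisions(src, dst):
            print("would collide:", name)
```

An empty result would only mean that no top-level names clash between the two directories; nested subdirectories would need the same check applied to each level.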
