Hi Susheel,

Adding a new directory to "dfs.datanode.data.dir" will not balance your disks right away. Eventually, through normal HDFS activity (deleting/invalidating some blocks, writing new ones), the disks will even out. If you want to balance them immediately after adding the new disk and updating the "dfs.datanode.data.dir" value, you have to shut down the DataNode and manually move (mv) some of the files from the old directory to the new one.
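A rough sketch of that manual move, assuming hypothetical mount points and a made-up block-pool ID (the real BP-* directory name will differ on your cluster); the commands below simulate the layout in temp directories so they are safe to run as-is:

```shell
# Hypothetical old/new data dirs; on a real node these would be e.g.
# /data/1/dfs/dn and /data/2/dfs/dn. Simulated here with mktemp.
OLD=$(mktemp -d)/dfs/dn
NEW=$(mktemp -d)/dfs/dn

# Fake a Hadoop 2.x block layout: current/<block-pool>/current/finalized/subdirN
BP=BP-1234567890-127.0.0.1-1411000000000   # made-up block-pool ID
mkdir -p "$OLD/current/$BP/current/finalized/subdir0" \
         "$NEW/current/$BP/current/finalized"
touch "$OLD/current/$BP/current/finalized/subdir0/blk_1001"

# 1. Stop the DataNode first (comment only; never move blocks under a live DN):
#    hadoop-daemon.sh stop datanode

# 2. Move whole subdir trees, keeping the block-pool path identical on the new disk:
mv "$OLD/current/$BP/current/finalized/subdir0" \
   "$NEW/current/$BP/current/finalized/"

# 3. Restart the DataNode:
#    hadoop-daemon.sh start datanode

ls "$NEW/current/$BP/current/finalized/subdir0"
```

The key point is that the directory structure under each data dir must stay identical; the DataNode rescans its volumes on startup and picks up the moved blocks.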
The balancer will try to balance usage between HDFS nodes, but it does not care about the "internal" disk utilization within a node. For your particular case, the balancer won't fix the issue.

Hope it helps,

Aitor

On 29 September 2014 05:53, Susheel Kumar Gadalay <[email protected]> wrote:
> You mean if multiple directory locations are given, Hadoop will
> balance the distribution of files across these different directories.
>
> But normally we start with one directory location, and once it is
> reaching the maximum, we add a new directory.
>
> In this case how can we balance the distribution of files?
>
> One way is to list the files and move them.
>
> Will the start-balancer script work?
>
> On 9/27/14, Alexander Pivovarov <[email protected]> wrote:
> > It can read/write in parallel to all drives. More HDDs, more I/O speed.
> > On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay" <[email protected]>
> > wrote:
> >
> >> Correct me if I am wrong.
> >>
> >> Adding multiple directories will not balance the file distribution
> >> across these locations.
> >>
> >> Hadoop will exhaust the first directory and then start using the
> >> next, and so on.
> >>
> >> How can I tell Hadoop to evenly balance across these directories?
> >>
> >> On 9/26/14, Matt Narrell <[email protected]> wrote:
> >> > You can add a comma-separated list of paths to the
> >> > "dfs.datanode.data.dir" property in your hdfs-site.xml.
> >> >
> >> > mn
> >> >
> >> > On Sep 26, 2014, at 8:37 AM, Abdul Navaz <[email protected]> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> I am facing a space issue when saving files into HDFS and/or
> >> >> running a MapReduce job.
> >> >>
> >> >> root@nn:~# df -h
> >> >> Filesystem                                       Size  Used Avail Use% Mounted on
> >> >> /dev/xvda2                                       5.9G  5.9G     0 100% /
> >> >> udev                                              98M  4.0K   98M   1% /dev
> >> >> tmpfs                                             48M  192K   48M   1% /run
> >> >> none                                             5.0M     0  5.0M   0% /run/lock
> >> >> none                                             120M     0  120M   0% /run/shm
> >> >> overflow                                         1.0M  4.0K 1020K   1% /tmp
> >> >> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
> >> >> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
> >> >> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
> >> >> root@nn:~#
> >> >>
> >> >> I can see there is no space left on /dev/xvda2.
> >> >>
> >> >> How can I make Hadoop see the newly mounted /dev/xvda4? Or do I need
> >> >> to move the files manually from /dev/xvda2 to xvda4?
> >> >>
> >> >> Thanks & Regards,
> >> >>
> >> >> Abdul Navaz
> >> >> Research Assistant
> >> >> University of Houston Main Campus, Houston TX
> >> >> Ph: 281-685-0388
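P.S. For reference, the comma-separated configuration Matt mentioned looks like this in hdfs-site.xml; the mount paths below are hypothetical examples, not values from this thread:

```xml
<!-- hdfs-site.xml on each DataNode: one entry per physical disk/mount -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/mnt/disk1/dfs/dn,/mnt/disk2/dfs/dn</value>
</property>
```

New blocks are then spread across the listed directories (round-robin by default), but as discussed above, existing blocks stay where they are until they are rewritten or deleted.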
