I think the way it works is that when HDFS has a list of directories in dfs.datanode.data.dir, it basically round-robins between the disks. And yes, it may not be perfectly balanced because of differing file sizes.
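For reference, a minimal hdfs-site.xml sketch with two data directories (the /data/1/dfs/dn and /data/2/dfs/dn paths are placeholders; use your own mount points):

  <!-- each comma-separated path should sit on a different physical disk -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/1/dfs/dn,/data/2/dfs/dn</value>
  </property>

With a list like this the DataNode spreads new block writes across the listed directories, so usage stays roughly even as long as both disks were configured from the start.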
On 29 September 2014 13:15, Susheel Kumar Gadalay <[email protected]> wrote:

> Thank Aitor.
>
> That is what is my observation too.
>
> I added a new disk location and manually moved some files.
>
> But if 2 locations are given at the beginning itself for
> dfs.datanode.data.dir, will hadoop balance the disk usage, even if not
> perfectly because file sizes may differ?
>
> On 9/29/14, Aitor Cedres <[email protected]> wrote:
> > Hi Susheel,
> >
> > Adding a new directory to “dfs.datanode.data.dir” will not balance your
> > disks straight away. Eventually, through HDFS activity
> > (deleting/invalidating some blocks, writing new ones), the disks will
> > become balanced. If you want to balance them right after adding the new
> > disk and changing the “dfs.datanode.data.dir” value, you have to shut
> > down the DN and manually move (mv) some files from the old directory to
> > the new one.
> >
> > The balancer will try to balance the usage between HDFS nodes, but it
> > won't care about the "internal" disk utilization of a node. For your
> > particular case, the balancer won't fix your issue.
> >
> > Hope it helps,
> > Aitor
> >
> > On 29 September 2014 05:53, Susheel Kumar Gadalay <[email protected]> wrote:
> >
> >> You mean if multiple directory locations are given, Hadoop will
> >> balance the distribution of files across these different directories.
> >>
> >> But normally we start with 1 directory location and once it is
> >> reaching its maximum, we add a new directory.
> >>
> >> In this case how can we balance the distribution of files?
> >>
> >> One way is to list the files and move them.
> >>
> >> Will the start-balancer script work?
> >>
> >> On 9/27/14, Alexander Pivovarov <[email protected]> wrote:
> >> > It can read/write in parallel to all drives. More HDDs, more IO speed.
> >> > On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay" <[email protected]> wrote:
> >> >
> >> >> Correct me if I am wrong.
> >> >>
> >> >> Adding multiple directories will not balance the file distribution
> >> >> across these locations.
> >> >>
> >> >> Hadoop will exhaust the first directory and then start using the
> >> >> next, next ..
> >> >>
> >> >> How can I tell Hadoop to evenly balance across these directories?
> >> >>
> >> >> On 9/26/14, Matt Narrell <[email protected]> wrote:
> >> >> > You can add a comma-separated list of paths to the
> >> >> > “dfs.datanode.data.dir” property in your hdfs-site.xml.
> >> >> >
> >> >> > mn
> >> >> >
> >> >> > On Sep 26, 2014, at 8:37 AM, Abdul Navaz <[email protected]> wrote:
> >> >> >
> >> >> >> Hi
> >> >> >>
> >> >> >> I am facing a space issue when I save files into HDFS and/or run a
> >> >> >> map reduce job.
> >> >> >>
> >> >> >> root@nn:~# df -h
> >> >> >> Filesystem                                       Size  Used Avail Use% Mounted on
> >> >> >> /dev/xvda2                                       5.9G  5.9G     0 100% /
> >> >> >> udev                                              98M  4.0K   98M   1% /dev
> >> >> >> tmpfs                                             48M  192K   48M   1% /run
> >> >> >> none                                             5.0M     0  5.0M   0% /run/lock
> >> >> >> none                                             120M     0  120M   0% /run/shm
> >> >> >> overflow                                         1.0M  4.0K 1020K   1% /tmp
> >> >> >> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
> >> >> >> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
> >> >> >> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
> >> >> >> root@nn:~#
> >> >> >>
> >> >> >> I can see there is no space left on /dev/xvda2.
> >> >> >>
> >> >> >> How can I make Hadoop see the newly mounted /dev/xvda4? Or do I need
> >> >> >> to move the files manually from /dev/xvda2 to /dev/xvda4?
> >> >> >>
> >> >> >> Thanks & Regards,
> >> >> >>
> >> >> >> Abdul Navaz
> >> >> >> Research Assistant
> >> >> >> University of Houston Main Campus, Houston TX
> >> >> >> Ph: 281-685-0388
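For anyone landing on this thread later, here is a rough shell sketch of the manual move Aitor describes above (the paths and the BP-... block-pool directory name are placeholders; match whatever layout exists under your own dfs.datanode.data.dir, and keep the relative structure identical in the new location):

  # Stop the DataNode first so no blocks are written or read while moving.
  hadoop-daemon.sh stop datanode

  # Recreate the same block-pool path under the new data directory and move
  # a few finalized subdir trees from the full disk to it.
  mkdir -p /data/2/dfs/dn/current/BP-.../current/finalized
  mv /data/1/dfs/dn/current/BP-.../current/finalized/subdir0 \
     /data/2/dfs/dn/current/BP-.../current/finalized/

  # Restart the DataNode; it will scan both directories and report the blocks.
  hadoop-daemon.sh start datanode

And to restate Aitor's point: start-balancer.sh (hdfs balancer) only evens out usage between DataNodes, not between the disks inside one DataNode, so on its own it will not relieve a single full disk.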
