Thanks Aitor. That is my observation too.
I added a new disk location and manually moved some files. But if two locations are given for dfs.datanode.data.dir from the beginning, will Hadoop balance the disk usage, even if not perfectly, since file sizes may differ?

On 9/29/14, Aitor Cedres <[email protected]> wrote:
> Hi Susheel,
>
> Adding a new directory to "dfs.datanode.data.dir" will not balance your
> disks straight away. Eventually, through HDFS activity (deleting/invalidating
> some blocks, writing new ones), the disks will become balanced. If you want
> to balance them right after adding the new disk and changing the
> "dfs.datanode.data.dir" value, you have to shut down the DN and manually
> move (mv) some files from the old directory to the new one.
>
> The balancer will try to balance the usage between HDFS nodes, but it won't
> care about "internal" node disk utilization. For your particular case, the
> balancer won't fix your issue.
>
> Hope it helps,
> Aitor
>
> On 29 September 2014 05:53, Susheel Kumar Gadalay <[email protected]>
> wrote:
>
>> You mean if multiple directory locations are given, Hadoop will
>> balance the distribution of files across these different directories.
>>
>> But normally we start with one directory location, and once it is
>> reaching the maximum, we add a new directory.
>>
>> In this case how can we balance the distribution of files?
>>
>> One way is to list the files and move them.
>>
>> Will the start-balancer script work?
>>
>> On 9/27/14, Alexander Pivovarov <[email protected]> wrote:
>> > It can read/write in parallel to all drives. More HDDs, more I/O speed.
>> > On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay" <[email protected]>
>> > wrote:
>> >
>> >> Correct me if I am wrong.
>> >>
>> >> Adding multiple directories will not balance the file distribution
>> >> across these locations.
>> >>
>> >> Hadoop will exhaust the first directory and then start using the
>> >> next, and the next...
>> >>
>> >> How can I tell Hadoop to evenly balance across these directories?
>> >>
>> >> On 9/26/14, Matt Narrell <[email protected]> wrote:
>> >> > You can add a comma-separated list of paths to the
>> >> > "dfs.datanode.data.dir" property in your hdfs-site.xml
>> >> >
>> >> > mn
>> >> >
>> >> > On Sep 26, 2014, at 8:37 AM, Abdul Navaz <[email protected]>
>> >> > wrote:
>> >> >
>> >> >> Hi
>> >> >>
>> >> >> I am facing a space issue when saving files into HDFS and/or
>> >> >> running map reduce jobs.
>> >> >>
>> >> >> root@nn:~# df -h
>> >> >> Filesystem                                       Size  Used Avail Use% Mounted on
>> >> >> /dev/xvda2                                       5.9G  5.9G     0 100% /
>> >> >> udev                                              98M  4.0K   98M   1% /dev
>> >> >> tmpfs                                             48M  192K   48M   1% /run
>> >> >> none                                             5.0M     0  5.0M   0% /run/lock
>> >> >> none                                             120M     0  120M   0% /run/shm
>> >> >> overflow                                         1.0M  4.0K 1020K   1% /tmp
>> >> >> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
>> >> >> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
>> >> >> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
>> >> >> root@nn:~#
>> >> >>
>> >> >> I can see there is no space left on /dev/xvda2.
>> >> >>
>> >> >> How can I make Hadoop see the newly mounted /dev/xvda4? Or do I
>> >> >> need to move the files manually from /dev/xvda2 to xvda4?
>> >> >>
>> >> >> Thanks & Regards,
>> >> >>
>> >> >> Abdul Navaz
>> >> >> Research Assistant
>> >> >> University of Houston Main Campus, Houston TX
>> >> >> Ph: 281-685-0388
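For reference, Matt's suggestion of a comma-separated list in hdfs-site.xml would look something like the fragment below. The mount-point paths are examples only, not taken from this thread:

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Example paths; substitute your actual data-disk mount points -->
  <value>/mnt/disk1/dfs/data,/mnt/disk2/dfs/data</value>
</property>
```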

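A minimal sketch of the "stop the DataNode, then mv" step Aitor describes, simulated against temporary directories so it can be run safely. All paths, the block-pool ID, and the block file names are illustrative assumptions, not values from this thread; a real layout is <data.dir>/current/<block-pool-id>/current/finalized/subdirN/...:

```shell
#!/bin/sh
# 0. On a real cluster, stop the DataNode before touching block files,
#    e.g. hadoop-daemon.sh stop datanode  (command varies by distro).

# Simulate two configured data directories (illustrative paths):
OLD=/tmp/dn-balance-demo/data1
NEW=/tmp/dn-balance-demo/data2
BP=BP-demo-1   # hypothetical block-pool ID

mkdir -p "$OLD/current/$BP/current/finalized/subdir0"
mkdir -p "$NEW/current/$BP/current/finalized"

# A block file and its checksum metadata always travel together:
touch "$OLD/current/$BP/current/finalized/subdir0/blk_1001"
touch "$OLD/current/$BP/current/finalized/subdir0/blk_1001_1.meta"

# Move a whole subdir; keeping the directory structure identical on the
# destination disk lets the DN find the blocks again on restart:
mv "$OLD/current/$BP/current/finalized/subdir0" \
   "$NEW/current/$BP/current/finalized/"

# 1. Then restart: hadoop-daemon.sh start datanode
```

The same idea scales to real data: move whole `subdirN` directories (block plus .meta files together) between disks until usage is roughly even, then restart the DN so it rescans its volumes.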