Dear All,

I am not doing load balancing here. I am just copying a file, and it throws the error "No space left on device".
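A quick way to see why a plain cp can hit this is to check which filesystem the destination directory actually lives on; the sketch below is illustrative and assumes the layout shown in the df output in this thread (home on the nearly full /, /mnt with free space):

```shell
# Show the filesystem backing $HOME; a cp into $HOME fails here because
# / (/dev/xvda2) is 91% full, while /mnt (/dev/xvda4) has ~7.4G free.
df -h "$HOME" | tail -n 1

# Copying to a mount with free space would succeed instead, e.g.:
# cp data2.txt /mnt/data3.txt
```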
hduser@dn1:~$ df -h
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/xvda2                                       5.9G  5.1G  533M  91% /
udev                                              98M  4.0K   98M   1% /dev
tmpfs                                             48M  196K   48M   1% /run
none                                             5.0M     0  5.0M   0% /run/lock
none                                             120M     0  120M   0% /run/shm
172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  116G   67G  64% /groups/ch-geni-net/Hadoop-NET
172.17.253.254:/q/proj/ch-geni-net               198G  116G   67G  64% /proj/ch-geni-net
/dev/xvda4                                       7.9G  147M  7.4G   2% /mnt

hduser@dn1:~$ cp data2.txt data3.txt
cp: writing `data3.txt': No space left on device
cp: failed to extend `data3.txt': No space left on device

I guess by default it is copying to the default location. Why am I getting this error, and how can I fix it?

Thanks & Regards,

Abdul Navaz
Research Assistant
University of Houston Main Campus, Houston TX
Ph: 281-685-0388

From: Aitor Cedres <[email protected]>
Reply-To: <[email protected]>
Date: Monday, September 29, 2014 at 7:53 AM
To: <[email protected]>
Subject: Re: No space when running a hadoop job

I think the way it works is that when HDFS has a list in dfs.datanode.data.dir, it basically does a round robin between the disks. And yes, it may not be perfectly balanced because of different file sizes.

On 29 September 2014 13:15, Susheel Kumar Gadalay <[email protected]> wrote:
> Thanks, Aitor.
>
> That is my observation too.
>
> I added a new disk location and manually moved some files.
>
> But if two locations are given at the beginning itself for
> dfs.datanode.data.dir, will Hadoop balance the disk usage, even if not
> perfectly, because file sizes may differ?
>
> On 9/29/14, Aitor Cedres <[email protected]> wrote:
>> Hi Susheel,
>>
>> Adding a new directory to "dfs.datanode.data.dir" will not balance your
>> disks straight away. Eventually, through HDFS activity (deleting/invalidating
>> some blocks, writing new ones), the disks will become balanced.
>> If you want to balance them right after adding the new disk and changing
>> the "dfs.datanode.data.dir" value, you have to shut down the DataNode and
>> manually move (mv) some files from the old directory to the new one.
>>
>> The balancer will try to balance the usage between HDFS nodes, but it
>> won't care about "internal" node disk utilization. For your particular
>> case, the balancer won't fix your issue.
>>
>> Hope it helps,
>> Aitor
>>
>> On 29 September 2014 05:53, Susheel Kumar Gadalay <[email protected]> wrote:
>>> You mean that if multiple directory locations are given, Hadoop will
>>> balance the distribution of files across these different directories.
>>>
>>> But normally we start with one directory location, and once it is
>>> reaching the maximum, we add a new directory.
>>>
>>> In this case, how can we balance the distribution of files?
>>>
>>> One way is to list the files and move them.
>>>
>>> Will the start-balancer script work?
>>>
>>> On 9/27/14, Alexander Pivovarov <[email protected]> wrote:
>>>> It can read/write in parallel to all drives. More HDDs, more I/O speed.
>>>>
>>>> On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay" <[email protected]> wrote:
>>>>> Correct me if I am wrong.
>>>>>
>>>>> Adding multiple directories will not balance the file distribution
>>>>> across these locations.
>>>>>
>>>>> Hadoop will exhaust the first directory and then start using the
>>>>> next, then the next...
>>>>>
>>>>> How can I tell Hadoop to evenly balance across these directories?
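The manual rebalance Aitor describes (stop the DataNode, mv files between data directories, restart) can be sketched with stand-in directories; the /tmp paths, block names, and the hadoop-daemon.sh commands in the comments are assumptions for illustration, not an official procedure:

```shell
# Sketch of moving blocks between two dfs.datanode.data.dir entries by hand.
# On a real node you would first stop the DataNode (e.g. "hadoop-daemon.sh
# stop datanode"), move blk_* files together with their .meta files, then
# restart it ("hadoop-daemon.sh start datanode"). /tmp stand-ins simulate
# the two data directories here.
mkdir -p /tmp/dfs/data1/current /tmp/dfs/data2/current
touch /tmp/dfs/data1/current/blk_1001 \
      /tmp/dfs/data1/current/blk_1001_1.meta

# Move a block and its metadata file as a pair so the DataNode still
# finds both in the same directory after restart.
mv /tmp/dfs/data1/current/blk_1001* /tmp/dfs/data2/current/

ls /tmp/dfs/data2/current
```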
>>>>> On 9/26/14, Matt Narrell <[email protected]> wrote:
>>>>>> You can add a comma-separated list of paths to the
>>>>>> "dfs.datanode.data.dir" property in your hdfs-site.xml.
>>>>>>
>>>>>> mn
>>>>>>
>>>>>> On Sep 26, 2014, at 8:37 AM, Abdul Navaz <[email protected]> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am facing a space issue when saving a file into HDFS and/or
>>>>>>> running a map reduce job.
>>>>>>>
>>>>>>> root@nn:~# df -h
>>>>>>> Filesystem                                       Size  Used Avail Use% Mounted on
>>>>>>> /dev/xvda2                                       5.9G  5.9G     0 100% /
>>>>>>> udev                                              98M  4.0K   98M   1% /dev
>>>>>>> tmpfs                                             48M  192K   48M   1% /run
>>>>>>> none                                             5.0M     0  5.0M   0% /run/lock
>>>>>>> none                                             120M     0  120M   0% /run/shm
>>>>>>> overflow                                         1.0M  4.0K 1020K   1% /tmp
>>>>>>> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
>>>>>>> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
>>>>>>> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
>>>>>>> root@nn:~#
>>>>>>>
>>>>>>> I can see there is no space left on /dev/xvda2.
>>>>>>>
>>>>>>> How can I make Hadoop see the newly mounted /dev/xvda4? Or do I
>>>>>>> need to move the files manually from /dev/xvda2 to xvda4?
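Matt's comma-separated list might look like the hdfs-site.xml fragment below; both paths are hypothetical (the second is only a guess based on /dev/xvda4 being mounted at /mnt), and the DataNode must be restarted for the change to take effect:

```xml
<!-- hdfs-site.xml: two data directories; /var/hadoop/dfs/data and
     /mnt/hdfs/data are illustrative paths, not the cluster's actual ones -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/var/hadoop/dfs/data,/mnt/hdfs/data</value>
</property>
```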
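To answer the start-balancer question directly: as Aitor notes, the balancer evens out usage between DataNodes, not between the disks inside one node, so it would not fix a full root filesystem. For completeness, a guarded sketch of invoking it (the threshold value is illustrative; start-balancer.sh wraps the same tool):

```shell
# Run the HDFS balancer only if the hdfs CLI is available. It moves blocks
# between DataNodes until each node is within the given percentage of the
# cluster-wide mean utilization; it does not rebalance intra-node disks.
if command -v hdfs >/dev/null 2>&1; then
  hdfs balancer -threshold 10
else
  echo "hdfs CLI not found; run on a cluster node"
fi
```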
