Hello,

As you suggested, I have changed the hdfs-site.xml file of the DataNodes
and the NameNode as below, and formatted the NameNode.
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/mnt</value>
  <description>Comma-separated list of paths. Use the list of
  directories from $DFS_DATA_DIR. For example,
  /grid/hadoop/hdfs/dn,/grid1/hadoop/hdfs/dn.</description>
</property>
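One thing worth checking here (an assumption on my part, not something the listing below proves): the user running the DataNode must be able to write to the data directory, and a freshly mounted /mnt is usually owned by root. A fragment pointing at a dedicated, hduser-owned subdirectory might look like this (the path /mnt/hadoop/hdfs/dn is hypothetical):

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Hypothetical subdirectory: create it and chown it to the user
       running the DataNode (e.g. hduser:hadoop) before restarting. -->
  <value>/mnt/hadoop/hdfs/dn</value>
</property>
```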
hduser@dn1:~$ df -h
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/xvda2                                       5.9G  5.3G  258M  96% /
udev                                              98M  4.0K   98M   1% /dev
tmpfs                                             48M  196K   48M   1% /run
none                                             5.0M     0  5.0M   0% /run/lock
none                                             120M     0  120M   0% /run/shm
172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  113G   70G  62% /groups/ch-geni-net/Hadoop-NET
172.17.253.254:/q/proj/ch-geni-net               198G  113G   70G  62% /proj/ch-geni-net
/dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
hduser@dn1:~$
Even after doing so, files are still written to /dev/xvda2 instead of
/dev/xvda4, and once /dev/xvda2 fills up I get the error below.
hduser@nn:~$ hadoop fs -put file.txtac /user/hduser/getty/file12.txt
Warning: $HADOOP_HOME is deprecated.

14/10/02 16:52:52 WARN hdfs.DFSClient: DataStreamer Exception:
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hduser/getty/file12.txt could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
To put it plainly: I don't want to use /dev/xvda2, since it has a
capacity of only 5.9 GB; I want to use only /dev/xvda4. How can I do this?
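Since the blocks are evidently still landing on the root filesystem, it may help to find out where the DataNode is actually writing. If this cluster runs Hadoop 1.x (the "$HADOOP_HOME is deprecated" warning suggests so), the property it reads is dfs.data.dir, whose default is ${hadoop.tmp.dir}/dfs/data, typically under /tmp or whatever core-site.xml sets; this is worth double-checking against your version's hdfs-default.xml. A small sketch to compare candidate locations (the paths are guesses):

```python
import os

def dir_usage_bytes(path):
    """Sum of regular-file sizes under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp) and not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total

if __name__ == "__main__":
    # Hypothetical candidates: the newly configured dir, plus the 1.x
    # default location ${hadoop.tmp.dir}/dfs/data for user hduser.
    for d in ["/mnt", "/tmp/hadoop-hduser/dfs/data"]:
        if os.path.isdir(d):
            print(d, dir_usage_bytes(d), "bytes")
```

Whichever directory is growing as you -put files is the one the DataNode really uses.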
Thanks & Regards,
Abdul Navaz
Research Assistant
University of Houston Main Campus, Houston TX
Ph: 281-685-0388
From: Abdul Navaz <[email protected] <mailto:[email protected]>>
Date: Monday, September 29, 2014 at 1:53 PM
To: <[email protected] <mailto:[email protected]>>
Subject: Re: No space when running a hadoop job
Dear All,
I am not doing load balancing here. I am just copying a file, and it
throws a "no space left on device" error.
hduser@dn1:~$ df -h
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/xvda2                                       5.9G  5.1G  533M  91% /
udev                                              98M  4.0K   98M   1% /dev
tmpfs                                             48M  196K   48M   1% /run
none                                             5.0M     0  5.0M   0% /run/lock
none                                             120M     0  120M   0% /run/shm
172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  116G   67G  64% /groups/ch-geni-net/Hadoop-NET
172.17.253.254:/q/proj/ch-geni-net               198G  116G   67G  64% /proj/ch-geni-net
/dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
hduser@dn1:~$
hduser@dn1:~$ cp data2.txt data3.txt
cp: writing `data3.txt': No space left on device
cp: failed to extend `data3.txt': No space left on device
hduser@dn1:~$
I guess it is copying to the default location. Why am I getting this
error? How can I fix it?
Thanks & Regards,
Abdul Navaz
Research Assistant
University of Houston Main Campus, Houston TX
Ph: 281-685-0388
From: Aitor Cedres <[email protected] <mailto:[email protected]>>
Reply-To: <[email protected] <mailto:[email protected]>>
Date: Monday, September 29, 2014 at 7:53 AM
To: <[email protected] <mailto:[email protected]>>
Subject: Re: No space when running a hadoop job
I think the way it works when HDFS has a list in dfs.datanode.data.dir
is basically round-robin between the disks. And yes, it may not be
perfectly balanced because of differing file sizes.
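That drift is easy to see in a toy model. This is my reading of the behavior, not the actual DataNode code: if placement just cycles through the configured directories regardless of block size, byte usage diverges whenever block sizes differ.

```python
import itertools

def round_robin_place(block_sizes, data_dirs):
    """Assign each block to the next directory in turn, the way a
    DataNode might cycle through dfs.datanode.data.dir entries."""
    usage = {d: 0 for d in data_dirs}
    dirs = itertools.cycle(data_dirs)
    for size in block_sizes:
        usage[next(dirs)] += size
    return usage

# Alternating large and small blocks: equal block COUNTS per dir,
# but very different byte usage.
print(round_robin_place([64, 8, 64, 8], ["/grid/dn", "/grid1/dn"]))
# -> {'/grid/dn': 128, '/grid1/dn': 16}
```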
On 29 September 2014 13:15, Susheel Kumar Gadalay <[email protected]
<mailto:[email protected]>> wrote:
Thanks, Aitor.
That is my observation too.
I added a new disk location and manually moved some files.
But if two locations are given for dfs.datanode.data.dir from the
beginning, will Hadoop balance the disk usage, even if not perfectly,
given that file sizes may differ?
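For what it's worth, the manual move Aitor describes below could be outlined like this. The directory names are hypothetical and the genstamp in the .meta filename is matched by prefix rather than assumed, so treat this as a sketch of the idea, not a vetted procedure; it is only safe while the DataNode is stopped.

```python
import os
import shutil

def move_some_blocks(old_dir, new_dir, fraction=0.5):
    """Relocate roughly `fraction` of the block files in old_dir to
    new_dir, taking each block's checksum (.meta) companion with it.
    Run only while the DataNode is shut down."""
    blocks = sorted(f for f in os.listdir(old_dir)
                    if f.startswith("blk_") and not f.endswith(".meta"))
    for name in blocks[: int(len(blocks) * fraction)]:
        shutil.move(os.path.join(old_dir, name), os.path.join(new_dir, name))
        # A block's checksum file is named blk_<id>_<genstamp>.meta.
        for meta in [f for f in os.listdir(old_dir)
                     if f.startswith(name + "_") and f.endswith(".meta")]:
            shutil.move(os.path.join(old_dir, meta), os.path.join(new_dir, meta))
```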
On 9/29/14, Aitor Cedres <[email protected]
<mailto:[email protected]>> wrote:
> Hi Susheel,
>
> Adding a new directory to "dfs.datanode.data.dir" will not balance your
> disks straight away. Eventually, through HDFS activity
> (deleting/invalidating some blocks, writing new ones), the disks will
> become balanced. If you want to balance them right after adding the new
> disk and changing the "dfs.datanode.data.dir" value, you have to shut
> down the DN and manually move (mv) some files from the old directory to
> the new one.
>
> The balancer will try to balance usage between HDFS nodes, but it won't
> care about "internal" node disk utilization. For your particular case,
> the balancer won't fix your issue.
>
> Hope it helps,
> Aitor
>
> On 29 September 2014 05:53, Susheel Kumar Gadalay
<[email protected] <mailto:[email protected]>>
> wrote:
>
>> You mean if multiple directory locations are given, Hadoop will
>> balance the distribution of files across these different directories.
>>
>> But normally we start with one directory location and, once it is
>> nearing its maximum, we add a new directory.
>>
>> In this case, how can we balance the distribution of files?
>>
>> One way is to list the files and move them.
>>
>> Will the start-balancer script work?
>>
>> On 9/27/14, Alexander Pivovarov <[email protected]
<mailto:[email protected]>> wrote:
>> > It can read/write to all drives in parallel. More HDDs, more I/O
>> > speed.
>> > On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay"
<[email protected] <mailto:[email protected]>>
>> > wrote:
>> >
>> >> Correct me if I am wrong.
>> >>
>> >> Adding multiple directories will not balance the file distribution
>> >> across these locations.
>> >>
>> >> Hadoop will exhaust the first directory and then start using the
>> >> next, and the next...
>> >>
>> >> How can I tell Hadoop to balance evenly across these directories?
>> >>
>> >> On 9/26/14, Matt Narrell <[email protected]
<mailto:[email protected]>> wrote:
>> >> > You can add a comma separated list of paths to the
>> >> "dfs.datanode.data.dir"
>> >> > property in your hdfs-site.xml
>> >> >
>> >> > mn
>> >> >
>> >> > On Sep 26, 2014, at 8:37 AM, Abdul Navaz
<[email protected] <mailto:[email protected]>>
>> >> > wrote:
>> >> >
>> >> >> Hi
>> >> >>
>> >> >> I am facing a space issue when saving files into HDFS and/or
>> >> >> running a MapReduce job.
>> >> >>
>> >> >> root@nn:~# df -h
>> >> >> Filesystem                                       Size  Used Avail Use% Mounted on
>> >> >> /dev/xvda2                                       5.9G  5.9G     0 100% /
>> >> >> udev                                              98M  4.0K   98M   1% /dev
>> >> >> tmpfs                                             48M  192K   48M   1% /run
>> >> >> none                                             5.0M     0  5.0M   0% /run/lock
>> >> >> none                                             120M     0  120M   0% /run/shm
>> >> >> overflow                                         1.0M  4.0K 1020K   1% /tmp
>> >> >> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
>> >> >> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
>> >> >> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
>> >> >> root@nn:~#
>> >> >>
>> >> >>
>> >> >> I can see there is no space left on /dev/xvda2.
>> >> >>
>> >> >> How can I make Hadoop use the newly mounted /dev/xvda4? Or do I
>> >> >> need to move the files manually from /dev/xvda2 to xvda4?
>> >> >>
>> >> >>
>> >> >>
>> >> >> Thanks & Regards,
>> >> >>
>> >> >> Abdul Navaz
>> >> >> Research Assistant
>> >> >> University of Houston Main Campus, Houston TX
>> >> >> Ph: 281-685-0388
>> >> >>
>> >> >
>> >> >
>> >>
>> >
>>
>