Hello,

As you suggested, I have changed the hdfs-site.xml file of the DataNodes
and the NameNode as below, and formatted the NameNode.
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/mnt</value>
  <description>Comma-separated list of paths. Use the list of
  directories from $DFS_DATA_DIR. For example,
  /grid/hadoop/hdfs/dn,/grid1/hadoop/hdfs/dn.</description>
</property>
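One thing worth checking here (an assumption on my part, not something the listing below proves): the user running the DataNode must be able to write to the data directory, and a freshly mounted /mnt is usually owned by root. A fragment pointing at a dedicated, hduser-owned subdirectory might look like this (the path /mnt/hadoop/hdfs/dn is hypothetical):

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Hypothetical subdirectory: create it and chown it to the user
       running the DataNode (e.g. hduser:hadoop) before restarting. -->
  <value>/mnt/hadoop/hdfs/dn</value>
</property>
```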
hduser@dn1:~$ df -h
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/xvda2                                       5.9G  5.3G  258M  96% /
udev                                              98M  4.0K   98M   1% /dev
tmpfs                                             48M  196K   48M   1% /run
none                                             5.0M     0  5.0M   0% /run/lock
none                                             120M     0  120M   0% /run/shm
172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  113G   70G  62% /groups/ch-geni-net/Hadoop-NET
172.17.253.254:/q/proj/ch-geni-net               198G  113G   70G  62% /proj/ch-geni-net
/dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
hduser@dn1:~$
Even after doing so, files are still written to /dev/xvda2 instead of
/dev/xvda4, and once /dev/xvda2 fills up I get the error below.
hduser@nn:~$ hadoop fs -put file.txtac /user/hduser/getty/file12.txt
Warning: $HADOOP_HOME is deprecated.

14/10/02 16:52:52 WARN hdfs.DFSClient: DataStreamer Exception:
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hduser/getty/file12.txt could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
To put it plainly: I don't want to use /dev/xvda2, since it has a
capacity of only 5.9 GB; I want to use only /dev/xvda4. How can I do this?
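Since the blocks are evidently still landing on the root filesystem, it may help to find out where the DataNode is actually writing. If this cluster runs Hadoop 1.x (the "$HADOOP_HOME is deprecated" warning suggests so), the property it reads is dfs.data.dir, whose default is ${hadoop.tmp.dir}/dfs/data, typically under /tmp or whatever core-site.xml sets; this is worth double-checking against your version's hdfs-default.xml. A small sketch to compare candidate locations (the paths are guesses):

```python
import os

def dir_usage_bytes(path):
    """Sum of regular-file sizes under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp) and not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total

if __name__ == "__main__":
    # Hypothetical candidates: the newly configured dir, plus the 1.x
    # default location ${hadoop.tmp.dir}/dfs/data for user hduser.
    for d in ["/mnt", "/tmp/hadoop-hduser/dfs/data"]:
        if os.path.isdir(d):
            print(d, dir_usage_bytes(d), "bytes")
```

Whichever directory is growing as you -put files is the one the DataNode really uses.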
Thanks & Regards,
Abdul Navaz
Research Assistant
University of Houston Main Campus, Houston TX
Ph: 281-685-0388
From: Abdul Navaz <[email protected] <mailto:[email protected]>>
Date: Monday, September 29, 2014 at 1:53 PM
To: <[email protected] <mailto:[email protected]>>
Subject: Re: No space when running a hadoop job
Dear All,
I am not doing load balancing here. I am just copying a file, and it
throws a "no space left on device" error.
hduser@dn1:~$ df -h
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/xvda2                                       5.9G  5.1G  533M  91% /
udev                                              98M  4.0K   98M   1% /dev
tmpfs                                             48M  196K   48M   1% /run
none                                             5.0M     0  5.0M   0% /run/lock
none                                             120M     0  120M   0% /run/shm
172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  116G   67G  64% /groups/ch-geni-net/Hadoop-NET
172.17.253.254:/q/proj/ch-geni-net               198G  116G   67G  64% /proj/ch-geni-net
/dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
hduser@dn1:~$
hduser@dn1:~$ cp data2.txt data3.txt
cp: writing `data3.txt': No space left on device
cp: failed to extend `data3.txt': No space left on device
hduser@dn1:~$
I guess it is copying to the default location. Why am I getting this
error? How can I fix it?
Thanks & Regards,
Abdul Navaz
Research Assistant
University of Houston Main Campus, Houston TX
Ph: 281-685-0388
From: Aitor Cedres <[email protected] <mailto:[email protected]>>
Reply-To: <[email protected] <mailto:[email protected]>>
Date: Monday, September 29, 2014 at 7:53 AM
To: <[email protected] <mailto:[email protected]>>
Subject: Re: No space when running a hadoop job
I think the way it works when HDFS has a list in dfs.datanode.data.dir
is basically round-robin between the disks. And yes, it may not be
perfectly balanced because of differing file sizes.
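That drift is easy to see in a toy model. This is my reading of the behavior, not the actual DataNode code: if placement just cycles through the configured directories regardless of block size, byte usage diverges whenever block sizes differ.

```python
import itertools

def round_robin_place(block_sizes, data_dirs):
    """Assign each block to the next directory in turn, the way a
    DataNode might cycle through dfs.datanode.data.dir entries."""
    usage = {d: 0 for d in data_dirs}
    dirs = itertools.cycle(data_dirs)
    for size in block_sizes:
        usage[next(dirs)] += size
    return usage

# Alternating large and small blocks: equal block COUNTS per dir,
# but very different byte usage.
print(round_robin_place([64, 8, 64, 8], ["/grid/dn", "/grid1/dn"]))
# -> {'/grid/dn': 128, '/grid1/dn': 16}
```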
On 29 September 2014 13:15, Susheel Kumar Gadalay <[email protected]
<mailto:[email protected]>> wrote:
Thanks, Aitor.
That is my observation too.
I added a new disk location and manually moved some files.
But if two locations are given for dfs.datanode.data.dir from the
beginning, will Hadoop balance the disk usage, even if not perfectly,
given that file sizes may differ?
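For what it's worth, the manual move Aitor describes below could be outlined like this. The directory names are hypothetical and the genstamp in the .meta filename is matched by prefix rather than assumed, so treat this as a sketch of the idea, not a vetted procedure; it is only safe while the DataNode is stopped.

```python
import os
import shutil

def move_some_blocks(old_dir, new_dir, fraction=0.5):
    """Relocate roughly `fraction` of the block files in old_dir to
    new_dir, taking each block's checksum (.meta) companion with it.
    Run only while the DataNode is shut down."""
    blocks = sorted(f for f in os.listdir(old_dir)
                    if f.startswith("blk_") and not f.endswith(".meta"))
    for name in blocks[: int(len(blocks) * fraction)]:
        shutil.move(os.path.join(old_dir, name), os.path.join(new_dir, name))
        # A block's checksum file is named blk_<id>_<genstamp>.meta.
        for meta in [f for f in os.listdir(old_dir)
                     if f.startswith(name + "_") and f.endswith(".meta")]:
            shutil.move(os.path.join(old_dir, meta), os.path.join(new_dir, meta))
```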
On 9/29/14, Aitor Cedres <[email protected]
<mailto:[email protected]>> wrote:
> Hi Susheel,
>
> Adding a new directory to "dfs.datanode.data.dir" will not balance your
> disks straight away. Eventually, through HDFS activity
> (deleting/invalidating some blocks, writing new ones), the disks will
> become balanced. If you want to balance them right after adding the new
> disk and changing the "dfs.datanode.data.dir" value, you have to shut
> down the DN and manually move (mv) some files from the old directory to
> the new one.
>
> The balancer will try to balance usage between HDFS nodes, but it won't
> care about "internal" node disk utilization. For your particular case,
> the balancer won't fix your issue.
>
> Hope it helps,
> Aitor
>
> On 29 September 2014 05:53, Susheel Kumar Gadalay
<[email protected] <mailto:[email protected]>>
> wrote:
>
>> You mean if multiple directory locations are given, Hadoop will
>> balance the distribution of files across these different directories.
>>
>> But normally we start with one directory location and, once it is
>> nearing its maximum, we add a new directory.
>>
>> In this case, how can we balance the distribution of files?
>>
>> One way is to list the files and move them.
>>
>> Will the start-balancer script work?
>>
>> On 9/27/14, Alexander Pivovarov <[email protected]
<mailto:[email protected]>> wrote:
>> > It can read/write to all drives in parallel. More HDDs, more I/O
>> > speed.
>> > On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay"
<[email protected] <mailto:[email protected]>>
>> > wrote:
>> >
>> >> Correct me if I am wrong.
>> >>
>> >> Adding multiple directories will not balance the file distribution
>> >> across these locations.
>> >>
>> >> Hadoop will exhaust the first directory and then start using the
>> >> next, and the next...
>> >>
>> >> How can I tell Hadoop to balance evenly across these directories?
>> >>
>> >> On 9/26/14, Matt Narrell <[email protected]
<mailto:[email protected]>> wrote:
>> >> > You can add a comma separated list of paths to the
>> >> "dfs.datanode.data.dir"
>> >> > property in your hdfs-site.xml
>> >> >
>> >> > mn
>> >> >
>> >> > On Sep 26, 2014, at 8:37 AM, Abdul Navaz
<[email protected] <mailto:[email protected]>>
>> >> > wrote:
>> >> >
>> >> >> Hi
>> >> >>
>> >> >> I am facing a space issue when saving files into HDFS and/or
>> >> >> running a MapReduce job.
>> >> >>
>> >> >> root@nn:~# df -h
>> >> >> Filesystem                                       Size  Used Avail Use% Mounted on
>> >> >> /dev/xvda2                                       5.9G  5.9G     0 100% /
>> >> >> udev                                              98M  4.0K   98M   1% /dev
>> >> >> tmpfs                                             48M  192K   48M   1% /run
>> >> >> none                                             5.0M     0  5.0M   0% /run/lock
>> >> >> none                                             120M     0  120M   0% /run/shm
>> >> >> overflow                                         1.0M  4.0K 1020K   1% /tmp
>> >> >> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
>> >> >> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
>> >> >> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
>> >> >> root@nn:~#
>> >> >>
>> >> >>
>> >> >> I can see there is no space left on /dev/xvda2.
>> >> >>
>> >> >> How can I make Hadoop use the newly mounted /dev/xvda4? Or do I
>> >> >> need to move the files manually from /dev/xvda2 to xvda4?
>> >> >>
>> >> >>
>> >> >>
>> >> >> Thanks & Regards,
>> >> >>
>> >> >> Abdul Navaz
>> >> >> Research Assistant
>> >> >> University of Houston Main Campus, Houston TX
>> >> >> Ph: 281-685-0388
>> >> >>
>> >> >
>> >> >
>> >>
>> >
>>
>