Thanks Aitor. That is my observation too.
I added a new disk location and manually moved some files. But if two locations are given for dfs.datanode.data.dir from the beginning, will Hadoop balance the disk usage, even if not perfectly, since file sizes may differ?

On 9/29/14, Aitor Cedres <[email protected]> wrote:
> Hi Susheel,
>
> Adding a new directory to "dfs.datanode.data.dir" will not balance your
> disks straight away. Eventually, through HDFS activity (deleting/invalidating
> some blocks, writing new ones), the disks will become balanced. If you want
> to balance them right after adding the new disk and changing the
> "dfs.datanode.data.dir" value, you have to shut down the DN and manually
> move (mv) some files from the old directory to the new one.
>
> The balancer will try to balance the usage between HDFS nodes, but it won't
> care about "internal" node disk utilization. For your particular case, the
> balancer won't fix your issue.
>
> Hope it helps,
> Aitor
>
> On 29 September 2014 05:53, Susheel Kumar Gadalay <[email protected]>
> wrote:
>
>> You mean if multiple directory locations are given, Hadoop will
>> balance the distribution of files across these different directories.
>>
>> But normally we start with one directory location, and once it is
>> reaching the maximum, we add a new directory.
>>
>> In this case how can we balance the distribution of files?
>>
>> One way is to list the files and move them.
>>
>> Will the start-balancer script work?
>>
>> On 9/27/14, Alexander Pivovarov <[email protected]> wrote:
>> > It can read/write in parallel to all drives. More HDDs, more I/O speed.
>> > On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay" <[email protected]>
>> > wrote:
>> >
>> >> Correct me if I am wrong.
>> >>
>> >> Adding multiple directories will not balance the file distribution
>> >> across these locations.
>> >>
>> >> Hadoop will exhaust the first directory and then start using the
>> >> next, and the next...
>> >>
>> >> How can I tell Hadoop to evenly balance across these directories?
>> >>
>> >> On 9/26/14, Matt Narrell <[email protected]> wrote:
>> >> > You can add a comma-separated list of paths to the
>> >> > "dfs.datanode.data.dir" property in your hdfs-site.xml
>> >> >
>> >> > mn
>> >> >
>> >> > On Sep 26, 2014, at 8:37 AM, Abdul Navaz <[email protected]>
>> >> > wrote:
>> >> >
>> >> >> Hi
>> >> >>
>> >> >> I am facing a space issue when saving files into HDFS and/or
>> >> >> running map reduce jobs.
>> >> >>
>> >> >> root@nn:~# df -h
>> >> >> Filesystem                                       Size  Used Avail Use% Mounted on
>> >> >> /dev/xvda2                                       5.9G  5.9G     0 100% /
>> >> >> udev                                              98M  4.0K   98M   1% /dev
>> >> >> tmpfs                                             48M  192K   48M   1% /run
>> >> >> none                                             5.0M     0  5.0M   0% /run/lock
>> >> >> none                                             120M     0  120M   0% /run/shm
>> >> >> overflow                                         1.0M  4.0K 1020K   1% /tmp
>> >> >> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
>> >> >> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
>> >> >> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
>> >> >> root@nn:~#
>> >> >>
>> >> >> I can see there is no space left on /dev/xvda2.
>> >> >>
>> >> >> How can I make Hadoop see the newly mounted /dev/xvda4? Or do I
>> >> >> need to move the files manually from /dev/xvda2 to xvda4?
>> >> >>
>> >> >> Thanks & Regards,
>> >> >>
>> >> >> Abdul Navaz
>> >> >> Research Assistant
>> >> >> University of Houston Main Campus, Houston TX
>> >> >> Ph: 281-685-0388
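For reference, Matt's suggestion of a comma-separated list in hdfs-site.xml would look something like the fragment below. The mount-point paths are examples only, not taken from this thread:

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Example paths; substitute your actual data-disk mount points -->
  <value>/mnt/disk1/dfs/data,/mnt/disk2/dfs/data</value>
</property>
```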

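A minimal sketch of the "stop the DataNode, then mv" step Aitor describes, simulated against temporary directories so it can be run safely. All paths, the block-pool ID, and the block file names are illustrative assumptions, not values from this thread; a real layout is <data.dir>/current/<block-pool-id>/current/finalized/subdirN/...:

```shell
#!/bin/sh
# 0. On a real cluster, stop the DataNode before touching block files,
#    e.g. hadoop-daemon.sh stop datanode  (command varies by distro).

# Simulate two configured data directories (illustrative paths):
OLD=/tmp/dn-balance-demo/data1
NEW=/tmp/dn-balance-demo/data2
BP=BP-demo-1   # hypothetical block-pool ID

mkdir -p "$OLD/current/$BP/current/finalized/subdir0"
mkdir -p "$NEW/current/$BP/current/finalized"

# A block file and its checksum metadata always travel together:
touch "$OLD/current/$BP/current/finalized/subdir0/blk_1001"
touch "$OLD/current/$BP/current/finalized/subdir0/blk_1001_1.meta"

# Move a whole subdir; keeping the directory structure identical on the
# destination disk lets the DN find the blocks again on restart:
mv "$OLD/current/$BP/current/finalized/subdir0" \
   "$NEW/current/$BP/current/finalized/"

# 1. Then restart: hadoop-daemon.sh start datanode
```

The same idea scales to real data: move whole `subdirN` directories (block plus .meta files together) between disks until usage is roughly even, then restart the DN so it rescans its volumes.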