Good morning Harsh,

Thanks for the late-night reply ;-)

>> Quick q: were some disks added later, as part of this datanode?

No new disks were added. I just planned to off-load data blocks from that
small partition to the other, bigger partitions, but it seems to me that
bringing down 130 nodes just to move blocks is something that needs to be
seriously considered, and later on, if I run the rebalancer, /hadoop1 will
just be filled back up again.
Is there any way to tell Hadoop to stop using _a partition_ once the free
space on that partition hits a certain limit? As far as I have researched,
it points to "dfs.datanode.du.reserved". In this case, if I set
dfs.datanode.du.reserved = (33G in bytes), will DFS continue using
/hadoop2, /hadoop3, ... but not place more blocks on /hadoop1?

Please suggest,
-Patai

On Tue, Oct 25, 2011 at 1:49 AM, Harsh J <[email protected]> wrote:
> Patai,
>
> 1. HDFS as the whole service.
> 2.1. Yes.
> 2.2. Yes, the directory parent must be current.
> 2.3. Yes you can move the whole subdirectory.
>
> Quick q: were some disks added later, as part of this datanode?
>
> On Tuesday, October 25, 2011, Patai Sangbutsarakum <[email protected]>
> wrote:
>> Hi All,
>>
>> I was looking into the FAQ, but I still have questions.
>> Datanodes in my production cluster are running low on space in one of
>> the dfs.data.dir directories:
>>
>> /dev/sda5  -->  355G  322G   33G  91%  /hadoop1  <----
>> /dev/sdb1  -->  484G  324G  161G  67%  /hadoop2
>> /dev/sdc1       484G  318G  167G  66%  /hadoop3
>>
>> /hadoop1 has had less space since the very beginning because its drive
>> is shared with the operating system.
>> I found one entry in the FAQ wiki page:
>>
>> "3.12. On an individual data node, how do you balance the blocks on
>> the disk?
>>
>> Hadoop currently does not have a method by which to do this
>> automatically. To do this manually:
>>
>> 1. Take down the HDFS
>> 2. Use the UNIX mv command to move the individual blocks and meta
>>    pairs from one directory to another on each host
>> 3. Restart the HDFS"
>>
>> Question on step 1, "take down the HDFS":
>> does that mean the whole cluster, OR just the datanode process on a
>> datanode/tasktracker host?
>>
>> Questions on step 2:
>>
>> 2.1 "moving blocks and meta pairs"
>>
>> Are the blk and meta pairs referring to:
>>
>> cd /hadoop1/data/current
>> $ ls -al *8816473533602921489*
>> -rw-rw-r-- 1 apps apps 1734467 Aug 27 21:03 blk_-8816473533602921489
>> -rw-rw-r-- 1 apps apps      63 Aug 27 21:03
>> blk_-8816473533602921489_78445781.meta
>>
>> ???
>>
>> 2.2 "from one directory to another on each host"
>>
>> Does a blk (and meta) pair from "current" have to land in the
>> "current" directory of another dfs.data.dir, e.g.
>>
>> mv /hadoop1/data/current/*8816473533602921489* /hadoop2/data/current/
>>
>> or can it be a different directory name on the destination side?
>>
>> 2.3 How about the subdirXX directories?
>>
>> under /hadoop1/data/current/
>> ....
>> 55G subdir36
>> 49G subdir37
>> ....
>>
>> It is so tempting to move subdir36 and subdir37 because they are huge.
>> Should it look like:
>>
>> mv /hadoop1/data/current/subdir36/* /hadoop2/data/current/subdir36/
>>
>> Well... /hadoop2/data/current/subdir36/ also holds a bunch of blk (and
>> meta) files and a bunch of subdirectories, which means that if I do
>> the move, there might be some collisions?
>>
>> Thanks in advance.
>> -P
>>
>
> --
> Harsh J
>
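[Editor's note: for reference, dfs.datanode.du.reserved is set in
hdfs-site.xml. A minimal sketch is below; the value is simply 33 GiB
expressed in bytes (33 * 1024^3 = 35433480192), matching the free space
quoted above for /hadoop1. Whether this setting makes the datanode skip
/hadoop1 while still writing to the larger volumes is exactly the open
question in the thread, and note the reservation applies to every
dfs.data.dir volume, not just one.]

```xml
<!-- Sketch only: reserves ~33 GiB of non-DFS space on EACH data volume -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>35433480192</value>
</property>
```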
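[Editor's note: the FAQ's manual steps discussed above can be sketched in
shell as follows. This is only an illustration: the paths and the block ID
mirror the listing quoted in the thread, the demo runs against a scratch
directory created with mktemp rather than a live dfs.data.dir, and on a
real node you would stop the datanode first (step 1) and restart it after
(step 3).]

```shell
#!/bin/sh
# Sketch of FAQ 3.12 step 2: move a block file and its .meta file as a
# pair, from the "current" directory of one dfs.data.dir to another.

move_blk_pair() {
  src_dir=$1; dst_dir=$2; blk_id=$3
  # Move both files in one command so the blk/meta pair stays together.
  # The meta filename embeds a generation stamp, hence the glob.
  mv "$src_dir"/blk_"$blk_id" \
     "$src_dir"/blk_"$blk_id"_*.meta \
     "$dst_dir"/
}

# Demo on a scratch layout instead of the real /hadoop1 and /hadoop2:
scratch=$(mktemp -d)
mkdir -p "$scratch/hadoop1/data/current" "$scratch/hadoop2/data/current"
touch "$scratch/hadoop1/data/current/blk_-8816473533602921489" \
      "$scratch/hadoop1/data/current/blk_-8816473533602921489_78445781.meta"

move_blk_pair "$scratch/hadoop1/data/current" \
              "$scratch/hadoop2/data/current" \
              -8816473533602921489
ls "$scratch/hadoop2/data/current"
```

[With a stopped datanode the same pattern applies to the real paths; the
datanode rescans its volumes on restart, which is why the FAQ requires
steps 1 and 3 around the move.]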
