I wonder what produced this imbalance in the first place?

2012/3/17 Zizon Qiu <zzd...@gmail.com>
> If there are only dfs files under /data and /data2, it will be OK when
> they fill up.
> But if there are other files on those disks, like a mapreduce temp
> folder or even a namenode image, a full disk may break the cluster (the
> namenode cannot write a checkpoint, or the mapreduce framework cannot
> continue with no disk space for intermediate files).
>
> 1) bring down HDFS and just manually move ~50% of the
> /data/dfs/dn/current/subdir* directories over to /data2 and then bring
> HDFS back up
>
> Moving the files around may work, but I'm not sure.
> The datanode MAY report the updated block locations back to the
> namenode.
>
> But
>
> 2) bring a data node down one at a time, clean out /data and /data2,
> put the node back into rotation and let the balancer distribute
> replication data back onto the node and since it will round robin to
> both (now empty) disks, I will wind up with a nicely balanced data
> node. Repeat this process for the remaining nodes.
>
> This works fine.
>
> Tips:
> You may configure *dfs.datanode.du.reserved* to set up a per-volume
> quota on each datanode, but take care with the formula hadoop uses to
> calculate the free disk space.
>
> On Sat, Mar 17, 2012 at 8:57 PM, Tom Wilberding <t...@wilberding.com> wrote:
>
>> Hi there,
>>
>> Our data nodes all have 2 disks, one of which is nearly full and one
>> of which is nearly empty:
>>
>> $ df -h
>> Filesystem                       Size  Used  Avail  Use%  Mounted on
>> /dev/mapper/VolGroup00-LogVol00  120G   11G   104G    9%  /
>> /dev/cciss/c0d0p1                 99M   35M    60M   37%  /boot
>> tmpfs                            7.9G     0   7.9G    0%  /dev/shm
>> /dev/cciss/c0d1                  1.8T  1.7T   103G   95%  /data
>> /dev/cciss/c0d2                  1.8T   76G   1.8T    5%  /data2
>>
>> Reading through the docs and mailing list archives, my understanding
>> is that HDFS will continue to round robin writes to both disks until
>> /data is completely full and will then write only to /data2. Is this
>> correct? Does it really write until the disk is 100% full (or as close
>> to full as possible)?
>>
>> Ignoring the performance implications and the monitoring hassle of
>> full disks, I just want to be sure that nothing bad is going to happen
>> over the next couple of days as we fill up that /data partition.
>>
>> I understand that my two best options for rebalancing each data node
>> would be to either:
>> 1) bring down HDFS and just manually move ~50% of the
>> /data/dfs/dn/current/subdir* directories over to /data2 and then bring
>> HDFS back up, or
>> 2) bring a data node down one at a time, clean out /data and /data2,
>> put the node back into rotation and let the balancer distribute
>> replication data back onto the node; since it will round robin to both
>> (now empty) disks, I will wind up with a nicely balanced data node.
>> Repeat this process for the remaining nodes.
>>
>> I'm relatively new to HDFS, so can someone please confirm whether what
>> I'm saying is correct? Any tips, tricks or things to watch out for
>> would also be greatly appreciated.
>>
>> Thanks,
>> Tom
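A minimal sketch of option 1, assuming a Hadoop 1.x-style layout where
both volumes are listed in dfs.data.dir and blocks sit under
/dataN/dfs/dn/current/subdir*. The daemon script path and the even-digit
split are illustrative assumptions, not a tested procedure; try it on
one node first.

# Stop the datanode so no blocks are written while files move.
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode

# Move roughly half of the block subdirectories (here: those whose
# names end in an even digit) over to the empty volume.
for d in /data/dfs/dn/current/subdir*[02468]; do
  mv "$d" /data2/dfs/dn/current/
done

# On restart the datanode rescans its volumes and sends a fresh block
# report, which SHOULD give the namenode the new locations -- the
# "MAY" caveat above applies, so verify before doing the other nodes.
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode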
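And a sketch of option 2 under the same assumptions. This is only safe
one node at a time with replication >= 2; wait for under-replicated
blocks to clear (hadoop fsck / or the namenode web UI) before wiping
the next node.

# Stop the datanode and wipe the contents of both data directories.
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
rm -rf /data/dfs/dn/* /data2/dfs/dn/*

# Bring it back into rotation: it re-registers with two empty volumes
# and new writes round robin across both.
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode

# Run the balancer to move existing replicas back onto the emptied
# node. -threshold is the allowed deviation from the cluster's average
# utilization, in percent.
$HADOOP_HOME/bin/hadoop balancer -threshold 10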
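On the dfs.datanode.du.reserved tip: the value is in bytes and is
applied to each volume in dfs.data.dir separately, not to the node as a
whole. A sketch for hdfs-site.xml, with 100 GB per volume as a purely
illustrative figure:

<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- bytes reserved per volume for non-HDFS use (100 GB here) -->
  <value>107374182400</value>
</property>

The formula caveat: the datanode computes a volume's remaining space as
roughly min(capacity - reserved - DFS used, actual free on disk), so
non-HDFS files consume the reserved headroom rather than being fenced
off by it; treat it as a soft reservation, not a hard quota.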