I wonder what produced this imbalance in the first place?

2012/3/17 Zizon Qiu <zzd...@gmail.com>
> If there are only dfs files under /data and /data2, it will be OK when
> they fill up.
> But if there are other files on those disks, like a mapreduce temp
> folder or even a namenode image, a full disk may break the cluster (the
> namenode cannot write a checkpoint, or the mapreduce framework cannot
> continue with no disk space for intermediate files).
>
> 1) bring down HDFS and just manually move ~50% of the
> /data/dfs/dn/current/subdir* directories over to /data2 and then bring
> HDFS back up
>
> Moving the files around may work, but I'm not sure.
> The datanode MAY report the updated block locations back to the
> namenode.
>
> But
>
> 2) bring a data node down one at a time, clean out /data and /data2,
> put the node back into rotation and let the balancer distribute
> replication data back onto the node and since it will round robin to
> both (now empty) disks, I will wind up with a nicely balanced data
> node. Repeat this process for the remaining nodes.
>
> This works fine.
>
> Tips:
> You may configure *dfs.datanode.du.reserved* to set up a per-volume
> quota on each datanode, but take care with the formula hadoop uses to
> calculate the free disk space.
>
> On Sat, Mar 17, 2012 at 8:57 PM, Tom Wilberding <t...@wilberding.com> wrote:
>
>> Hi there,
>>
>> Our data nodes all have 2 disks, one of which is nearly full and one
>> of which is nearly empty:
>>
>> $ df -h
>> Filesystem                       Size  Used  Avail  Use%  Mounted on
>> /dev/mapper/VolGroup00-LogVol00  120G   11G   104G    9%  /
>> /dev/cciss/c0d0p1                 99M   35M    60M   37%  /boot
>> tmpfs                            7.9G     0   7.9G    0%  /dev/shm
>> /dev/cciss/c0d1                  1.8T  1.7T   103G   95%  /data
>> /dev/cciss/c0d2                  1.8T   76G   1.8T    5%  /data2
>>
>> Reading through the docs and mailing list archives, my understanding
>> is that HDFS will continue to round robin writes to both disks until
>> /data is completely full and will then write only to /data2. Is this
>> correct? Does it really write until the disk is 100% full (or as close
>> to full as possible)?
>>
>> Ignoring the performance implications and the monitoring hassle of
>> full disks, I just want to be sure that nothing bad is going to happen
>> over the next couple of days as we fill up that /data partition.
>>
>> I understand that my two best options for rebalancing each data node
>> would be to either:
>> 1) bring down HDFS and just manually move ~50% of the
>> /data/dfs/dn/current/subdir* directories over to /data2 and then bring
>> HDFS back up, or
>> 2) bring a data node down one at a time, clean out /data and /data2,
>> put the node back into rotation and let the balancer distribute
>> replication data back onto the node; since it will round robin to both
>> (now empty) disks, I will wind up with a nicely balanced data node.
>> Repeat this process for the remaining nodes.
>>
>> I'm relatively new to HDFS, so can someone please confirm whether what
>> I'm saying is correct? Any tips, tricks or things to watch out for
>> would also be greatly appreciated.
>>
>> Thanks,
>> Tom
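A minimal sketch of option 1, assuming a Hadoop 1.x-style layout where
both volumes are listed in dfs.data.dir and blocks sit under
/dataN/dfs/dn/current/subdir*. The daemon script path and the even-digit
split are illustrative assumptions, not a tested procedure; try it on
one node first.

# Stop the datanode so no blocks are written while files move.
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode

# Move roughly half of the block subdirectories (here: those whose
# names end in an even digit) over to the empty volume.
for d in /data/dfs/dn/current/subdir*[02468]; do
  mv "$d" /data2/dfs/dn/current/
done

# On restart the datanode rescans its volumes and sends a fresh block
# report, which SHOULD give the namenode the new locations -- the
# "MAY" caveat above applies, so verify before doing the other nodes.
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode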
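And a sketch of option 2 under the same assumptions. This is only safe
one node at a time with replication >= 2; wait for under-replicated
blocks to clear (hadoop fsck / or the namenode web UI) before wiping
the next node.

# Stop the datanode and wipe the contents of both data directories.
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
rm -rf /data/dfs/dn/* /data2/dfs/dn/*

# Bring it back into rotation: it re-registers with two empty volumes
# and new writes round robin across both.
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode

# Run the balancer to move existing replicas back onto the emptied
# node. -threshold is the allowed deviation from the cluster's average
# utilization, in percent.
$HADOOP_HOME/bin/hadoop balancer -threshold 10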
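On the dfs.datanode.du.reserved tip: the value is in bytes and is
applied to each volume in dfs.data.dir separately, not to the node as a
whole. A sketch for hdfs-site.xml, with 100 GB per volume as a purely
illustrative figure:

<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- bytes reserved per volume for non-HDFS use (100 GB here) -->
  <value>107374182400</value>
</property>

The formula caveat: the datanode computes a volume's remaining space as
roughly min(capacity - reserved - DFS used, actual free on disk), so
non-HDFS files consume the reserved headroom rather than being fenced
off by it; treat it as a soft reservation, not a hard quota.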