Hi,
I have an HDFS cluster consisting of several hosts.
On each node, I add a new disk when the current capacity is close to full.
Right now, every server has roughly the following distribution of data:
/dev/sdf  493G  468G   51M  100%  /data1
/dev/sdg  493G  468G   51M  100%  /data2
/dev/sdh  493G  103G  365G   22%  /data3
/dev/sdi  493G  100G  368G   22%  /data4
So /dev/sdf and /dev/sdg are almost 100% full, while there is lots of
free space on /dev/sdh and /dev/sdi.
Disks which are 100% full don't make monitoring very happy.
Is it possible to rebalance data across the disks within one HDFS
server (or within each of several servers)?
"hadoop balancer" rebalances data between the servers, but not between
the disks on one server.
--
Tomasz Chmielewski
http://wpkg.org