Eli Collins wrote:
Hey Mag,

You can bring down the datanode daemon, add the extra dfs.data.dir and
then restart. Since blocks are round-robined across directories, the
new directory will have lower utilization (once the other directories
are full it will start catching up). If that's not OK you can
re-balance the directories by hand with cp while the datanode is down
(before you restart it). If this takes you longer than 10 minutes the
blocks on that datanode will start getting re-replicated, but when you
bring the datanode back up the namenode will notice the
over-replicated blocks and remove them.
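The by-hand rebalance could be sketched roughly like this. In a real cluster you'd stop the datanode first, operate on actual dfs.data.dir entries (e.g. /data/1/dfs/dn), and restart within the 10-minute window; here the paths are throwaway temp dirs so the cp step can be tried safely, with stand-in files mimicking the blk_NNN / blk_NNN_N.meta layout under current/:

```shell
# Stand-ins for two dfs.data.dir entries (real ones might be
# /data/1/dfs/dn and /data/3/dfs/dn); created fresh so this is safe to run
old_dir=$(mktemp -d)/dn1
new_dir=$(mktemp -d)/dn2
mkdir -p "$old_dir/current/subdir0" "$new_dir/current"

# Stand-ins for a block file and its checksum metadata
touch "$old_dir/current/subdir0/blk_1001" \
      "$old_dir/current/subdir0/blk_1001_1.meta"

# Copy a whole block subdir to the emptier disk, then remove the
# original; -a preserves permissions and timestamps
cp -a "$old_dir/current/subdir0" "$new_dir/current/"
rm -r "$old_dir/current/subdir0"

ls "$new_dir/current/subdir0"   # lists blk_1001 and blk_1001_1.meta
```

The key point is to move block files together with their .meta files and keep the subdir layout intact, so the datanode finds everything on restart.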

That brings up a couple of issues I've been thinking about now that workers can go to 6+ HDDs/node:

* A way to measure the distribution across disks, rather than just nodes. DfsClient doesn't provide enough info here yet.
* A way to trigger some rebalancing on a single node, to say "position stuff more fairly". You don't need to worry about network traffic, just local disk load and CPU time, so it should be simpler.
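Until something like that exists, a rough stopgap for the first point is just to poll the filesystem behind each configured directory from cron (the directory list below is illustrative; substitute your actual dfs.data.dir entries):

```shell
# Print percent-used of the filesystem backing each data directory.
# /tmp and /var/tmp stand in for real dfs.data.dir entries here.
for d in /tmp /var/tmp; do
  df -P "$d" | awk -v dir="$d" 'NR==2 {print dir, $5}'
done
```

This only sees whole filesystems, not HDFS's own per-directory accounting, but it's enough to spot a badly skewed disk.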
