Hello all,
I'd like to push the datanode's capability to handle multiple data
directories to a somewhat extreme degree, and get feedback on how well
this might work.
We have a few large RAID servers (12 to 48 disks) which we'd like to
transition to Hadoop. I'd like to mount each of the disks
individually (i.e., /mnt/disk1, /mnt/disk2, ....) and take advantage
of Hadoop's replication - instead of paying the overhead of setting up
a RAID and then still paying the overhead of replication on top of it.
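For reference, a minimal sketch of what that configuration might look
like, assuming the standard dfs.data.dir property (the mount paths and
subdirectory names are just examples, not a recommendation):

```xml
<!-- hdfs-site.xml: point the datanode at each disk's mount point.
     The datanode spreads new blocks across the listed directories. -->
<property>
  <name>dfs.data.dir</name>
  <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data,/mnt/disk3/hdfs/data</value>
</property>
```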
However, we're a bit concerned about how well Hadoop might handle one
of the directories disappearing out from underneath it. If a single
volume, say /mnt/disk1, starts returning I/O errors, is Hadoop smart
enough to figure out that the whole volume is broken? Or will we have
to restart the datanode after every disk failure so that it rescans
the directory and realizes everything there is broken? What happens if
you start up the datanode with a data directory that it can't write
into?
Is anyone running in this fashion (i.e., multiple data directories
corresponding to different disk volumes ... even better if you're
doing it with more than a few disks)?
Brian
- Datanode handling of single disk failure Brian Bockelman