Dear hadoopers,

Has anyone been confronted with deploying a cluster in a traditional IT shop whose admins handle thousands of servers? They traditionally use SAN or NAS storage for application data, rely on RAID 1 for system disks, and in the few cases where internal disks are used, they configure them with RAID 5 provided by the internal HW controller.

Using a JBOD setup, as advised in each and every Hadoop doc I ever laid my hands on, means that each HDD failure will require, on top of the physical replacement of the drive, that an admin perform at least an mkfs. Add to that the fact that these operations will become more frequent since more internal disks will be used, and it can be perceived as an annoying disruption in the industrial handling of numerous servers.
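One thing that may soften the operational pain: HDFS can be told to keep a DataNode running despite a few failed volumes, so the mkfs/remount doesn't have to happen immediately after every drive death. A minimal hdfs-site.xml sketch (property names are the standard HDFS ones; the paths and tolerance value are just illustrative assumptions):

```xml
<!-- hdfs-site.xml fragment (illustrative paths and values) -->
<property>
  <!-- One directory per JBOD disk; HDFS stripes blocks across them -->
  <name>dfs.datanode.data.dir</name>
  <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn</value>
</property>
<property>
  <!-- Keep the DataNode alive with up to this many dead volumes,
       instead of shutting down on the first disk failure -->
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>1</value>
</property>
```

With a tolerance greater than zero, disk replacements can be batched into a regular maintenance window rather than treated as incidents, which is closer to how large shops like to operate.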

In Tom White's guide there is a discussion of RAID 0, stating that Yahoo benchmarks showed a 10% loss in performance, so we can expect even worse performance with RAID 5, but I found no figures.

I also found a Hortonworks interview of StackIQ, which provides software to automate such failure remediation. But it would be rather painful to commit straight away to another solution, with a contract and so on, while just starting with Hadoop.

Please share your experiences with RAID for redundancy (1, 5 or other) in Hadoop configurations.

Thank you
Ulul
