Runping Qi wrote:
Hi,
We at Yahoo did some Hadoop benchmarking experiments on clusters with JBOD
and RAID0. We found that under heavy loads (such as gridmix), the JBOD
cluster performed better.
Gridmix tests:
Load: gridmix2
Cluster size: 190 nodes
Test results:
RAID0: 75 minutes
JBOD: 67 minutes
Difference: 10%
Tests on HDFS write performance:
We ran map-only jobs writing data to DFS concurrently on different clusters.
The overall DFS write throughput on the JBOD cluster was 30% (with a 58-node
cluster) and 50% (with an 18-node cluster) better than that on the RAID0
cluster.
To understand why, we did some file-level benchmarking on both clusters.
We found that the file write throughput on a JBOD machine is 30% higher than
that on a comparable machine with RAID0. This performance difference may be
explained by the fact that the throughput of different disks can vary by 30%
to 50%. With such variations, the overall throughput of a RAID0 system may
be bottlenecked by the slowest disk.
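
As a rough illustration of that bottleneck argument, here is a minimal
sketch of a deliberately simplified model. It assumes a RAID0 stripe
completes only when its slowest disk finishes, and that JBOD keeps every
disk busy at its own rate. The function names and the per-disk numbers are
hypothetical, not measurements from these clusters.

```python
def jbod_throughput(disk_mb_s):
    # Independent disks: each is written at its own speed, so the rates add up.
    return sum(disk_mb_s)


def raid0_throughput(disk_mb_s):
    # Striping sends equal-sized chunks to every disk, so in this model each
    # stripe completes at the pace of the slowest disk.
    return len(disk_mb_s) * min(disk_mb_s)


# Hypothetical per-disk write rates in MB/s, with the fastest disk 40%
# quicker than the slowest (inside the 30% to 50% spread described above).
disks = [50.0, 60.0, 65.0, 70.0]

jbod = jbod_throughput(disks)
raid0 = raid0_throughput(disks)
advantage = (jbod - raid0) / raid0

print(f"JBOD:  {jbod:.1f} MB/s")
print(f"RAID0: {raid0:.1f} MB/s")
print(f"JBOD advantage: {advantage:.1%}")
```

The model ignores controller overhead, caching and scheduling, so it only
shows the direction of the effect, not its real size.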
-- Runping
This is really interesting. Thank you for sharing these results!
Presumably the servers were all set up with "nominally" homogeneous
hardware? And yet still the variations existed. That would be something
to experiment with on new versus old clusters to see if it gets worse
over time.
Here we have a batch of desktop workstations all bought at the same
time, to the same spec, but one of them, "lucky", is more prone to race
conditions than any of the others. We don't know why, and assume it's to do
with the (multiple) Xeon CPU chips being at different ends of the bell
curve or something. All we know is: test on that box before shipping to
find race conditions early.
-steve