On 2011-06-10, at 10:58 AM, David Noriega wrote:
> I was checking out zfsonlinux.org to see how things have been going
> lately and I had a question. Whats the difference, or whats better:
> Use a hardware raid5(or 6) or use zfs to create a raidz pool? In terms
> of Lustre, is one preferred over another?

ZFS much prefers to have direct access to the individual disks in a JBOD, instead of via h/w RAID-5/6. There are several reasons:

- it "knows" where the data and parity are located, and if there is an error reading data from disk it can retry with different data/parity combinations until the checksum matches, even trying single-bit error recovery in extreme cases
- it is easier to locate multiple copies of the metadata on different disks if it has direct access to the individual disks
- it has more IO queues and can schedule IO better for individual disks, keeping the IO queue relatively shallow so that read latency isn't hurt
- pooled storage, in theory, allows all space/bandwidth to be used by any thread doing IO. In practice this doesn't perform as well as in theory.
- no read-modify-write when writing "partial block" data (there isn't really such a thing as a "partial block write" for RAID-Z)

The main drawback is that RAID-Z needs a lot more effort when rebuilding a failed disk compared to a normal RAID-5/6. ZFS proponents will claim that "it only needs to rebuild the used parts of the filesystem", but most HPC filesystems are kept 70-80% full, so the RAID-Z overhead wipes out any advantage gained by not rebuilding the 20-30% of unused space.

See zfsonlinux.org/docs/SC10_BoF_ZFS_on_Linux_for_Lustre.pdf for some performance comparisons.

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
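
To illustrate the first reason above (retrying data/parity combinations until the checksum matches), here is a rough Python sketch. It is not ZFS code: the function names, the single XOR parity column, and the use of SHA-256 over the whole stripe are all assumptions made for the example. It only shows why a checksum kept separately from the data lets the reader work out which disk returned bad data, something parity alone cannot reveal.

```python
import hashlib
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def checksum(columns):
    """Hypothetical block checksum: SHA-256 over the concatenated data columns."""
    return hashlib.sha256(b"".join(columns)).digest()

def read_with_self_heal(columns, parity, expected):
    """Return (data_columns, repaired_index); repaired_index is None if no repair was needed."""
    if checksum(columns) == expected:
        return list(columns), None
    # Assume each data column in turn is the bad one, rebuild it from the
    # remaining columns plus parity, and keep the first result that verifies.
    for bad in range(len(columns)):
        others = [c for i, c in enumerate(columns) if i != bad]
        candidate = list(columns)
        candidate[bad] = xor_blocks(others + [parity])
        if checksum(candidate) == expected:
            return candidate, bad
    raise IOError("no single-column reconstruction matches the checksum")

# Example: three data "disks" plus one parity "disk"; disk 1 returns silent garbage.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
expected = checksum(data)
damaged = [data[0], b"BxBB", data[2]]
healed, bad = read_with_self_heal(damaged, parity, expected)
```

In this sketch `healed` equals the original `data` and `bad` identifies the column that was rebuilt. A controller that only has parity can see that the stripe is inconsistent, but has no checksum to tell it which column to distrust.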
