On Thu, Jul 3, 2008 at 3:09 PM, Aaron Blew <[EMAIL PROTECTED]> wrote:
> My take is that since RAID-Z creates a stripe for every block
> (http://blogs.sun.com/bonwick/entry/raid_z), it should be able to
> rebuild the bad sectors on a per block basis.  I'd assume that the
> likelihood of having bad sectors on the same places of all the disks
> is pretty low since we're only reading the sectors related to the
> block being rebuilt.  It also seems that fragmentation would work in
> your favor here since the stripes would be distributed across more of
> the platter(s), hopefully protecting you from a wonky manufacturing
> defect that causes UREs on the same place on the disk.
>
> -Aaron

The per-block point above is important: ZFS rebuilds only the blocks
that hold data.  A 100 TB pool with 1 GB in use will rebuild 1 GB.  As
such, the exposure to read errors during a rebuild depends on the
amount of data rather than the size of the RAID device.  A periodic
zpool scrub will likely turn up read errors before you are hit with a
drive failure AND unrelated read errors at the same time.
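
To put rough numbers on that, here is a minimal sketch of how the
chance of hitting at least one URE grows with the amount of data read
during a rebuild.  It assumes independent bit errors and a spec-sheet
rate of 1 error per 1e14 bits read; both are assumptions for
illustration, not figures from this thread, and p_ure is a made-up
helper name.

```python
import math

def p_ure(bytes_read: float, ber: float = 1e-14) -> float:
    """Chance of at least one unrecoverable read error (URE) while
    reading bytes_read bytes, assuming independent bit errors at rate
    ber (assumed spec-sheet value: 1 error per 1e14 bits)."""
    bits = bytes_read * 8
    # 1 - (1 - ber)**bits, written with log1p/expm1 so that tiny
    # probabilities do not get lost to floating-point rounding.
    return -math.expm1(bits * math.log1p(-ber))

GB = 1e9
TB = 1e12

# Rebuild that only has to read 1 GB of allocated blocks.
used_only = p_ure(1 * GB)

# Rebuild that has to read a whole 12 TB volume, used or not.
whole_volume = p_ure(12 * TB)

print(f"1 GB read:  {used_only:.6f}")
print(f"12 TB read: {whole_volume:.3f}")
```

Under those assumptions the 1 GB case comes out well below one in ten
thousand, while the 12 TB case is better than even odds of at least one
URE.  Real drives do not fail this neatly, so treat it as an
order-of-magnitude illustration only.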

Since ZFS merges the volume management and file system layers, such an
uncorrectable read would turn into ZFS saying "file /a/b/c is corrupt
- you need to restore it" rather than traditional RAID5 saying "this
12 TB volume is corrupt - restore it".  ZFS already keeps multiple
copies of metadata, so if you were "lucky" and the corruption hit
metadata, it should be able to get a working copy from elsewhere.

Of course, raidz2 further decreases your chances of losing data.  I
would highly recommend reading Richard Elling's comments in this area.
For example:

http://blogs.sun.com/relling/entry/raid_recommendations_space_vs_mttdl
http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance
http://blogs.sun.com/relling/entry/a_story_of_two_mttdl
http://opensolaris.org/jive/thread.jspa?threadID=65564#255257

-- 
Mike Gerdts
http://mgerdts.blogspot.com/