Re: [zfs-discuss] Why RAID 5 stops working in 2009

2008-08-18 Thread Roch - PAE
Kyle McDonald writes:
  Ross wrote:
   Just re-read that and it's badly phrased.  What I meant to say is that a
   raid-z / raid-5 array based on 500GB drives seems to have around a 1 in 10
   chance of losing some data during a full rebuild.


 
  Actually, I think it's been explained already why this is one area
  where RAID-Z will really start to show some of the ways it's
  different from its RAID-5 ancestors. For one, a RAID-5 controller has
  no idea of the filesystem, and therefore has to rebuild every bit on
  the disk, whether it's used or not; if it can't, it will declare the
  whole array unusable. RAID-Z, on the other hand, since it is
  integrated with the filesystem, only needs to rebuild the *used*
  data, and won't care if unused parts of the disks can't be rebuilt.
  
  Second, a factor that the author of that article leaves out is that
  decent RAID-5 implementations, and RAID-Z, can do 'scrubs' of the
  data at regular intervals, and this will often catch and deal with
  these read problems well before they have a chance to take all your
  data with them. The types of errors the author writes about are often
  caused by how accurately the block was written rather than by a
  defect in the media, so they can often be fixed by just rewriting the
  data to the same block. On ZFS this rewrite-in-place will almost
  never happen: because of copy-on-write, it will always choose a new
  block to write to. I don't think many (if any) RAID-5 implementations
  can change the location of data on a drive.
  

Moreover, ZFS stores redundant copies of metadata, so even if a full
raid-z stripe goes south, we can still rebuild most of the pool data.
It seems that, at worst, such double failures would lead to a handful
of unrecovered files.

-r


   -Kyle
  



Re: [zfs-discuss] Why RAID 5 stops working in 2009

2008-07-05 Thread Ross
I've read various articles along those lines.  My understanding is that a 500GB-odd
raid-z / raid-5 array has around a 1 in 10 chance of losing at least some
data during a rebuild.
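
For what it's worth, that 1-in-10 figure is about what a back-of-the-envelope
calculation gives, if you assume the 1-per-10^14-bits unrecoverable read error
rate quoted on typical consumer SATA drive datasheets and a four-drive array of
500GB disks (so three surviving drives to read during the rebuild):

  bits read during rebuild = 3 drives x 500GB x 8 bits/byte ~= 1.2 x 10^13 bits
  P(at least one URE)      = 1 - (1 - 10^-14)^(1.2 x 10^13)
                          ~= 1 - e^(-0.12)
                          ~= 0.11, i.e. roughly 1 in 9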

I've had raid-5 arrays fail at least 4 times, twice during a rebuild.  In most
cases I've been able to recover the data (once by re-attaching the original
failed drive, since it proved more reliable than the 2nd one that failed).
However, on more than one occasion I've had to revert to backups.  Raid-6 was
something I waited a long time for.

Now I use dual parity for everything I buy.  At home I've a six-drive raid-z2
box; at work the main server is a 16-drive 2-way mirror setup.  When using SATA
drives, capacity is cheap enough (that work server is still 2.5TB for around
£2,500) and the peace of mind, particularly on the company servers, is worth
every penny.
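
If it helps, the layouts I'm describing are created along these lines - the
device names here are just placeholders for whatever disks you actually have:

  # six-drive double-parity set, like my home box
  zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

  # striped two-way mirrors, like the work server (keep adding pairs)
  zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0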

If you're stuck with single-parity raid-z, my advice would be to simply take a
good set of backups and leave it at that until you can upgrade to dual parity.
At the end of the day, the risk is relatively slight, and your data's probably
at as much risk if you try to pro-actively replace a drive as if you just
replace one when it fails.

Just scrub every so often, and make sure you've got good backups.  I don't 
expect you'll see too many problems.
 
 


Re: [zfs-discuss] Why RAID 5 stops working in 2009

2008-07-05 Thread Ross
Just re-read that and it's badly phrased.  What I meant to say is that a raid-z
/ raid-5 array based on 500GB drives seems to have around a 1 in 10 chance of
losing some data during a full rebuild.
 
 


Re: [zfs-discuss] Why RAID 5 stops working in 2009

2008-07-05 Thread Kyle McDonald
Ross wrote:
 Just re-read that and it's badly phrased.  What I meant to say is that a
 raid-z / raid-5 array based on 500GB drives seems to have around a 1 in 10
 chance of losing some data during a full rebuild.
  
  
   
Actually, I think it's been explained already why this is one area
where RAID-Z will really start to show some of the ways it's
different from its RAID-5 ancestors. For one, a RAID-5 controller has
no idea of the filesystem, and therefore has to rebuild every bit on
the disk, whether it's used or not; if it can't, it will declare the
whole array unusable. RAID-Z, on the other hand, since it is
integrated with the filesystem, only needs to rebuild the *used*
data, and won't care if unused parts of the disks can't be rebuilt.

Second, a factor that the author of that article leaves out is that
decent RAID-5 implementations, and RAID-Z, can do 'scrubs' of the
data at regular intervals, and this will often catch and deal with
these read problems well before they have a chance to take all your
data with them. The types of errors the author writes about are often
caused by how accurately the block was written rather than by a
defect in the media, so they can often be fixed by just rewriting the
data to the same block. On ZFS this rewrite-in-place will almost
never happen: because of copy-on-write, it will always choose a new
block to write to. I don't think many (if any) RAID-5 implementations
can change the location of data on a drive.
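
As a rough illustration (disk names made up), swapping out a failed drive
kicks off a resilver that only walks the allocated blocks, and you can watch
it finish with zpool status:

  # replace the failed disk with a new one; only live data gets resilvered
  zpool replace tank c1t3d0 c1t4d0

  # progress shows up in the pool's status until the resilver completes
  zpool status tank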

 -Kyle



[zfs-discuss] Why RAID 5 stops working in 2009

2008-07-03 Thread Jim
Anyone here read the article "Why RAID 5 stops working in 2009" at
http://blogs.zdnet.com/storage/?p=162

Does RAIDZ have the same chance of an unrecoverable read error as RAID5 on
Linux if the RAID has to be rebuilt because of a faulty disk?  I imagine so,
because of the physical constraints that plague our hard drives.  Granted, the
chance of failure in my case shouldn't be nearly as high, as I will most likely
recruit three or four 750GB drives - not something on the order of 10TB.

With my OpenSolaris NAS, I will be scrubbing every week (for consumer-grade
drives; every month for enterprise-grade) as recommended in the ZFS best
practices guide.  If I run zpool status and see that scrubs are increasingly
fixing errors, would that mean that the disk is in fact headed towards failure,
or perhaps that the natural growth of disk usage is to blame?
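
Something like this is what I have in mind - a root crontab entry for the
weekly scrub, then a look at the error counters afterwards (the pool name is
just an example):

  # run a scrub every Sunday at 03:00
  0 3 * * 0 /usr/sbin/zpool scrub tank

  # later, check the READ/WRITE/CKSUM counters and any reported errors
  zpool status -v tank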
 
 


Re: [zfs-discuss] Why RAID 5 stops working in 2009

2008-07-03 Thread Aaron Blew
My take is that since RAID-Z creates a stripe for every block
(http://blogs.sun.com/bonwick/entry/raid_z), it should be able to
rebuild the bad sectors on a per-block basis.  I'd assume that the
likelihood of having bad sectors in the same places on all the disks
is pretty low, since we're only reading the sectors related to the
block being rebuilt.  It also seems that fragmentation would work in
your favor here, since the stripes would be distributed across more of
the platter(s), hopefully protecting you from a wonky manufacturing
defect that causes UREs in the same place on every disk.

-Aaron


On Thu, Jul 3, 2008 at 12:24 PM, Jim [EMAIL PROTECTED] wrote:
 Anyone here read the article "Why RAID 5 stops working in 2009" at
 http://blogs.zdnet.com/storage/?p=162

 Does RAIDZ have the same chance of an unrecoverable read error as RAID5 on
 Linux if the RAID has to be rebuilt because of a faulty disk?  I imagine so,
 because of the physical constraints that plague our hard drives.  Granted,
 the chance of failure in my case shouldn't be nearly as high, as I will most
 likely recruit three or four 750GB drives - not something on the order of
 10TB.

 With my OpenSolaris NAS, I will be scrubbing every week (for consumer-grade
 drives; every month for enterprise-grade) as recommended in the ZFS best
 practices guide.  If I run zpool status and see that scrubs are increasingly
 fixing errors, would that mean that the disk is in fact headed towards
 failure, or perhaps that the natural growth of disk usage is to blame?





Re: [zfs-discuss] Why RAID 5 stops working in 2009

2008-07-03 Thread Mike Gerdts
On Thu, Jul 3, 2008 at 3:09 PM, Aaron Blew [EMAIL PROTECTED] wrote:
 My take is that since RAID-Z creates a stripe for every block
 (http://blogs.sun.com/bonwick/entry/raid_z), it should be able to
 rebuild the bad sectors on a per-block basis.  I'd assume that the
 likelihood of having bad sectors in the same places on all the disks
 is pretty low, since we're only reading the sectors related to the
 block being rebuilt.  It also seems that fragmentation would work in
 your favor here, since the stripes would be distributed across more of
 the platter(s), hopefully protecting you from a wonky manufacturing
 defect that causes UREs in the same place on every disk.

 -Aaron

The per-block statement above is important - ZFS will only rebuild the
blocks that have data.  A 100TB pool with 1 GB in use will rebuild 1
GB.  As such, it is more a factor of the amount of data than of the
size of the RAID device.  A periodic zpool scrub will likely turn up
read errors before you have both a drive failure AND unrelated read
errors.

Since ZFS merges the volume management and file system layers, such an
uncorrectable read turns into ZFS saying "file /a/b/c is corrupt -
you need to restore it" rather than traditional RAID-5 saying "this
12 TB volume is corrupt - restore it".  ZFS already makes multiple
copies of metadata, so if the corruption happens to hit metadata, it
should still be able to get a working copy from elsewhere.
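
In practice that shows up in zpool status -v with output along these lines
(pool and file names invented for illustration):

  # zpool status -v tank
  ...
  errors: Permanent errors have been detected in the following files:

          /tank/a/b/c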

Of course, raidz2 further decreases your chances of losing data.  I
would highly recommend reading Richard Elling's comments in this area.
For example:

http://blogs.sun.com/relling/entry/raid_recommendations_space_vs_mttdl
http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance
http://blogs.sun.com/relling/entry/a_story_of_two_mttdl
http://opensolaris.org/jive/thread.jspa?threadID=65564#255257

-- 
Mike Gerdts
http://mgerdts.blogspot.com/