Re: Proposed RAID5 design changes.

Dan Jones Thu, 15 Mar 2001 14:16:58 -0800
Max TenEyck Woodbury wrote:
> 
> I'm not too happy with the linux RAID5 implementation. In my
> opinion, a number of changes need to be made, but I'm not sure how to
> make them or get them accepted into the official distribution if I did
> make the changes.
> 
> The changes I think should be made in order of priority are:
> 

Good topic.

Meta comment: Error recovery is a challenging topic. A consistent approach
requires coordination between the different levels, i.e. drive firmware,
driver, and mid-layer. I don't pretend to know enough about the kernel
architecture to recommend where everything should go, but it seems that
there should be some organizing principles. We should have a recommended
default for the SCSI error recovery mode page, so the driver can make some
simplifying assumptions about what an error means, i.e. how much the drive
has already tried. Then the question becomes how closely the error
recovery action should be tailored to the SCSI sense key and status
qualifiers. I realize that this can be complicated by vendor uniqueness.
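
To make the "tailored by sense key" idea concrete, here is a rough sketch
of a policy table. The sense key numbers are the standard SCSI values; the
Action names, the choice of action for each key, and the default for
unlisted keys are purely my assumptions for illustration, not anything the
md driver or the SCSI mid-layer does today.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical recovery actions, named for illustration only.
    NONE = "none"
    LOG_ONLY = "log only"
    RETRY = "retry"
    RECONSTRUCT_AND_REWRITE = "reconstruct and rewrite"
    FAIL_DRIVE = "fail drive"


# Keys are standard SCSI sense key values; the actions are an assumed policy.
POLICY = {
    0x0: Action.NONE,                     # NO SENSE
    0x1: Action.LOG_ONLY,                 # RECOVERED ERROR: drive already coped
    0x2: Action.RETRY,                    # NOT READY
    0x3: Action.RECONSTRUCT_AND_REWRITE,  # MEDIUM ERROR
    0x4: Action.FAIL_DRIVE,               # HARDWARE ERROR
    0x6: Action.RETRY,                    # UNIT ATTENTION
    0xB: Action.RETRY,                    # ABORTED COMMAND
}


def recovery_action(sense_key: int) -> Action:
    """Look up the action for a sense key.

    Unlisted keys (including vendor-specific ones) fall back to a retry,
    which is an assumption of this sketch.
    """
    return POLICY.get(sense_key, Action.RETRY)
```

A real table would also have to look at the additional sense information,
which is where the vendor uniqueness starts to hurt.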

> 1) Read and write errors should be retried at least once before kicking
>    the drive out of the array.
> 

This doesn't seem unreasonable on the face of it.
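
As a minimal sketch of what (1) amounts to, assuming a hypothetical
read_block callable that raises IOError on failure (the name and the
signature are mine, not kernel interfaces):

```python
def read_with_retry(read_block, block_no, retries=1):
    """Try read_block(block_no), retrying up to `retries` times on IOError.

    Returns the data on success, or None if every attempt failed. Only at
    that point would the caller consider stronger measures such as
    reconstruction or failing the drive.
    """
    for _attempt in range(retries + 1):
        try:
            return read_block(block_no)
        except IOError:
            continue
    return None
```

The same shape would apply to writes; the retry count would presumably
depend on how much the drive itself has already retried.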

> 2) On more persistent read errors, the failed block (or whatever unit is
>    represented by a buffer) should be reconstructed from the parity set,
>    and the buffer marked dirty so good data is written back to the disk
>    with the error.
> 

Generally, it is a good idea to try to rewrite the sector (assuming the
original data can be recovered via another method). If we were really good,
we would run a short pattern test (with changing random patterns) on the
sector before the rewrite to determine whether sparing would be a good idea.
Micro defects or other write glitches can cause a persistent read error that
does not repeat after the data is rewritten. The other school of thought is
that good systems & drives should not get many errors for any reason, so
reconstruct the data, spare the sector, and get on with life.
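
Here is a rough sketch of the first approach: rebuild the block from the
rest of the stripe (RAID5 parity is a plain XOR), pattern-test the sector,
then either rewrite it or spare it. The write_sector, read_sector and
spare_sector callables, and the default of three rounds, are hypothetical
stand-ins I made up for the example.

```python
import os


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, b in enumerate(blk):
            out[i] ^= b
    return bytes(out)


def reconstruct(surviving_blocks):
    """The missing block is the XOR of every other block in the stripe
    (the remaining data blocks plus the parity block)."""
    return xor_blocks(surviving_blocks)


def sector_passes_pattern_test(write_sector, read_sector, size, rounds=3):
    """Write a fresh random pattern each round and check it reads back."""
    for _ in range(rounds):
        pattern = os.urandom(size)
        write_sector(pattern)
        if read_sector() != pattern:
            return False
    return True


def repair_sector(surviving_blocks, write_sector, read_sector, spare_sector):
    """Rewrite the sector in place if it tests good, otherwise spare it."""
    good = reconstruct(surviving_blocks)
    if sector_passes_pattern_test(write_sector, read_sector, len(good)):
        write_sector(good)
        return "rewritten"
    spare_sector(good)
    return "spared"
```

The second school of thought simply skips the pattern test and always
takes the sparing branch.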

> 3) Drives should not be kicked out of the array unless they are having
>    really persistent problems. I've an idea on how to define 'really
>    persistent' but it requires a bit of math to explain, so I'll only
>    go into it if someone is interested.
> 

Depends upon the error. Although I like error recovery and all that entails,
I also know that error recovery code requires very thoughtful design, is not 
exercised very often, and is often buggy. Otherwise, I love it.

> Then there are two changes that might improve recovery performance:
> 
> 4) If the drive being kicked out is not totally inoperable and there is
>    a spare drive to replace it, try to copy the data from the failing
>    drive to the spare rather than reconstructing the data from all the
>    other disks. Fall back to full reconstruction if the error rate gets
>    too high.
> 
> 5) When doing (4) use the SCSI 'copy' command if the drives are on the
>    same bus, and the host adapter and driver supports 'copy'. However,
>    this should be done with caution. 'copy' is not generally used and
>    any number of undetected firmware bugs might make it unreliable.
>    An additional category may need to be added to the device black list
>    to flag devices that can not do 'copy' reliably.
> 
Using 'copy' comes under the heading of dangerous for the reasons Max
mentions. It should never be the default.
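
Leaving 'copy' aside, the fallback logic in (4) could look roughly like
the sketch below. It uses ordinary reads and writes through the host, not
the SCSI 'copy' command. The callables and the max_errors threshold are
hypothetical; Max has not said what error rate he would consider "too
high".

```python
def rebuild_to_spare(n_blocks, read_failing, reconstruct_block, write_spare,
                     max_errors=8):
    """Fill the spare, preferring a direct copy from the failing drive.

    A block that fails to read is reconstructed from the other drives.
    Once more than max_errors reads have failed, the failing drive is no
    longer read at all and every remaining block is reconstructed.
    Returns the number of blocks that had to be reconstructed.
    """
    errors = 0
    reconstructed = 0
    for n in range(n_blocks):
        data = None
        if errors <= max_errors:
            try:
                data = read_failing(n)
            except IOError:
                errors += 1
        if data is None:
            data = reconstruct_block(n)
            reconstructed += 1
        write_spare(n, data)
    return reconstructed
```

The appeal is that the common case touches only two drives instead of
the whole array.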

> [EMAIL PROTECTED]
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to [EMAIL PROTECTED]

-- 
Dan Jones, Manager, Storage Products          VA Linux Systems
V:(510)687-6737 F:(510)683-8602               47071 Bayside Parkway
[EMAIL PROTECTED]                            Fremont, CA 94538