Hi Neil,

Thanks for your reply!

Neil Brown wrote:
> 
[...]
> 
> I'm not sure that  you want to know this, but it looks like you might
> have been able to recover your data.... though it is only a "might".
> 
Yes, I already figured that out (with help from Martin Bene).

> > Jun 19 09:18:23 wien kernel: (read) scsi/host0/bus0/target0/lun0/part4's sb offset: 16860096 [events: 00000024]
> > Jun 19 09:18:23 wien kernel: (read) scsi/host0/bus0/target1/lun0/part4's sb offset: 16860096 [events: 00000024]
> > Jun 19 09:18:23 wien kernel: (read) scsi/host0/bus0/target2/lun0/part4's sb offset: 16860096 [events: 00000023]
> > Jun 19 09:18:23 wien kernel: (read) scsi/host0/bus0/target3/lun0/part4's sb offset: 16860096 [events: 00000028]
> 
> The reason that this array couldn't restart was that the 4th drive had
> the highest event count and it was alone in this.  It didn't even have
> any valid data!!
> Had you unplugged this drive and booted, it would have tried to
> assemble an array out of the first two (event count 24).  This might
> have worked (though it might not, read on).
> 
Is there any way for the RAID code to be smarter when deciding
about those event counters? Does it have any chance (theoretically)
to _know_ that it shouldn't use the drive with event count 28?
And even if that can't be done automatically, what about a small
utility that lets the admin give the RAID code some advice on
those decisions?
Will mdctl have this functionality? That would be great!
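
For illustration, the kind of admin override I have in mind, written
as a hypothetical mdctl-style invocation (the option names are made
up by me, not an existing interface):

```shell
# Hypothetical: force-assemble the array from an explicit member list,
# telling the RAID code to ignore the stale superblock on the fourth
# drive instead of trusting its bogus event count
mdctl --assemble --force /dev/md0 \
    /dev/scsi/host0/bus0/target0/lun0/part4 \
    /dev/scsi/host0/bus0/target1/lun0/part4 \
    /dev/scsi/host0/bus0/target2/lun0/part4
```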

> Alternately, you could have created a raidtab which said that the
> third drive was failed, and then run "mkraid"...
> 
> mdctl, when it is finished, should be able to make this all much
> easier.
> 
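For future readers of the archive, such a raidtab might look roughly
like this (a sketch only -- the chunk size, parity algorithm and device
list here are my assumptions and would have to match the original
array exactly before running mkraid):

```
raiddev /dev/md0
    raid-level              5
    nr-raid-disks           4
    nr-spare-disks          0
    persistent-superblock   1
    chunk-size              32
    parity-algorithm        left-symmetric
    device  /dev/scsi/host0/bus0/target0/lun0/part4
    raid-disk 0
    device  /dev/scsi/host0/bus0/target1/lun0/part4
    raid-disk 1
    device  /dev/scsi/host0/bus0/target2/lun0/part4
    failed-disk 2
    device  /dev/scsi/host0/bus0/target3/lun0/part4
    raid-disk 3
```
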
> But what went wrong?  I don't know the whole story but:
> 
> - On the first error, the drive was disabled and reconstruction was
>   started.
> - On the second error, the reconstruction was inappropriately
>   interrupted.  This is an error that I will have to fix in 2.4.
>   However, it isn't really a fatal error.
> - Things were then going fairly OK, though noisy, until:
> 
> > Jun 19 09:10:07 wien kernel: attempt to access beyond end of device
> > Jun 19 09:10:07 wien kernel: 08:04: rw=0, want=1788379804, limit=16860217
> > Jun 19 09:10:07 wien kernel: dev 09:00 blksize=1024 blocknr=447094950 sector=-718207696 size=4096 count=1
> 
>  For some reason, it tried to access well beyond the end of one of the
>  underlying drives.  This caused that drive to fail.  This relates to
>  the subsequent message:
> 
> > Jun 19 09:10:07 wien kernel: raid5: restarting stripe 3576759600
> 
>  which strongly suggests that the filesystem actually asked the raid5
>  array for a block that was well out of range.
>  In 2.4, this will be caught before the request gets to raid5.  In 2.2
>  it isn't.  The request goes on to raid5, raid5 blindly passes a bad
>  request down to the disc.  The disc reports an error, and raid5
>  thinks the disc has failed, rather than realising that it never
>  should have made such a silly request.
> 
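The numbers in the log actually check out: the "negative" sector is
just the stripe number read back through a signed 32-bit integer.
A quick sanity check:

```shell
# The stripe number 3576759600 from the raid5 message doesn't fit in
# a signed 32-bit int (max 2147483647), so the kernel printed it
# wrapped around:
echo $(( 3576759600 - 4294967296 ))
# prints -718207696, matching "sector=-718207696" in the log above
```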
Hm, does the RAID code disable a drive on _every_ error condition?
Isn't there a distinction between, let's say, "soft errors" and "hard
errors"?
(I have to admit I don't know the internals of Linux device drivers
enough to answer that question)
Shouldn't the RAID code leave a drive which reports "soft errors"
in the array, and disable only drives with "hard errors"?

In that case, the filesystem might have been corrupt, but the array
would have been re-synced automatically, wouldn't it?

>  But why did the filesystem ask for a block that was out of range?
>  This is the part that I cannot fathom.  It would seem as though the
>  filesystem got corrupt somehow.  Maybe an indirect block got replaced
>  with garbage, and ext2fs believed the indirect block and went seeking
>  way off the end of the array.  But I don't know how the corruption
>  happened.
> 
Perhaps the read errors from the drive triggered that problem?

>  Had you known enough to restart the array from the two apparently
>  working drives, and then run fsck, it might have fixed things enough
>  to keep going.  Or it might not, depending on how much corruption
>  there was.
> 
>  So, Summary of problems:
>   1/ md responds to a failure on a known-failed drive inappropriately.
>      This shouldn't be fatal but needs fixing.
>   2/ md isn't thoughtful enough about updating the event counter on
>      superblocks and can easily leave an array in an unbuildable
>      state.  This needs to be fixed.  It's on my list...
>   3/ raid5 responds to a request for an out-of-bounds device address
>      by passing out-of-bounds addresses on to the drives, and then
>      thinking that those drives have failed.
>      This is fixed in 2.4.
>   4/ Something caused some sort of filesystem corruption.  I don't
>      know what.
> 
At least the incident produced enough log messages to make a well
documented test case for those problems ;-)
 
Thanks again,

- andreas

-- 
Andreas Haumer                     | mailto:[EMAIL PROTECTED]
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71