On Monday June 25, [EMAIL PROTECTED] wrote:
> Is there any way for the RAID code to be smarter when deciding 
> about those event counters? Does it have any chance (theoretically)
> to _know_ that it shouldn't use the drive with event count 28?

My current thinking is that once a raid array becomes unusable - in the
case of raid5, this means two failures - the array should immediately
be marked read-only, including the superblocks.   Then if you ever
manage to get enough drives together to form a working array, it will
start working again, and if not, it won't really matter whether the
superblock was updated or not.
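The idea can be sketched as follows. This is not the md driver code, just a toy model (the Array class, device names, and event numbers are all illustrative) showing why freezing superblock updates after the second failure keeps the surviving event counters consistent for a later reassembly:

```python
# Toy model: raid5-style array that goes read-only (superblocks frozen)
# once it becomes unusable, i.e. after a second drive failure.

class Array:
    def __init__(self, disks, events=0):
        self.events = dict.fromkeys(disks, events)  # disk -> event count
        self.failed = set()
        self.read_only = False

    def bump_events(self):
        if self.read_only:
            return                          # superblocks frozen, no updates
        for d in self.events:
            if d not in self.failed:
                self.events[d] += 1

    def fail(self, disk):
        self.failed.add(disk)
        if len(self.failed) >= 2:           # raid5: two failures = unusable
            self.read_only = True           # mark read-only immediately
        else:
            self.bump_events()              # record the config change

a = Array(["sda", "sdb", "sdc"], events=28)
a.fail("sda")        # first failure: survivors bump to 29
a.fail("sdb")        # second failure: array frozen read-only
a.bump_events()      # no-op; surviving counters stay consistent at 29
```

Because nothing is written after the array becomes unusable, the drives that were in sync at the moment of failure all carry the same event count, so a later assembly can trust them as a group.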



> And even if that can't be done automatically, what about a small
> utility for the admin where he can give some advise to support 
> the RAID code on those decisions?
> Will mdctl have this functionality? That would be great!

"mdctl --assemble" will have a "--force" option to tell it to ignore
event numbers and assemble the array anyway.  This could result in
data corruption if you include an old disc, but would be able to get
you out of a tight spot.  Of course, once the above change goes into
the kernel it shouldn't be necessary.
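The decision "--force" changes can be sketched like this. The function and disk tuples below are illustrative, not mdctl's actual internals; the real assembly logic checks much more than event counts:

```python
# Sketch of the assemble decision: normally all members must agree on
# the (newest) event count; --force ignores that and takes the freshest
# disks anyway, at the risk of corruption from a stale member.

def assemble(disks, required, force=False):
    """disks: list of (name, event_count). Return member names or None."""
    newest = max(e for _, e in disks)
    fresh = [n for n, e in disks if e == newest]
    if len(fresh) >= required:
        return fresh
    if force:
        # --force: rank by event count and take the best candidates,
        # even though some are stale.
        ranked = sorted(disks, key=lambda d: d[1], reverse=True)
        return [n for n, _ in ranked[:required]]
    return None

disks = [("sda", 30), ("sdb", 30), ("sdc", 28)]
print(assemble(disks, required=3))              # None: sdc is stale
print(assemble(disks, required=3, force=True))  # assembles anyway
```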

> > 
> Hm, does the RAID code disable a drive on _every_ error condition?
> Isn't there a distinction between, let's say, "soft errors" and "hard
> errors"?
> (I have to admit I don't know the internals of Linux device drivers
> enough to answer that question)
> Shouldn't the RAID code leave a drive which reports "soft errors"
> in the array and disable drives with "hard errors" only?

A Linux block device doesn't report soft errors. There is either
success or failure.  The driver for the disc drive should retry any
soft errors and only report an error up through the block-device layer
when it is definitely hard.
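The layering can be sketched as below. The retry count and helper names are illustrative, not anything from an actual Linux driver; the point is that transient errors never escape the driver:

```python
# Sketch: the low-level driver retries transient ("soft") errors itself
# and only propagates a failure to the block layer once it is hard.

MAX_RETRIES = 5          # illustrative; real drivers vary

def driver_read(read_sector, sector):
    """Return data, or raise IOError only when the error is hard."""
    last = None
    for _ in range(MAX_RETRIES):
        try:
            return read_sector(sector)      # may fail transiently
        except IOError as e:
            last = e                        # soft error: retry silently
    raise last                              # hard error: report upward

# A device that fails twice and then succeeds: the upper layer
# sees only success.
attempts = {"n": 0}
def flaky(sector):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise IOError("transient")
    return b"data"

print(driver_read(flaky, 0))
```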

Arguably the RAID layer should catch read errors and try to get the
data from elsewhere and then re-write over the failed read, just
in case it was a single block error.
But a write error should always be fatal and fail the drive. I cannot
think of any other reasonable approach.
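For raid5 that recovery amounts to XOR-reconstructing the block from the rest of the stripe plus parity, then writing it back. A minimal sketch (not the actual md code path; the stripe layout here is simplified to one parity block per stripe):

```python
# Sketch: recover an unreadable data block from the other data blocks
# XORed with parity, then re-write it over the failed block.

def xor(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def read_with_recovery(stripe, parity, failed_idx):
    """stripe: list of data blocks; failed_idx could not be read."""
    others = [b for i, b in enumerate(stripe) if i != failed_idx]
    rebuilt = xor(others + [parity])
    stripe[failed_idx] = rebuilt    # re-write over the failed read
    return rebuilt

data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = xor(data)
expected = data[1]
data[1] = None                      # simulate a single-block read error
recovered = read_with_recovery(data, parity, 1)
print(recovered == expected)
```

A write error gets no such second chance: there is nowhere else to put the data, so failing the drive is the only safe response.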

> 
> In that case, the filesystem might have been corrupt, but the array
> would have been re-synced automatically, wouldn't it?

yes, and it would have if it hadn't collapsed in a heap while trying :-(

> 
> >  But why did the filesystem ask for a block that was out of range?
> >  This is the part that I cannot fathom.  It would seem as though the
> >  filesystem got corrupt somehow.  Maybe an indirect block got replaced
> >  with garbage, and ext2fs believed the indirect block and went seeking
> >  way off the end of the array.  But I don't know how the corruption
> >  happened.
> > 
> Perhaps the read errors from the drive triggered that problem?

They shouldn't, but seeing as I don't know where the corruption came
from, and I'm not even 100% confident that there was corruption, maybe
they could.

The closest I can come to a workable scenario is that maybe some
parity block had the wrong data.  Normally this wouldn't be noticed,
but when you have a failed drive you have to use the parity to
calculate the value of a missing block, and bad parity would make this
block bad.  But I cannot imagine how you would have a bad parity
block.  After any unclean shutdown the parity should be recalculated.
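The scenario can be made concrete with XOR parity (the block values here are of course made up). With a full stripe, wrong parity is invisible because reads come straight from the data blocks; once a drive is missing, reconstruction routes through the parity and silently produces a bad block:

```python
# Sketch: a stale/wrong parity block is harmless until the array is
# degraded, at which point reconstruction yields bad data silently.

def xor(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

data = [b"\xaa", b"\xbb", b"\xcc"]
good_parity = xor(data)
bad_parity = b"\x00"                # stale or corrupted parity

# Healthy array: reads never touch parity, so nothing is noticed.
healthy_read = data[1]

# Degraded array (drive 1 lost): the block must come from parity.
rebuilt_good = xor([data[0], data[2], good_parity])
rebuilt_bad = xor([data[0], data[2], bad_parity])
print(rebuilt_good == b"\xbb")      # correct parity: block recovered
print(rebuilt_bad == b"\xbb")       # bad parity: silently wrong block
```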

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]