On 2005-08-18T15:28:41, Neil Brown <[EMAIL PROTECTED]> wrote:
> If we want to mirror a single drive in a raid5 array, I would really
> like to do that using the raid1 personality.
> e.g.
> suspend io
> remove the drive
> build a raid1 (with no superblock) using the drive.
> add that back into the array
> resume io.
I hate to say this, but this is something where the Device Mapper
framework, with its suspend/resume options and the ability to change
the mapping atomically, has an advantage.
Maybe copying some of the ideas would be useful.
Freeze, reconfigure one disk to be RAID1, resume - all IO goes on while
at the same time said RAID1 re-mirrors to the new disk. Repeat with a
removal later.
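
A toy sketch of the freeze / reconfigure / resume sequence, in Python. This is only a model of the idea, not Device Mapper code: ToyMappedDevice, single_disk and raid1_pair are hypothetical names invented for the illustration, and "I/O" is just a string handed to a function.

```python
from collections import deque


class ToyMappedDevice:
    """Toy device whose backing target may be swapped only while suspended.

    Hypothetical illustration; it mimics the idea, not the Device Mapper API.
    """

    def __init__(self, target):
        self.target = target      # callable that services one request
        self.suspended = False
        self.queued = deque()     # requests held back while suspended
        self.completed = []

    def submit(self, request):
        if self.suspended:
            self.queued.append(request)   # hold I/O instead of failing it
        else:
            self.completed.append(self.target(request))

    def suspend(self):
        self.suspended = True

    def load(self, new_target):
        if not self.suspended:
            raise RuntimeError("swap the mapping only while suspended")
        self.target = new_target          # single assignment: old or new, never half

    def resume(self):
        self.suspended = False
        while self.queued:                # replay held I/O against the new mapping
            self.completed.append(self.target(self.queued.popleft()))


def single_disk(request):
    return ("disk", request)


def raid1_pair(request):
    return ("raid1", request)


dev = ToyMappedDevice(single_disk)
dev.submit("w0")
dev.suspend()
dev.submit("w1")        # arrives during reconfiguration and is queued
dev.load(raid1_pair)
dev.resume()
dev.submit("w2")
print(dev.completed)
```

The point of the model is that callers never see an error during the swap: requests issued while suspended are simply delayed and then served by the new mapping.
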
> To handle read failures, I would like the first step to be to re-write
> the failed block. I believe most (all?) drives will relocate the
> block if a write cannot succeed at the normal location, so this will
> often fix the problem.
Yes. This would be highly useful.
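
A minimal sketch of the "rewrite the failed block" step, assuming a single-parity stripe where any one block is the XOR of the others. ToyDisk and read_with_repair are made-up names, and the model assumes that a successful write clears the bad spot (standing in for the drive relocating the sector).

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ToyDisk:
    """Hypothetical in-memory disk with injectable read errors."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.bad = set()      # block numbers that currently fail on read
        self.rewrites = 0

    def read(self, n):
        if n in self.bad:
            raise OSError(f"read error at block {n}")
        return self.blocks[n]

    def write(self, n, data):
        # Assumption: a successful write makes the block readable again.
        self.blocks[n] = data
        self.bad.discard(n)
        self.rewrites += 1


def read_with_repair(disks, disk_index, n):
    """Read block n; on a read error, rebuild it from the other disks and rewrite it."""
    try:
        return disks[disk_index].read(n)
    except OSError:
        # If a second disk also fails here, the error propagates to the caller.
        others = [d.read(n) for i, d in enumerate(disks) if i != disk_index]
        data = xor_blocks(others)
        disks[disk_index].write(n, data)
        return data


d0, d1 = b"\x01\x02", b"\x10\x20"
parity = xor_blocks([d0, d1])
disks = [ToyDisk([d0]), ToyDisk([d1]), ToyDisk([parity])]
disks[1].bad.add(0)                       # inject a read error
recovered = read_with_repair(disks, 1, 0)
```
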
> A userspace process can then notice an unacceptable failure rate and
> start a mirror/swap process as above.
Agreed. Combined with SMART monitoring, this could provide highly useful
features.
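
One way such a userspace monitor could decide that the failure rate is "unacceptable" is a sliding-window count of corrected read errors. The sketch below is only an illustration: ReadErrorMonitor and its thresholds are hypothetical, and timestamps are passed in explicitly so the logic does not depend on any particular event source.

```python
from collections import deque


class ReadErrorMonitor:
    """Hypothetical policy: too many read errors inside a time window."""

    def __init__(self, max_errors, window_seconds):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self.events = deque()     # timestamps of recent errors, oldest first

    def record_error(self, timestamp):
        """Record one error; return True if a mirror/swap should be started."""
        self.events.append(timestamp)
        # Forget errors that have fallen out of the window.
        while self.events and self.events[0] <= timestamp - self.window_seconds:
            self.events.popleft()
        return len(self.events) > self.max_errors


monitor = ReadErrorMonitor(max_errors=2, window_seconds=60)
decisions = [monitor.record_error(t) for t in (0, 10, 100, 110, 120)]
```

With these example numbers, only the last error (the third within 60 seconds) crosses the threshold.
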
> This possibly doesn't handle the possibility of a write failing very
> well, but I'm not sure what your approach does in that case. Could
> you explain that?
I think a failed write can't really be handled - it might be retried
once or twice, but then the way to proceed is to kick the drive and
rebuild the array.
> It also means that if the raid1 rebuild hits a read-error it cannot
> cope whereas your code would just reconstruct the block from the rest
> of the raid5.
Good point. One way to fix this would be to have a callback to one level
up "Hi, I can't read this section, can you reconstruct and give it to
me?". (Which is a pretty ugly hack.)
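
Roughly what such a callback could look like, as a toy Python sketch. rebuild_mirror and its reconstruct argument are hypothetical; nothing like this interface is implied to exist in md.

```python
def rebuild_mirror(read_block, write_block, block_count, reconstruct):
    """Copy every block to the new mirror half.

    When the source disk cannot deliver a block, ask the layer above
    (via the hypothetical `reconstruct` callback) to supply it instead.
    Returns the list of block numbers that needed reconstruction.
    """
    recovered = []
    for n in range(block_count):
        try:
            data = read_block(n)
        except OSError:
            data = reconstruct(n)     # "can you reconstruct and give it to me?"
            recovered.append(n)
        write_block(n, data)
    return recovered


source = [b"a", None, b"c"]           # None marks an unreadable block
from_parent = {1: b"b"}               # what the parent array could rebuild
target = [None] * len(source)


def read_block(n):
    if source[n] is None:
        raise OSError(f"read error at block {n}")
    return source[n]


def write_block(n, data):
    target[n] = data


needed_help = rebuild_mirror(read_block, write_block, len(source), from_parent.__getitem__)
```
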
However, that would also assume that the data on the disk which _can_ be
read still can be trusted. I'm not sure I'd buy that myself, unverified.
But a periodic background consistency check for RAID might help convince
users that this is indeed the case ;-)
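
For a single-parity layout, the core of such a consistency check is small: XOR all blocks of a stripe (data plus parity) and verify that the result is all zeroes. A minimal sketch, assuming stripes are given as lists of equal-length byte blocks; inconsistent_stripes is a made-up helper name.

```python
def inconsistent_stripes(stripes):
    """Return the indexes of stripes whose blocks do not XOR to zero."""
    bad = []
    for index, stripe in enumerate(stripes):
        acc = bytes(len(stripe[0]))               # all-zero accumulator
        for block in stripe:
            acc = bytes(a ^ b for a, b in zip(acc, block))
        if any(acc):                              # any non-zero byte means a mismatch
            bad.append(index)
    return bad


stripes = [
    [b"\x01", b"\x02", b"\x03"],   # 0x01 ^ 0x02 == 0x03, consistent
    [b"\x01", b"\x02", b"\x00"],   # parity block is wrong
]
mismatches = inconsistent_stripes(stripes)
```

Note that a mismatch only says the stripe is inconsistent; with single parity it cannot tell which block is the wrong one.
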
If you can no longer pro-actively reconstruct the disk because it has
indeed failed, maybe treating it like a failed disk and rebuilding the
array in the "classic" fashion isn't the worst idea, though.
Sincerely,
Lars Marowsky-Brée <[EMAIL PROTECTED]>
--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge" -- Charles Darwin