Per Jessen wrote:
>
> I have a RAID setup, 3 Compaq 4GB drives running off an Adaptec 2940UW.
> Kernel 2.2.18 with RAID-patches etc.
>
> I have been trying out various options, doing some stress-testing etc.,
> and I have now arrived at the following situation that I cannot explain:
>
> when running the 3 drives in a RAID5 config, one of the drives (always the
> same one) will always fail during heavy IO or during a resync phase. It
> appears to produce one IO error (judging from messages in the log), upon
> which it is promptly removed from the array.
> I can then hotremove the failing drive, then hotadd it - and resync starts,
> and quite often completes. This scenario is consistently repeatable.
>
> So, it would seem that this one drive has a hardware problem. So I ran badblocks
> with write-check on it a couple of times - came out 100% clean.

I've had similar problems. Same number of disks, but a different size and
brand for both the drives and the host adapter. A couple of people have told
me that it's a bus or hardware problem, but the symptoms are not quite
consistent with that diagnosis. It may just be my paranoia, but the failure
pattern reminds me more of a race condition than anything else.

> I then built a RAID0 array instead - and started driving lots of IO on it -
> it's still running - not a problem. Filled up the array, still no probs.
>
> So, except when the drive is in a RAID5 config, it seems ok.
>
> Any suggestions? I would like to confirm whether or not the
> drive has a problem.

Eyeball the raid5.c and md.c code, looking for potential race conditions?
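
For what it's worth, here is a rough sketch of why RAID5 exercises the
drives differently from RAID0. A RAID0 request only touches the disk that
holds the chunk, while a RAID5 write or resync has to touch several members
of the same stripe to keep the XOR parity consistent. This is only an
illustration in Python, not the kernel code; the chunk contents and the
xor_blocks helper are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical 3-disk stripe: two data chunks plus one parity chunk.
data0 = b"chunk on disk 0!"
data1 = b"chunk on disk 1!"
parity = xor_blocks([data0, data1])

# If disk 1 is kicked out, its chunk is rebuilt from the two survivors.
rebuilt = xor_blocks([data0, parity])
assert rebuilt == data1

# A small write must update parity too: new_parity = old_parity ^ old ^ new.
new_data0 = b"rewritten chunk!"
new_parity = xor_blocks([parity, data0, new_data0])
assert new_parity == xor_blocks([new_data0, data1])
```

So every stripe update or rebuild involves more than one disk at a time,
which is the kind of access pattern a plain RAID0 or badblocks run would
not reproduce.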
[EMAIL PROTECTED]