On Monday March 13, [EMAIL PROTECTED] wrote:
> Hi all,
>
> I just experienced some kind of lockup accessing my 8-drive raid5
> (2.6.16-rc4-mm2). The system has been up for 16 days running fine, but
> now processes that try to read the md device hang. ps tells me they are
> all sleeping in get_active_stripe. There is nothing in the syslog, and I
> can read from the individual drives fine with dd. mdadm says the state
> is "active".

Hmmm... That's sad. That's going to be very hard to track down.

If you could
   echo t > /proc/sysrq-trigger
and send me the dump that appears in the kernel log, I would
appreciate it. I doubt it will be very helpful, but it is the best
bet I can come up with.

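A minimal sketch of capturing that dump in one step, in case it is useful. The function name, the default output file, and the overridable trigger path are hypothetical conveniences, not anything from md or mdadm; it assumes a root shell and that `dmesg` is available. On a busy system the dump may be larger than the kernel log buffer, so the saved file might be incomplete and the syslog copy may be the better source.

```shell
# Hypothetical helper: ask the kernel to dump all task states (sysrq 't')
# and save the kernel log afterwards.
#   $1 = output file (default: sysrq-t.log, an arbitrary name)
#   $2 = trigger path (default: /proc/sysrq-trigger; overridable for dry runs)
dump_task_states() {
    out=${1:-sysrq-t.log}
    trigger=${2:-/proc/sysrq-trigger}

    # Writing 't' to the trigger requests the task-state dump.
    echo t > "$trigger" || return 1

    # Save whatever is currently in the kernel ring buffer.
    dmesg > "$out"
}
```

Run as root, something like `dump_task_states /tmp/md-hang.log` would leave the dump in that file, ready to attach to a reply.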
>
> I'm not sure what to do now. Is it safe to try to reboot the system or
> could that cause the device to get corrupted if it's hung in the middle
> of some important operation?
You could try increasing the size of the stripe cache
   echo 512 > /sys/block/mdX/md/stripe_cache_size
(choose an appropriate 'X').
Maybe check the content of
   /sys/block/mdX/md/stripe_cache_active
as well.
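
The two sysfs steps above can be sketched as one small helper. This is only an illustration: the function name and the optional sysfs-root argument are hypothetical, and it assumes a root shell and a raid5 array that exposes these two files. Reading stripe_cache_active before changing the size keeps a record of the state at the time of the hang.

```shell
# Hypothetical helper: show the stripe cache counters for an md array and
# optionally set a new cache size.
#   $1 = md device name, e.g. md0
#   $2 = new stripe_cache_size (optional; empty means "just report")
#   $3 = sysfs root (default: /sys; overridable for dry runs)
stripe_cache() {
    dev=$1
    new_size=${2:-}
    dir=${3:-/sys}/block/$dev/md

    if [ ! -r "$dir/stripe_cache_size" ]; then
        echo "no stripe cache found for $dev" >&2
        return 1
    fi

    # Report the current values before touching anything.
    echo "stripe_cache_size=$(cat "$dir/stripe_cache_size")"
    echo "stripe_cache_active=$(cat "$dir/stripe_cache_active")"

    # Resize only when a new size was given.
    if [ -n "$new_size" ]; then
        echo "$new_size" > "$dir/stripe_cache_size" || return 1
        echo "stripe_cache_size=$(cat "$dir/stripe_cache_size") (after resize)"
    fi
}
```

For example, `stripe_cache md0` only reports the values, while `stripe_cache md0 512` reports them and then applies the size suggested above.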
Other than that, just reboot. The raid5 will do a resync, but the
data should be fine.
NeilBrown