Hi All!
Lately, we have been experiencing some serious problems with our Linux servers
using RAID0 on Adaptec 2940U2W. The machines, which are under quite some load,
suddenly die and must be cold-restarted. When they come back online,
there is no sign in any logfile of anything having gone awry. They just plunge into
deep-freeze, zero-Kelvin mode. *argh*
Currently, the machines are running Linux 2.2.14 with the latest RAID patches
(Mingo's raid-2.2.14-B1-patch), but we've seen the problem under 2.2.13 as
well.
As I said, there's nothing in the log files that would indicate what's wrong.
Installing the software watchdog kernel module/watchdogd didn't help either.
The situation is getting somewhat embarrassing, as we've been pushing pretty
hard towards Linux. We're considering moving all servers to non-RAID
configurations, but we'd really prefer RAID0.
I've also noticed a few other postings about problems/hangs with 2940/AIC7xxx
on Linux RAID, so it seems we're not alone with this problem.
Does anyone have any information on the status of this? Have the
bug(s) been identified? Is there a solution (other than to stop using RAID)?
Hardware setup: RH Linux 6.1/2.2.14/raid-2.2.14-B1 on dual PIII motherboards
(ASUS P2B-DS) and U2W SCSI IBM disks, 512+ MB RAM.
/m