I've been using RAID 0.90 with the 2.0 kernel on a bunch of production
boxes (RAID5) and the disk failure handling and reconstruction have
worked fine, both in tests and (once) in real life when a disk failed.
I'm now trying 2.2.14 + raid-2.2.14-B1 (as shipped in the Red Hat 6.x
kernel) and have come across both a problem with simulating a disk
failure and an apparent bug in RAID error handling:
------------------------------ cut here ------------------------------
SCSI disk error : host 0 channel 0 id 8 lun 0 return code = 28000002
[valid=0] Info fld=0x0, Current sd08:61: sense key Not Ready
Additional sense indicates Logical unit not ready, initializing command required
scsidisk I/O error: dev 08:61, sector 265176
md: bug in file raid5.c, line 659
**********************************
* <COMPLETE RAID STATE PRINTOUT> *
**********************************
------------------------------ cut here ------------------------------
followed by a detailed dump of the RAID superblock information. After
that, any commands (including raidhotremove/raidhotadd) which try to
touch the RAID array hang in uninterruptible sleep and so do any
processes which were accessing the RAID filesystem at the time of the
failure. The above was triggered by my simulation of a disk failure
which I did by spinning the disk down with the SCSI_IOCTL_STOP_UNIT
ioctl.
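For anyone wanting to reproduce the spin-down, a minimal user-space
sketch of issuing SCSI_IOCTL_STOP_UNIT follows. It assumes the glibc
header <scsi/scsi_ioctl.h>; the function name spin_down and any device
path passed to it are illustrative, and this is not necessarily the
program used for the test above.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <scsi/scsi_ioctl.h>    /* SCSI_IOCTL_STOP_UNIT */

/*
 * Ask the SCSI disk behind `path` to spin down.
 * `path` is whatever device node you want to "fail" (hypothetical here).
 * Returns 0 on success, -1 if the open or the ioctl failed.
 */
int spin_down(const char *path)
{
    int fd, rc;

    fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0) {
        perror(path);
        return -1;
    }

    /* The ioctl takes no argument; a non-zero result means failure. */
    rc = ioctl(fd, SCSI_IOCTL_STOP_UNIT, NULL);
    if (rc != 0)
        perror("SCSI_IOCTL_STOP_UNIT");

    close(fd);
    return rc != 0 ? -1 : 0;
}
```

Call it as spin_down("/dev/...") with the device node of the disk under
test; it needs root and will really stop the drive, so only point it at
a disk you intend to fail.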
That leads to the second problem: the reason I used that method of
simulating a disk failure was that the old method:
echo "scsi remove-single-device 0 0 3 0" > /proc/scsi/scsi
has stopped working with kernel 2.2. strace shows that the write()
fails with errno EBUSY. linux/drivers/scsi/scsi.c shows that this
is because the access_count of the Scsi_Device structure is non-zero.
The equivalent 2.0 source doesn't seem to show any semantic changes,
and yet the same command works fine under 2.0. Can anyone help?
Otherwise this server is going to have to run without the added
reliability of RAID5, which would be disappointing.
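The failing write() can also be observed without strace. The sketch
below writes one command line to a control file and hands back the
errno, so an EBUSY shows up directly. The function name
scsi_proc_command is made up for illustration; only the
/proc/scsi/scsi path and the command string come from the report above.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a single command string to a /proc/scsi/scsi-style control file.
 * Returns 0 on success, or a negative errno value (e.g. -EBUSY) if the
 * open() or the write() failed.
 */
int scsi_proc_command(const char *path, const char *cmd)
{
    int fd, err = 0;

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -errno;

    /* The kernel parses the whole command from one write() call. */
    if (write(fd, cmd, strlen(cmd)) < 0)
        err = errno;

    close(fd);
    return -err;
}
```

For the case described here the call would be
scsi_proc_command("/proc/scsi/scsi", "scsi remove-single-device 0 0 3 0\n"),
and a return value of -EBUSY corresponds to the non-zero access_count
check in scsi.c.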
As an act of desperation I even wrote a little kernel module to change
the access_count back to zero and then ran the
"...remove-single-device..." command again. This time the device did
get removed properly; RAID noticed the removal and went properly into
degraded mode. Unfortunately, once again, all processes accessing the
RAID filesystem, and then any raidhotadd/raidhotremove/umount
commands, hung in uninterruptible sleep. Nobody on this mailing list
or anywhere else I can find with web searches seems to have had this
problem, so I'm at a loss what to do. Any help would be gratefully
received. In case it matters, this is on an SMP system (2 CPUs) and
the disks are all SCSI disks on a bus with an Adaptec 7899 adapter,
using the aic7xxx driver 5.1.72/3.2.4. In case anyone wants the kernel
module to alter a SCSI device access_count, here it is:
------------------------------ cut here ------------------------------
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/blk.h>
#include "/usr/src/linux/drivers/scsi/scsi.h"
#include "/usr/src/linux/drivers/scsi/hosts.h"

/* Address of the target device, plus the adjustment to apply. */
static int host = 0;
static int channel = 0;
static int id = 0;
static int lun = 0;
static int delta = 0;    /* 0 = just print access_count */

MODULE_PARM(host, "i");
MODULE_PARM(channel, "i");
MODULE_PARM(id, "i");
MODULE_PARM(lun, "i");
MODULE_PARM(delta, "i");

int init_module(void)
{
    struct Scsi_Host *hba;
    Scsi_Device *scd;

    printk("scsiaccesscount starting\n");

    /* Find the host adapter with the requested host number. */
    for (hba = scsi_hostlist; hba; hba = hba->next)
        if (hba->host_no == host)
            break;
    if (!hba)
        return -ENODEV;

    /* Find the device on that host by channel/id/lun. */
    for (scd = hba->host_queue; scd; scd = scd->next)
        if (scd->channel == channel && scd->id == id && scd->lun == lun)
            break;
    if (!scd)
        return -ENODEV;

    printk("access_count is %d\n", scd->access_count);
    if (delta) {
        scd->access_count += delta;
        printk("changed access_count to %d\n", scd->access_count);
    }

    /* Always fail so that the module never stays loaded. */
    return -EIO;
}
------------------------------ cut here ------------------------------
Use it as
insmod scsiaccesscount.o host=0 channel=0 id=3 lun=0
to show the access count for ID 3 on host 0, channel 0, and
insmod scsiaccesscount.o host=0 channel=0 id=3 lun=0 delta=-1
to subtract one from the access_count. The insmod itself always
reports failure, because init_module deliberately returns -EIO; the
output appears in the kernel log. Obviously this is just for
debugging and may not be safe to do at all (and indeed wasn't in my
case).
--Malcolm
--
Malcolm Beattie <[EMAIL PROTECTED]>
Unix Systems Programmer
Oxford University Computing Services