Trouble replacing drive in array / hot standby

2000-03-27 Thread Stuart Clark

Hi,

I have a few questions about an array I have here, running on a Red Hat 
6.0 distribution with a 2.2.5-22 kernel and raidtools 0.90.

The array has 4 SCSI disks, where one has failed:

# cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sda1[0](F) sdd1[3] sdc1[2] sde1[1] 26627328 blocks level 5, 128k chunk, algorithm 2 [4/3] [_UUU]
unused devices: <none>


I tried to replace sda1 (the drive with SCSI ID 0) with another physical 
drive I have outside the machine (which was once part of this RAID 
array). The replacement drive had SCSI ID 0 too. I played around trying 
to get the array working, but for some reason it would not (sorry, I do 
not have any output from that time). I suspect that the old data on the 
replacement drive may have caused a problem, since putting the faulty 
drive (with its time inconsistencies) back in allowed the array to be 
started up again.
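
(If stale superblock data is the culprit, one option is to wipe the old RAID 
superblock before re-adding the drive. A minimal sketch, assuming the md 0.90 
on-disk layout, where the persistent superblock lives in a 64 KiB reserved 
area at the end of the device, 64 KiB-aligned, so zeroing the start of the 
partition does not remove it. The device name is an example:

#!/bin/sh
# Wipe the stale md 0.90 persistent superblock from an old member partition.
DEV=sda1
# partition size in 1 KiB blocks, from /proc/partitions (major minor #blocks name)
SIZE_KB=`awk -v d=$DEV '$4 == d {print $3}' /proc/partitions`
# superblock offset per the 0.90 layout: size rounded down to 64 KiB, minus 64 KiB
SB_OFFSET_KB=`expr $SIZE_KB / 64 \* 64 - 64`
dd if=/dev/zero of=/dev/$DEV bs=1k seek=$SB_OFFSET_KB count=64
)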

I have a spare disk in this machine which has been added to the array with 
raidhotadd.

I wanted this spare disk to be added automatically as a hot-spare drive, but 
I have been unable to get this working (the relevant lines are now commented 
out in the /etc/raidtab file below).
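
(For comparison, here is how a one-spare raidtab is normally laid out with 
raidtools 0.90 -- a sketch; note nr-spare-disks sits with the other counts 
before any device lines, and whether the spare is picked up also depends on 
it carrying a valid superblock, e.g. after a mkraid or raidhotadd:

raiddev /dev/md0
    raid-level              5
    nr-raid-disks           4
    nr-spare-disks          1
    chunk-size              128
    persistent-superblock   1
    parity-algorithm        left-symmetric
    device                  /dev/sda1
    raid-disk               0
    device                  /dev/sde1
    raid-disk               1
    device                  /dev/sdc1
    raid-disk               2
    device                  /dev/sdd1
    raid-disk               3
    device                  /dev/sdb1
    spare-disk              0
)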

Can anyone give me some insight into what is going on here? Should I 
wipe the partition on the drive that did not work, so that the 
superblock and related information is no longer present? Should I 
seriously consider compiling a 2.2.14 kernel with the latest raidtools 
patch? Should I pull my last hair out of my head?

Kind regards,  Stuart.




# raidstart --version
raidstart v0.3d compiled for md raidtools-0.90



# cat /etc/raidtab
raiddev /dev/md0
raid-level  5
nr-raid-disks   4
chunk-size  128
persistent-superblock   1
parity-algorithm        left-symmetric

# Spare disks for hot reconstruction
#nr-spare-disks  1

device  /dev/sda1
raid-disk   0

device  /dev/sde1
raid-disk   1

device  /dev/sdc1
raid-disk   2

device  /dev/sdd1
raid-disk   3

#device /dev/sdb1
#spare-disk 0




# cat /var/log/dmesg
Based upon Swansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0 for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
Initializing RT netlink socket
Starting kswapd v 1.5
Detected PS/2 Mouse Port.
Serial driver version 4.27 with MANY_PORTS MULTIPORT SHARE_IRQ enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
pty: 256 Unix98 ptys configured
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.9)
Real Time Clock Driver v1.09
RAM disk driver initialized:  16 RAM disks of 4096K size
PIIX: IDE controller on PCI bus 00 dev 38
PIIX: not 100% native mode: will probe irqs later
PIIX: neither IDE port enabled (BIOS)
PIIX: IDE controller on PCI bus 00 dev 39
PIIX: not 100% native mode: will probe irqs later
 ide0: BM-DMA at 0xe800-0xe807, BIOS settings: hda:pio, hdb:pio
 ide1: BM-DMA at 0xe808-0xe80f, BIOS settings: hdc:pio, hdd:pio
hda: ST31720A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ST31720A, 1626MB w/0kB Cache, CHS=826/64/63
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12
raid5: measuring checksumming speed
8regs :   169.545 MB/sec
32regs:   149.352 MB/sec
using fastest function: 8regs (169.545 MB/sec)
scsi : 0 hosts.
scsi : detected total.
md.c: sizeof(mdp_super_t) = 4096
Partition check:
  hda: hda1 hda2 hda3
RAMDISK: Compressed image found at block 0
autodetecting RAID arrays
autorun ...
... autorun DONE.
VFS: Mounted root (ext2 filesystem).
(scsi0) Adaptec AHA-294X SCSI host adapter found at PCI 10/0
(scsi0) Narrow Channel, SCSI ID=7, 16/255 SCBs
(scsi0) Downloading sequencer code... 406 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.16/3.2.4
Adaptec AHA-294X SCSI host adapter
scsi : 1 host.
(scsi0:0:0:0) Synchronous at 10.0 Mbyte/sec, offset 15.
   Vendor: SEAGATE   Model: ST19171N  Rev: 0023
   Type:   Direct-Access  ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
(scsi0:0:1:0) Synchronous at 10.0 Mbyte/sec, offset 15.
   Vendor: SEAGATE   Model: ST19171N  Rev: 0024
   Type:   Direct-Access  ANSI SCSI revision: 02
Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
(scsi0:0:2:0) Synchronous at 10.0 Mbyte/sec, offset 15.
   Vendor: SEAGATE   Model: ST19171N  Rev: 0023
   Type:   Direct-Access  ANSI SCSI revision: 02
Detected scsi disk sdc at scsi0, channel 0, id 2, lun 0
(scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
   Vendor: SEAGATE   Model: ST39140N  Rev: 1498
   Type:   Direct-Access  ANSI SCSI revision: 02
Detected scsi disk sdd at scsi0, channel 0, id 3, lun 0
(scsi0:0:4:0) Synchronous at 

Raid1 - dangerous resync after power-failure?

2000-03-27 Thread Sam

I'm setting up a web server with RAID-1, using raidtools 0.90-5
and Linux kernel 2.2.12 (this is the Red Hat 6.1 distribution). I want to
mirror all my data across two disks (hda and hdc).

The problem I've noticed from testing is that if I shut off the power
and then reboot, the RAID software will start re-syncing the mirrors,
even though there was no write activity at all when the power went off,
and even though both halves of the mirror have the exact same event
counter.
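
(An easy way to catch this behaviour in the act: after an unclean reboot, 
the forced resync shows up as a progress field on the md line in /proc/mdstat.

# sketch: check for an unexpected resync right after boot
cat /proc/mdstat    # a "resync=..." field on the md0 line means a full resync is running
)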

The problem I see with this is as follows:

- Assume a power outage hits and wipes out some sectors on the
  hda disk, but leaves the superblock alone. I think this scenario
  is a fairly likely one.

- After the power outage, the system boots up and starts a resync,
  copying data from hda to hdc.

- The system tries to access the bad sectors on hda.

What would happen at this point? I assume the data would be lost,
since hdc is undergoing a re-sync and the sectors on hda are already
bad. Even though at boot time hdc contained good copies of these
sectors, the RAID software started re-syncing onto hdc and lost that
data. If, however, the RAID code had just left hdc alone, it could
have recovered those sectors.

I looked at the raidtools code, and it looks to me like what is
happening is this: there is an SB_CLEAN flag in the superblock that is
set to false when RAID is started on an md device. This SB_CLEAN flag
is only set to true if a clean shutdown is performed. So if a power
outage hits, this flag is always going to be false, since no clean
shutdown was performed. At boot time the md code then checks the
SB_CLEAN flag, and if it is false a resync is performed.

It seems to me that a resync should only be required if the system was
in the middle of a write where some data had been sent to one disk but
not yet to the other. I think the event counter already performs this
function, so I don't see why the SB_CLEAN flag is even needed.

What do you think? Could this SB_CLEAN flag be eliminated to reduce the
risk of a resync damaging good data?




raid-2.2.14-B1 reconstruction bug and problems

2000-03-27 Thread Malcolm Beattie

I've been using RAID 0.90 with the 2.0 kernel on a bunch of production
boxes (RAID5) and the disk failure handling and reconstruction has
worked fine, both in tests and (once) in real life when a disk failed.
I'm now trying 2.2.14 + raid-2.2.14-B1 (as shipped in the Red Hat 6.x
kernel) and have come across both a problem with testing disk failure
and also an apparent bug in RAID error handling:

-- cut here --
SCSI disk error : host 0 channel 0 id 8 lun 0 return code = 2802 
[valid=0] Info fld=0x0, Current sd08:61: sense key Not Ready 
Additional sense indicates Logical unit not ready, initializing command required 
scsidisk I/O error: dev 08:61, sector 265176 
md: bug in file raid5.c, line 659 
  
   **********************************
   * <COMPLETE RAID STATE PRINTOUT> *
   **********************************
-- cut here --

followed by a detailed dump of the RAID superblock information. After
that, any commands (including raidhotremove/raidhotadd) which try to
touch the RAID array hang in uninterruptible sleep and so do any
processes which were accessing the RAID filesystem at the time of the
failure. The above was triggered by my simulation of a disk failure
which I did by spinning the disk down with the SCSI_IOCTL_STOP_UNIT
ioctl.

That leads to the second problem: the reason I used that method of
simulating a disk failure was that the old method:
echo "scsi remove-single-device 0 0 3 0"  /proc/scsi/scsi
has stopped working with kernel 2.2. strace shows that the write()
returns with errno EBUSY. linux/drivers/scsi/scsi.c shows that this
is because the access_count of Scsi_Device structure is non-zero.
Looking at the equivalent 2.0 source doesn't seem to show any semantic
changes, and yet the same command under 2.0 works fine. Can anyone
help? Otherwise this server is going to have to run without the added
reliability of RAID5, which would be disappointing.
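
(For what it's worth, the same /proc/scsi/scsi interface has a companion
command for putting the device back once it has been removed; the numbers
are host, channel, id, lun:

echo "scsi add-single-device 0 0 3 0" > /proc/scsi/scsi
)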

As an act of desperation I even wrote a little kernel module to change
the access_count back to zero and then ran the
"...remove-single-device...". This time, the device did get removed
properly, RAID noticed the removal and went properly into degraded
mode. Unfortunately, once again, all processes accessing the RAID
filesystem and then any raidhotadd/raidhotremove/umount commands all
hung in uninterruptible state. Nothing in this mailing list or
anywhere else I can find with web searches seems to have had this
problem so I'm at a loss what to do. Any help would be gratefully
received. In case it matters, this is on an SMP system (2 CPUs) and
the disks are all SCSI disks on a bus with an Adaptec 7899 adapter,
using the aic7xxx driver 5.1.72/3.2.4. In case anyone wants the kernel
module to alter a SCSI device access_count, here it is:

-- cut here --
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/blk.h>
#include "/usr/src/linux/drivers/scsi/scsi.h"
#include "/usr/src/linux/drivers/scsi/hosts.h"

static int host = 0;
static int channel = 0;
static int id = 0;
static int lun = 0;
static int delta = 0;

MODULE_PARM(host, "i");
MODULE_PARM(channel, "i");
MODULE_PARM(id, "i");
MODULE_PARM(lun, "i");
MODULE_PARM(delta, "i");

int init_module(void)
{
    struct Scsi_Host *hba;
    Scsi_Device *scd;

    printk("scsiaccesscount starting\n");

    /* locate the host adapter with the requested host number */
    for (hba = scsi_hostlist; hba; hba = hba->next)
        if (hba->host_no == host)
            break;

    if (!hba)
        return -ENODEV;

    /* locate the device with the requested channel/id/lun on that host */
    for (scd = hba->host_queue; scd; scd = scd->next)
        if (scd->channel == channel && scd->id == id && scd->lun == lun)
            break;

    if (!scd)
        return -ENODEV;

    printk("access_count is %d\n", scd->access_count);
    if (delta) {
        scd->access_count += delta;
        printk("changed access_count to %d\n", scd->access_count);
    }

    /* deliberately fail the insmod: all the work is done above, so the
       module never stays resident */
    return -EIO;
}
-- cut here --

Use it as
insmod scsiaccesscount.o host=0 channel=0 id=3 lun=0
to show the access count for ID 3 on bus 0 channel 0 and 
insmod scsiaccesscount.o host=0 channel=0 id=3 lun=0 delta=-1
to subtract one from the access_count. Obviously this is just for
debugging and may not be safe to do at all (and indeed wasn't in my
case).

--Malcolm

-- 
Malcolm Beattie [EMAIL PROTECTED]
Unix Systems Programmer
Oxford University Computing Services



What happened?

2000-03-27 Thread doug egan

I have been configuring a RAID 5 system.  I have a 3 disk raid on a promise
MAX-II controller.  Each was on its own controller port.  I had a boot disk
on /dev/hda and a disk I was using for restoring on /dev/hdb.  The
partitions on the latter disk were only mounted when needed.

I shut down, removed /dev/hdb, and reconnected the CD-ROM as the slave on
the first IDE channel. Now when I reboot, I get a "corrupt superblock"
message suggesting that I try e2fsck -b 8193 to recover it. These messages
are for my /dev/mdx RAID devices.

When I go into maintenance mode, I can mount and e2fsck all my /dev/mdx
drives and they check clean.  I can mount the file systems and they all
work.  mdstat indicates all is well.

Anyone have any ideas what happened and how to fix it?
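
(One thing worth ruling out, offered only as a guess at the cause: if the
boot-time fsck runs against the /dev/mdx devices before the arrays are
assembled, it sees garbage and reports corrupt superblocks, yet everything
checks clean later once the arrays are up. A quick sanity check; the device
names below are hypothetical guesses for a Promise controller:

cat /proc/partitions                 # are the member disks where raidtab expects them?
fdisk -l /dev/hde /dev/hdg /dev/hdi  # are the member partitions type fd (raid autodetect)?
)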

Thanks,

Doug Egan




Re: RAID5 array not coming up after repaired disk

2000-03-27 Thread Marc Haber

On Sat, 25 Mar 2000 13:10:13 GMT, you wrote:
>On Fri, 24 Mar 2000 19:36:18 -0500, you wrote:
>>Ok, maybe I'm on crack and need to lay off the pipe a little while, but
>>it appears that sdf7 doesn't have a partition type of "fd" and as such
>>isn't getting considered for inclusion in md0.
>
>Nope, all partitions /dev/sd{a,b,c,d,e,f}7 have type fd.

After moving sdf7 to the top of /etc/raidtab, the array came up in
degraded mode and I was able to raidhotadd the new disk.

I feel that the RAID should have recovered from this failure without
requiring manual intervention. Or maybe I did something wrong?
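
(The manual recovery described above, spelled out as commands; /dev/sdf7
matches the thread:

raidstart /dev/md0              # array comes up in degraded mode
raidhotadd /dev/md0 /dev/sdf7   # re-add the repaired disk; reconstruction starts
cat /proc/mdstat                # watch the rebuild progress
)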

Greetings
Marc

-- 
-- !! No courtesy copies, please !! -
Marc Haber  |   " Questions are the | Mailadresse im Header
Karlsruhe, Germany  | Beginning of Wisdom " | Fon: *49 721 966 32 15
Nordisch by Nature  | Lt. Worf, TNG "Rightful Heir" | Fax: *49 721 966 31 29



Status of Raid-0.9

2000-03-27 Thread Nikolaus Froehlich

Hello,
Sorry for taking a minute of your valuable time, but since all other
attempts to get information failed, I hope that someone from this list
will be able to answer 3 quick questions for me:

1) What is the status of the RAID development? 
   From the archives on ftp.kernel.org and the mailing list archives, it
   appears that all traffic concerning the development stopped at the
   end of August 1999.  Is that true?  Why did the development get
   abandoned?  Are there major bugs in the code?

2) Can the distributed raidtools 0.90 and raid0145-19992408-2.2.11
   be considered stable for a RAID-1 application on a 2.2.x kernel?

3) Are there still efforts to include the 'new' RAID implementation in
   the new standard kernels (e.g. 2.5)?


Thank you again for taking the time to answer those questions!
Yours,
Nikolaus.




Re: superblock or the partition table is corrupt?

2000-03-27 Thread root


Hmm. Looks like I forgot this step. I have a raid 1 setup under
RedHat 6.1's stock 2.2.12 kernel with raidtools already installed.
It works fine, a disk failed last weekend and I was able to recover 
that disk while the array continued to function.


My question is this: is it too late to run mke2fs on /dev/md0
now that /dev/md0 contains data?
Also, I don't see fd as an option in fdisk.
Thanks.

On Sat, 25 Mar 2000, m. allan noah wrote:

 
 so, you need to run mke2fs on /dev/md0, rather than the individual partitions,
 then you should be fine.
 
 allan
 




Re: superblock or the partition table is corrupt?

2000-03-27 Thread David Cooley

At 12:25 PM 3/27/2000, root wrote:

Hmm. Looks like I forgot this step. I have a raid 1 setup under
RedHat 6.1's stock 2.2.12 kernel with raidtools already installed.
It works fine, a disk failed last weekend and I was able to recover
that disk while the array continued to function.


My question is this: is it too late to run mke2fs on /dev/md0
now that /dev/md0 contains data?
Also, I don't see fd as an option in fdisk.


If you needed to run mke2fs, you wouldn't be able to access the RAID as a 
file system right now...  That is the equivalent of a DOS format.
Option fd doesn't show in the list in fdisk; you just have to select the
type command, then enter fd and press return.
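
(The fdisk dialogue in question looks roughly like this; the partition
number is an example:

fdisk /dev/sda
#   t       change a partition's system id
#   5       the md member partition
#   fd      the hex code, accepted even if not shown in the on-screen list
#   w       write the table and exit
)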


===
David Cooley N5XMT Internet: [EMAIL PROTECTED]
Packet: N5XMT@KQ4LO.#INT.NC.USA.NA T.A.P.R. Member #7068
We are Borg... Prepare to be assimilated!
===




Re: superblock or the partition table is corrupt?

2000-03-27 Thread root

Thanks, 
so I should be able to do the following without loss of data, right?




1. umount /dev/md0
2. raidstop /dev/md0
3. change partition types on /dev/sda5 and /dev/sdb5 to fd (was Linux)
4. raidstart /dev/md0
5. mount -t ext2 /dev/md0 /mirrored_databases
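
(A minimal check worth slotting in between steps 4 and 5, before trusting
the array; the expected output is an assumption for a healthy two-disk
mirror:

cat /proc/mdstat    # expect "[2/2] [UU]" on the md0 line before mounting
)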


My /etc/raidtab:

# persistent RAID1 array with no spare disk.
raiddev /dev/md0
nr-raid-disks           2
nr-spare-disks          0
persistent-superblock   1
chunk-size              4
device                  /dev/sda5
raid-disk               0
device                  /dev/sdb5
raid-disk               1



One last note: I'm running this machine as a back-end database 
for a busy website; is the chunk-size too small?
Everything seems to be running OK. If it ain't broke?


/proc/mdstat:

Personalities : [raid1] 
read_ahead 1024 sectors
md0 : active raid1 sdb5[1] sda5[0] 136 blocks [2/2] [UU]
unused devices: <none>



On Mon, 27 Mar 2000, David Cooley wrote:

 If you needed to run mke2fs, you wouldn't be able to access the RAID as a 
 file system right now...  That is the equivalent of a DOS format.
 Option fd doesn't show in the list in fdisk; you just have to select the
 type command, then enter fd and press return.
 
 
 ===
 David Cooley N5XMT Internet: [EMAIL PROTECTED]
 Packet: N5XMT@KQ4LO.#INT.NC.USA.NA T.A.P.R. Member #7068
 We are Borg... Prepare to be assimilated!
 ===
 




Re: superblock or the partition table is corrupt?

2000-03-27 Thread David Cooley

At 01:57 PM 3/27/2000, root wrote:
Thanks,
so I should be able to do the following without loss of data, right?

1. umount /dev/md0
2. raidstop /dev/md0
3. change partition types on /dev/sda5 and /dev/sdb5 to fd (was Linux)
4. raidstart /dev/md0
5. mount -t ext2 /dev/md0 /mirrored_databases

I don't know if re-writing the partition type to fd will preserve the 
data or make the disk look empty...
Better answered by someone with a little more experience than myself.

===
David Cooley N5XMT Internet: [EMAIL PROTECTED]
Packet: N5XMT@KQ4LO.#INT.NC.USA.NA T.A.P.R. Member #7068
We are Borg... Prepare to be assimilated!
===




Re: System Hangs -- Which Is Most Stable Kernel?

2000-03-27 Thread Jeff Hill

Thanks to everyone for the assistance.

I did recompile the kernel with Translucent disabled (I don't know why
it is enabled by default?). Unfortunately, this has not affected the
problem.

As for the Adaptec, I had checked on a hardware discussion list and
understood that, while some Adaptecs were problematic, the unit I
purchased (the 2940U2W with matching factory cables) was working well
for several Linux users. 

However, it seems to me from my limited experience that the Adaptec may
be the problem, as it would fit the type of hanging that seems to occur
(no error messages, everything just freezes -- possibly waiting for the
Adaptec to send the data through).

I am still unable to find any log or any way of tracing the system hangs.
I may try debugging the SCSI side (haven't a clue how) for a few days
before trying to turn off RAID. Before buying another card (I have no
others), I'll hope some reconfiguration of the Adaptec will do the
trick. I hate to dump all that money down the drain.

Thanks again to everyone for the assistance.

Jeff Hill

"m. allan noah" wrote:
 
 jeff- i am using 2.2.14 with mingo patch, and it is great. i have a dozen or
 so boxes, 512meg, SMP pIII 450, ncr scsi, etc in this config. all are fine.
 
 it would be interesting to see if raid is the issue, or your adaptec (i am
 inclined to think the latter).
--snip--



Re: Status of Raid-0.9

2000-03-27 Thread Jakob Østergaard

On Mon, 27 Mar 2000, Nikolaus Froehlich wrote:

 Hello,
 Sorry for taking a minute of your valuable time, but since all other
 attempts to get information failed, I hope that someone from this list
 will be able to answer 3 quick questions for me:
 
 1) What is the status of the RAID development? 
    From the archives on ftp.kernel.org and the mailing list archives, it
    appears that all traffic concerning the development stopped at the
    end of August 1999.  Is that true?  Why did the development get
    abandoned?  Are there major bugs in the code?

RAID 0.90 development has continued, and the most current patch is available
for 2.2.14 at http://people.redhat.com/mingo
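
(For anyone following along, applying such a patch to a matching source
tree looks roughly like this; the filename and -p level are assumptions
to check against the actual download:

cd /usr/src/linux
patch -p1 < /tmp/raid-2.2.14-B1    # adjust the -p level to match the patch
make dep bzImage modules modules_install
)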

 2) Can the distributed raidtools 0.90 and raid0145-19992408-2.2.11
    be considered stable for a RAID-1 application on a 2.2.x kernel?

Go with 2.2.14, or even better, 2.2.15pre15 (you'll have to fix a reject
in raid1.c if you need RAID-1, but it's an easy one).

 3) Are there still efforts to include the 'new' RAID implementation in
    the new standard kernels (e.g. 2.5)?

It's currently being merged into 2.3.X and will be in 2.4 when it comes out.

-- 

: [EMAIL PROTECTED]  : And I see the elder races, :
:.: putrid forms of man:
:   Jakob Østergaard  : See him rise and claim the earth,  :
:OZ9ABN   : his downfall is at hand.   :
:.:{Konkhra}...:



Re: System Hangs -- Which Is Most Stable Kernel?

2000-03-27 Thread David Cooley

Is the PC overclocked in any way?
I had troubles with my 2940U2W in both Windows and Linux when I overclocked 
the Front Side Bus from 100MHz to 103MHz.
Seems the Adaptec cards can't handle *ANYTHING* over 33.3 MHz on the PCI bus.



At 02:53 PM 3/27/2000, Jeff Hill wrote:
Thanks to everyone for the assistance.
--snip--

===
David Cooley N5XMT Internet: [EMAIL PROTECTED]
Packet: N5XMT@KQ4LO.#INT.NC.USA.NA T.A.P.R. Member #7068
We are Borg... Prepare to be assimilated!
===




Re: System Hangs -- Which Is Most Stable Kernel?

2000-03-27 Thread Stephen Waters

Have you tried the folks on the [EMAIL PROTECTED] list? This is the list
recommended in linux/drivers/scsi/README.aic7xxx ...
-s

Jeff Hill wrote:
 
 Thanks to everyone for the assistance.
--snip--



Re: Swapping onto RAID: Good idea?

2000-03-27 Thread Martin Bene

At 23:51 27.03.00, Godfrey Livingstone wrote:
I have raid 1 working and am swapping onto /dev/md1. I am interested to know
what modifications you made to the startup scripts. We run Red Hat 6.1 with a
patched (for raid and a Promise IDE controller) 2.2.14 kernel. Anyway, what
scripts do I need to change, and how? Presumably by altering rc.sysinit and
checking that the resynchronisation of the swap partition is complete (how?)
before issuing the command "swapon -a"?

Let's see:

1) rc.sysinit:
* remove swapon -a
* add

# Start up swapping.
/sbin/raidswapon &

after the mount of the /proc filesystem.

* possibly comment out the "add raid devices" section; if you use raid 
autostart, chances are that the code in this section won't help you, but it 
WILL keep your system from coming up if there's anything bad in your 
raidtab (this happened to me when I did some testing and forgot to remove 
the test entries from raidtab).

my /sbin/raidswapon (adapted from a post to this list - sorry, I don't 
remember the original author):

#!/bin/sh
#
# Turn swap on for md devices, but only after any resync has finished.

# md devices listed as swap in /etc/fstab, e.g. "md1"
RAIDDEVS=`grep swap /etc/fstab | grep /dev/md | cut -f1 | cut -d/ -f3`

for raiddev in $RAIDDEVS
do
#   echo "testing $raiddev"
    while grep $raiddev /proc/mdstat | grep -q "resync="
    do
#       echo "`date`: $raiddev resyncing" >> /var/log/raidswap-status
        sleep 20
    done
    /sbin/swapon /dev/$raiddev
done

exit 0

This won't turn on swap on raid devices until the resync has finished.
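
(For reference, the kind of /etc/fstab line the script keys off; the md
device name is an example. Note that cut -f1 splits on tabs, so the fields
here must be tab-separated for the script to find the device:

/dev/md1	swap	swap	defaults	0 0
)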

Bye, Martin

"you have moved your mouse, please reboot to make this change take effect"
--
  Martin Bene   vox: +43-316-813824
  simon media   fax: +43-316-813824-6
  Andreas-Hofer-Platz 9 e-mail: [EMAIL PROTECTED]
  8010 Graz, Austria
--
finger [EMAIL PROTECTED] for PGP public key