Back in March, Mike Bilow and others discussed the problem of a swapon being
executed while a RAID set holding a swap device is resynchronizing, which
causes kernel panics and such. Several suggestions went by, the best of
which seemed to be to wrap a check in some of the startup scripts to see if
a resync is happening and, if so, skip the swapon. Unfortunately, this
disables swap even if you *don't* have swap on a RAID device, whenever some
other device resyncs on boot.
This weekend my system crashed, and when it came back up it needed to resync.
This caused swap not to be turned on, even though it *should* have been,
because I do not have swap on a RAID device... and not having swap is not a
good thing. I found this discussion and the resulting bug report submitted to
Debian to change the startup scripts, and felt the fix needed to be modified.
Please let me know if I am going about this the wrong way.
Here is the bug report that Mike sent to the Debian bug tracking system:
>Package: sysvinit
>Version: 2.78-2
>Severity: Critical
>If software RAID is in use and any swap area is on a RAID set, a kernel
>panic is certain to occur and filesystem corruption is very likely to
>occur if swapping is started while the RAID set is being
>resynchronized. Since the new software RAID system allows the kernel to
>start such a resynchronization at boot time ahead of and independent of
>startup script processing, it is critical that some protection be
>provided which at least prevents disaster.
>I recommend wrapping the invocations of "swapon" in mountall.sh and
>checkroot.sh with a simple check for the presence of the keyword
>"resync" in /proc/mdstat. The only complication is that this must be
>done in some way that is robust if there is no kernel support for RAID,
>in which case /proc/mdstat will not exist. I suggest something like
>this:
>  if [ ! -e /proc/mdstat ] || \
>     [ `grep -ci resync /proc/mdstat` -eq 0 ] ; then
>      swapon -a 2> /dev/null
>  fi
>This approach has the disadvantage that swapping will not be started at
>all if a RAID set is being resynchronized, but this is far better than
>crashing and corruption. Several alternative approaches have been
>proposed on the linux-raid mailing list which work by having the startup
>script start a separate process which loops on a check for the "resync"
>keyword and finally turns on swapping when the "resync" keyword
>disappears. However, at least some minimal protection should be
>included in the base system; any more elaborate solution is
>appropriately an issue for the raidtools2 package rather than the
>sysvinit package.
>Software RAID will never spontaneously start resynchronizing. Once RAID
>detects a drive failure and disables that volume, it will not re-enable
>it unless a reboot occurs or the system administrator takes some sort of
>action manually.
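For reference, the "loop until the resync keyword disappears" alternative that the report mentions could look roughly like the sketch below. This is only my own illustration, not anything posted on the list: the function name swapon_after_resync and its three parameters (mdstat path, polling interval, command to run) are hypothetical and exist mainly so the logic can be tried against a test file instead of the real /proc/mdstat.

```shell
# Sketch only: poll an mdstat file and start swapping once no resync is
# reported. All names and parameters here are illustrative assumptions.
#   $1: mdstat path              (default /proc/mdstat)
#   $2: seconds between polls    (default 30)
#   $3: command to run when done (default "swapon -a")
swapon_after_resync() {
    mdstat=${1:-/proc/mdstat}
    interval=${2:-30}
    cmd=${3:-"swapon -a"}

    # Keep waiting while the file exists and still mentions a resync.
    while [ -e "$mdstat" ] && grep -qi resync "$mdstat"; do
        sleep "$interval"
    done

    # Left unquoted on purpose so "swapon -a" splits into command + argument.
    $cmd
}
```

A startup script would presumably run it in the background, e.g. "swapon_after_resync &", so that the rest of the boot does not block on the resync.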
I am suggesting that this be changed so that if you do *not* have a swap
device on RAID (which, if you ask me, is silly; it seems better to set up
several swap partitions and give them equal priority), the swapon still gets run:
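  # md devices that /etc/fstab lists as swap, e.g. "md0" (empty if none)
  swapmd=`awk '$3 == "swap" && $1 ~ /^\/dev\/md/ {
               sub("^/dev/", "", $1); print $1 }' /etc/fstab`
  if [ ! -e /proc/mdstat ] || [ -z "$swapmd" ] || \
     ! grep "$swapmd" /proc/mdstat | grep -q resync ; then
      swapon -a 2> /dev/null
  fi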
This looks for a swap entry in /etc/fstab that is on a /dev/md# device and
then checks /proc/mdstat for that md# resyncing. If there is no swap on an md
device, or if that swap device is not resyncing, then swap is turned on.
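The check described above can also be written as a pair of small functions, which makes it easier to try against sample files before touching the startup scripts. This is a sketch under a few assumptions of mine: the function names and the fstab/mdstat path parameters are hypothetical, and because some kernels print the resync progress on a separate line below the md# line, the sketch remembers the most recent md# heading instead of requiring both on one line.

```shell
# Print the md devices (e.g. "md0") that an fstab-format file lists as swap.
#   $1: path to the fstab file (parameter is an assumption, for testing)
swap_md_devices() {
    awk '$3 == "swap" && $1 ~ /^\/dev\/md[0-9]+$/ {
             sub("^/dev/", "", $1); print $1
         }' "$1"
}

# Succeed (exit status 0) only if an md device used for swap is resyncing.
#   $1: fstab path, $2: mdstat path
swap_md_resyncing() {
    [ -e "$2" ] || return 1                  # no RAID support in the kernel
    devs=`swap_md_devices "$1" | tr '\n' ' '`
    [ -n "$devs" ] || return 1               # no swap on any md device

    awk -v devs="$devs" '
        BEGIN { n = split(devs, d, " "); for (i = 1; i <= n; i++) want[d[i]] = 1 }
        /^md[0-9]+ *:/ { cur = $1; sub(":", "", cur) }   # new md# section
        /resync/ && (cur in want) { found = 1 }          # resync on a swap md
        END { exit(found ? 0 : 1) }
    ' "$2"
}
```

In the startup script this would be used along the lines of: swap_md_resyncing /etc/fstab /proc/mdstat || swapon -a 2> /dev/null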
Comments, suggestions? I am going to submit this to Debian as a bug, but I
thought I'd run it by this mailing list first.
Micah