Back in March, Mike Bilow and others discussed the problem of RAID
resynchronization happening on swap devices when a swapon is executed,
causing kernel panics and such. Different suggestions went by, the best of
which seemed to be to wrap a check in some of the startup scripts to see if
a resync is happening, and if so, not to do a swapon. Unfortunately, this
disables swap even if you *don't* have swap on a RAID device and some other
device resyncs on boot.

This weekend my system crashed, came back up, and needed to resync. This
caused swap not to be turned on, although it *should* have been, because I do
not have swap on a RAID device... not having swap is not a good thing. I found
this discussion and the resulting bug fix submitted to Debian to change the
startup scripts, and felt it needed to be modified. Please let me know if I
am doing this the wrong way.

Here is the bug submission that Mike sent to the Debian bug tracker:

>Package: sysvinit
>Version: 2.78-2
>Severity: Critical

>If software RAID is in use and any swap area is on a RAID set, a kernel
>panic is certain to occur and filesystem corruption is very likely to
>occur if swapping is started while the RAID set is being
>resynchronized.  Since the new software RAID system allows the kernel to
>start such a resynchronization at boot time ahead of and independent of
>startup script processing, it is critical that some protection be
>provided which at least prevents disaster.

>I recommend wrapping the invocations of "swapon" in mountall.sh and
>checkroot.sh with a simple check for the presence of the keyword
>"resync" in /proc/mdstat.  The only complication is that this must be
>done in some way that is robust if there is no kernel support for RAID,
>in which case /proc/mdstat will not exist.  I suggest something like
>this:

>     if [ ! -e /proc/mdstat ] || \
>               [ `grep -ci resync /proc/mdstat` -eq 0 ] ; then
>          swapon -a 2> /dev/null
>     fi

>This approach has the disadvantage that swapping will not be started at
>all if a RAID set is being resynchronized, but this is far better than
>crashing and corruption.  Several alternative approaches have been
>proposed on the linux-raid mailing list which work by having the startup
>script start a separate process which loops on a check for the "resync"
>keyword and finally turns on swapping when the "resync" keyword
>disappears.  However, at least some minimal protection should be
>included in the base system; any more elaborate solution is
>appropriately an issue for the raidtools2 package rather than the
>sysvinit package.

>Software RAID will never spontaneously start resynchronizing.  Once RAID
>detects a drive failure and disables that volume, it will not re-enable
>it unless a reboot occurs or the system administrator takes some sort of
>action manually.
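
The looping alternative mentioned in the report could be sketched roughly as
follows. This is only an illustration under assumptions: the function name
wait_for_resync, its two arguments, and the 10 second default interval are
made up here and are not taken from any of the linux-raid proposals.

```shell
# Hypothetical helper: block until the given mdstat file no longer
# mentions "resync" (or does not exist at all, i.e. no RAID support).
#   $1 = path to the mdstat file, normally /proc/mdstat
#   $2 = seconds to sleep between checks (assumed default: 10)
wait_for_resync() {
    mdstat=$1 interval=${2:-10}
    while [ -e "$mdstat" ] && grep -qi resync "$mdstat"; do
        sleep "$interval"
    done
}
```

A startup script could then run something like
"( wait_for_resync /proc/mdstat && swapon -a ) &" so that booting continues
while swap waits for the resync to finish.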


I am suggesting that Mike's check be changed so that the swapon still gets
run if you do *not* have a swap device on RAID (swap on RAID seems silly to
me anyway; it seems better to set up several swap partitions and give them
equal priority):

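# md devices used as swap in /etc/fstab, e.g. "md0" (empty if there are none)
swapmd=`awk '$3 == "swap" && $1 ~ /^\/dev\/md/ { sub(/^\/dev\//, "", $1); print $1 }' /etc/fstab`
if [ ! -e /proc/mdstat ] || [ -z "$swapmd" ] || \
               ! grep -w -e "$swapmd" /proc/mdstat | grep -q resync ; then
          swapon -a 2> /dev/null
fi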

This checks /etc/fstab for a swap entry on a /dev/md# device and then checks
/proc/mdstat to see whether md# is resyncing (it assumes the resync status
shows up on the md# line itself). If there is no swap on an md device, or if
that swap device is not resyncing, then swap is turned on.
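
In case the resync status is printed on a separate line below the md# line,
the same check can be written per stanza. The following is only a sketch:
the function names swap_md_devices and swap_md_resyncing and their path
arguments are hypothetical, and it assumes each /proc/mdstat stanza starts
with a "md# :" header line.

```shell
# Hypothetical helper: list md devices used as swap in the fstab file
# given as $1, one per line, e.g. "md0".
swap_md_devices() {
    awk '$3 == "swap" && $1 ~ /^\/dev\/md/ { sub(/^\/dev\//, "", $1); print $1 }' "$1"
}

# Hypothetical helper: succeed (exit 0) if any md swap device from the
# fstab file ($1) has "resync" somewhere in its stanza of the mdstat file ($2).
swap_md_resyncing() {
    fstab=$1 mdstat=$2
    # No mdstat means no RAID support, so nothing can be resyncing.
    [ -e "$mdstat" ] || return 1
    for md in `swap_md_devices "$fstab"`; do
        awk -v dev="$md" '
            $2 == ":" { current = $1 }                # header such as "md0 : active ..."
            current == dev && /resync/ { found = 1 }  # header line or a line below it
            END { exit !found }
        ' "$mdstat" && return 0
    done
    return 1
}
```

The startup script would then use it as
"swap_md_resyncing /etc/fstab /proc/mdstat || swapon -a 2> /dev/null".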

Comments, suggestions? I am going to submit this to Debian as a bug, but I
thought I'd run it by this mailing list first.

Micah