Hey folks,

I've got a pretty serious problem with my Linux software RAID-5 array.
This is what I think happened: Two days ago, the power went out on my file
server.  When the power came back up, the RAID driver automatically
started re-synchronizing the array. During the re-sync, the power went out
again.  Now, according to the kernel, the event counters on all three
disks in the array are out of sync (17, 15, and 14), and the kernel's
internal consistency checks won't let me mount the array (even after I
removed most of them; I'm quite impressed with all the error checking in
drivers/block/md.c).

Here's the actual configuration:

        Linux kernel 2.2.3 on an AlphaStation 200
        raid0145-19990309-2.2.3 applied
        raidtools-19990309-0.90 installed

        /dev/sdb1 is disk 0 (event counter of 17)
        /dev/sdc1 is disk 1 (event counter of 15)
        /dev/sdd1 is disk 2 (event counter of 14)

        These three disks form /dev/md0, which is mounted on /var
        (/ and /usr are on a separate, smaller disk).

And no, I don't have a working UPS (I found out the hard way that the
battery is bad), and the backups I had were overwritten long ago.  I *really*
need to convince the kernel to mount the array, even though it is
inconsistent.  The things that might be corrupt (/var/spool/mail,
/var/run, /var/log --- the only things I can think of that might have been
open when the computer crashed both times) are pretty minor, whereas
there are two or three gigabytes of static data that I'd really like
to recover (namely /var/users, /var/etc, /var/yp, /var/sbin, /var/shlib,
and /var/local).

My question: How can I convince the kernel to mount the array anyway? I've
tried to remove the various consistency checks the kernel does, but the MD
driver is just built too well.  :)  I'd like to modify the RAID
superblocks so they all have the same event counter, then mount the array
read-only and try to get my files off (ignoring errors).  I'm currently hacking up
raidstart.c (from raidtools) so I can at least see the RAID superblock
(i.e. call analyze_sb()), and maybe modification is as simple as changing
that structure (or maybe I'll have to hack up a special version of
upgrade_sb() to do that...I don't know).
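
To make the plan concrete, here is a rough sketch of the bookkeeping I have
in mind.  It is only an in-memory model: it does not read or write real
superblocks, plan_event_sync() is a made-up name, and raising every disk to
the highest counter is my assumption about what would satisfy the driver,
not something I have confirmed.

```python
# Hypothetical model only: nothing here touches a real RAID superblock.
# Device names and event counters are the ones reported in this message.
event_counters = {
    "/dev/sdb1": 17,  # disk 0
    "/dev/sdc1": 15,  # disk 1
    "/dev/sdd1": 14,  # disk 2
}


def plan_event_sync(counters):
    """Return (target, updates).

    target  -- the highest event counter found on any disk
    updates -- {device: (old, new)} for every disk whose counter
               would have to be rewritten to match the target
    """
    target = max(counters.values())
    updates = {
        dev: (old, target)
        for dev, old in counters.items()
        if old != target
    }
    return target, updates


target, updates = plan_event_sync(event_counters)
print("target event counter:", target)
for dev, (old, new) in sorted(updates.items()):
    print(f"{dev}: {old} -> {new}")
```

On my numbers that would leave /dev/sdb1 alone and rewrite the counters on
/dev/sdc1 and /dev/sdd1; whether the data behind those superblocks is still
usable is a separate question.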

Anyway, any suggestions or help would be greatly appreciated.  I don't
really know what I'm doing, but that's probably painfully obvious.

Thanks in advance,
#\Matthew
