Help, there's something weird going on on our fileserver! I'm on vacation and had a colleague do this over the phone. Please CC me in replies because I don't have access to my regular mail.
raid0 is level 1: sd0a, sd1a
raid1 is level 5: sd2a .. sd9a
sd0/1 are on scsibus0, targets 0/1
sd2..9 are on scsibus1, targets 0..8

The machine panicked. After reboot, the parity rewrite succeeded on raid0 and failed on raid1 because of a read error on sd2a. He did scsictl stop sd2, scsictl detach scsibus1 0 0, replaced sd2, scsictl scan scsibus1 0 0. Something strange must have happened, and sd2 was async. He nevertheless started the reconstruction (raidctl -R sd2a raid1), but raidctl -S estimated 24 hours. I asked him to stop the reconstruction, but neither failing sd2 nor detaching scsibus1 0 0 stopped it. Shortly after, the machine panicked again.

It came up with raid0 and raid1 configured correctly, but fsck raid1a failed. We now have no disklabel on raid1 (disklabel -r says something about not being able to read it, and disklabel without -r shows the fabricated one). Since fsck raid1a said something like "incorrect fs size", I assume the superblock of raid1a is still there and only the disklabel is broken.

Any hints? He is currently running the reconstruction, and we'll see whether the disklabel returns or what happens if we re-write it from the backup we have in /var.

Is there a sane way to stop an ongoing reconstruction? Could trying to stop it have corrupted the raid1 contents?
