On Fri, Sep 01, 2006 at 12:20:04PM -0500, Robert Jones wrote:
> On Fri, Sep 01, 2006 at 03:19:02PM +0200, Joachim Schipper wrote:
> > On Fri, Sep 01, 2006 at 02:32:08AM -0500, Robert Jones wrote:
> > > All -
> > >
> > > I am having issues with filesystem corruption on a RAID1 array using
> > > RAIDframe on OpenBSD 3.7. If I copy a large (500M+) file onto a
> > > filesystem on the array, the copied file will end up corrupted, other
> > > files on the filesystem may end up corrupted and fsck will show various
> > > errors in the filesystem metadata. The drives themselves check out fine
> > > and are showing no damaged sectors or other problems. Can anyone help me
> > > identify and fix the problem?
> > >
> > > Contents of /etc/raid0.conf, fdisk info and dmesg are below. I'll
> > > provide any additional information as necessary.
> >
> > Disklabel would be useful. Oh, and for the record, upgrade. Not that
> > that would solve the problem at hand...
>
> Disklabels:
> $ sudo disklabel wd1
> # using MBR partition 3: type A6 off 63 (0x3f) size 781417602 (0x2e937c82)
> # size offset fstype [fsize bsize cpg]
> a: 781417602 63 RAID # Cyl 0*-775215*
> c: 781422768 0 unused 0 0 # Cyl 0 -775220
> $ sudo disklabel wd2
> # using MBR partition 3: type A6 off 63 (0x3f) size 781417602 (0x2e937c82)
> # size offset fstype [fsize bsize cpg]
> a: 781417602 63 RAID # Cyl 0*-775215*
> c: 781422768 0 unused 0 0 # Cyl 0 -775220

Well, those are obviously correct. ;-)

> $ sudo disklabel raid0
> # /dev/rraid0c:
> type: RAID
> disk: raid
> label: fictitious
> flags:
> bytes/sector: 512
> sectors/track: 128
> tracks/cylinder: 8
> sectors/cylinder: 1024
> cylinders: 763103
> total sectors: 781417472
> rpm: 3600
> interleave: 1
> trackskew: 0
> cylinderskew: 0
> headswitch: 0 # microseconds
> track-to-track seek: 0 # microseconds
> drivedata: 0
>
> 16 partitions:
> # size offset fstype [fsize bsize cpg]
> a: 629145537 63 4.2BSD 2048 16384 323 # Cyl 0*-614399
> b: 20971520 629145600 4.2BSD 2048 16384 323 # Cyl 614400 -634879
> c: 781417472 0 unused 0 0 # Cyl 0 -763102
> d: 20971520 650117120 4.2BSD 2048 16384 323 # Cyl 634880 -655359
> e: 83886080 671088640 4.2BSD 2048 16384 323 # Cyl 655360 -737279
> f: 20971520 754974720 4.2BSD 2048 16384 323 # Cyl 737280 -757759

Looks fine to me.

> > > $ dmesg
> > > OpenBSD 3.7 (ORBITAL.SP) #1: Wed Jul 27 01:29:04 PDT 2005
> > > [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/ORBITAL.SP
> >
> > > Kernelized RAIDframe activated
> > > raid0 (root): (RAID Level 1) total number of sectors is 781417472 (381551 MB)
> > > dkcsum: wd0 matched BIOS disk 80
> > > dkcsum: wd1 had no matching BIOS disk
> > > dkcsum: wd2 had no matching BIOS disk
> > > root on wd0a
> > > rootdev=0x0 rrootdev=0x300 rawdev=0x302
> > > raid0: Device already configured!
> >
> > Looks a little fishy here, what are you trying to do? In any case,
> > /etc/raidX.conf is only required with kernels that do not automount, or
> > with RAID sets that are not configured to be automounted. I usually
> > rename them to /etc/raidX.autoconf or similar to keep the configuration
> > in a logical place while not interfering with normal boot.
>
> As far as I can tell, that's because I set the array to autoconfigure and
> left raid0.conf lying around. I've moved that file to a safer name but I
> don't think it's the cause of the problem.

It shouldn't be, I agree.

> > Anyway, the most likely issue is some size mismatch - say, a swap space
> > configured in the same place on the disk as your RAID array. I'd check
> > all disklabels, including that on raid0, with extreme care.

Hmm, the disklabels are fine. I assume you didn't do anything strange,
like configuring swap on raid0b, which is labelled 4.2BSD above. If that
assumption holds, the symptoms you describe have no obvious explanation.
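
For the record, the overlap check I did by eye can be sketched in a few
lines of Python. This is only an illustration: the sizes and offsets are
copied from the raid0 disklabel quoted above, the function name
find_overlaps is made up, and the whole-disk 'c' partition is left out on
purpose because it covers everything by design.

```python
def find_overlaps(partitions, total_sectors):
    """Report partitions that run past the device or into their neighbour.

    partitions maps a letter to (size, offset), both in sectors.
    """
    problems = []
    # Sort by starting offset so that any overlap shows up between neighbours.
    items = sorted(partitions.items(), key=lambda kv: kv[1][1])
    for name, (size, offset) in items:
        if offset + size > total_sectors:
            problems.append(f"{name} extends past end of device")
    for (n1, (s1, o1)), (n2, (_s2, o2)) in zip(items, items[1:]):
        if o1 + s1 > o2:
            problems.append(f"{n1} overlaps {n2}")
    return problems


# Values copied from the 'disklabel raid0' output quoted above.
RAID0_TOTAL = 781417472
RAID0_PARTS = {
    "a": (629145537, 63),
    "b": (20971520, 629145600),
    "d": (20971520, 650117120),
    "e": (83886080, 671088640),
    "f": (20971520, 754974720),
}

print(find_overlaps(RAID0_PARTS, RAID0_TOTAL))
```

With the numbers from your label this prints an empty list: each
partition ends exactly where the next one starts, and f ends before the
last sector of raid0.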

In that case, I am afraid I'm out of good advice. Re-checking the disks
would make sense, IMHO: copy a couple of large data files around on each
disk without RAID in the way and see whether they arrive intact. However,
that's about all I can say.
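
If it helps, here is a rough sketch of such a copy test in Python,
assuming you have Python installed on that box. The helper names
sha256_of and copy_and_verify are mine, and the paths are whatever you
pass on the command line; comparing checksums by hand with any other
tool works just as well.

```python
import hashlib
import shutil
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then report whether both files hash identically."""
    shutil.copyfile(src, dst)
    return sha256_of(src) == sha256_of(dst)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python copytest.py SOURCE DEST
    ok = copy_and_verify(sys.argv[1], sys.argv[2])
    print("match" if ok else "MISMATCH")
```

A read straight after the copy may be served from the buffer cache
rather than from the disk, so it is worth unmounting and remounting the
filesystem and running sha256_of on the destination again before
trusting a "match".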

Also, if you only noticed the corruption after a reboot, I presume you
get the same result without rebooting, from a plain mount, cp, umount,
fsck sequence?

Joachim