Re: Filesystem corruption w/ RAID array

Joachim Schipper Fri, 01 Sep 2006 14:12:15 -0700

On Fri, Sep 01, 2006 at 12:20:04PM -0500, Robert Jones wrote:
> On Fri, Sep 01, 2006 at 03:19:02PM +0200, Joachim Schipper wrote:
> > On Fri, Sep 01, 2006 at 02:32:08AM -0500, Robert Jones wrote:
> > > All -
> > > 
> > >     I am having issues with filesystem corruption on a RAID1 array using 
> > > RAIDframe on OpenBSD 3.7.  If I copy a large (500M+) file onto a 
> > > filesystem on the array, the copied file will end up corrupted, other 
> > > files on the filesystem may end up corrupted and fsck will show various 
> > > errors in the filesystem metadata.  The drives themselves check out fine 
> > > and are showing no damaged sectors or other problems.  Can anyone help me 
> > > identify and fix the problem?
> > > 
> > >     Contents of /etc/raid0.conf, fdisk info and dmesg are below.  I'll 
> > > provide any additional information as necessary.
> > 
> > Disklabel would be useful. Oh, and for the record, upgrade. Not that
> > that would solve the problem at hand...
> 
> Disklabels:
> $ sudo disklabel wd1
> # using MBR partition 3: type A6 off 63 (0x3f) size 781417602 (0x2e937c82)
> #             size        offset  fstype [fsize bsize  cpg]
>   a:     781417602            63    RAID                   # Cyl     0*-775215*
>   c:     781422768             0  unused      0     0      # Cyl     0 -775220
> $ sudo disklabel wd2
> # using MBR partition 3: type A6 off 63 (0x3f) size 781417602 (0x2e937c82)
> #             size        offset  fstype [fsize bsize  cpg]
>   a:     781417602            63    RAID                   # Cyl     0*-775215*
>   c:     781422768             0  unused      0     0      # Cyl     0 -775220

Well, those are obviously correct. ;-)
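
For anyone who wants to repeat the size check, here is a minimal Python
sketch. The sector counts are copied from the labels quoted in this
thread. The 64 reserved sectors and the 128-sector stripe unit are
assumptions (raid0.conf is not quoted here), so the second check is only
a plausibility test, not a statement of what RAIDframe actually did.

```python
# Sector counts copied from the wd1/wd2 and raid0 disklabels in this thread.
DISK_SECTORS = 781422768       # 'c' partition of wd1 and wd2
COMPONENT_OFFSET = 63          # offset of the 'a' (RAID) partition
COMPONENT_SECTORS = 781417602  # size of the 'a' (RAID) partition
RAID_SECTORS = 781417472       # "total sectors" in the raid0 label

# Assumptions, NOT taken from this thread:
RESERVED_SECTORS = 64          # assumed space kept for the component label
STRIPE_UNIT_SECTORS = 128      # assumed sectPerSU from raid0.conf


def component_fits(disk, offset, size):
    """True if the component partition ends inside the disk."""
    return offset + size <= disk


def expected_raid_sectors(component, reserved, stripe_unit):
    """Usable size after the reserved area, rounded down to whole stripe units."""
    usable = component - reserved
    return (usable // stripe_unit) * stripe_unit


if __name__ == "__main__":
    print("component fits on disk:",
          component_fits(DISK_SECTORS, COMPONENT_OFFSET, COMPONENT_SECTORS))
    print("raid0 size matches expectation:",
          expected_raid_sectors(COMPONENT_SECTORS, RESERVED_SECTORS,
                                STRIPE_UNIT_SECTORS) == RAID_SECTORS)
```

Under those two assumptions the computed size equals the 781417472
sectors that raid0 reports, which is consistent with the component
labels being sane.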

> $ sudo disklabel raid0
> # /dev/rraid0c:
> type: RAID
> disk: raid
> label: fictitious
> flags:
> bytes/sector: 512
> sectors/track: 128
> tracks/cylinder: 8
> sectors/cylinder: 1024
> cylinders: 763103
> total sectors: 781417472
> rpm: 3600
> interleave: 1
> trackskew: 0
> cylinderskew: 0
> headswitch: 0           # microseconds
> track-to-track seek: 0  # microseconds
> drivedata: 0
> 
> 16 partitions:
> #             size        offset  fstype [fsize bsize  cpg]
>   a:     629145537            63  4.2BSD   2048 16384  323 # Cyl     0*-614399
>   b:      20971520     629145600  4.2BSD   2048 16384  323 # Cyl 614400 -634879
>   c:     781417472             0  unused      0     0      # Cyl     0 -763102
>   d:      20971520     650117120  4.2BSD   2048 16384  323 # Cyl 634880 -655359
>   e:      83886080     671088640  4.2BSD   2048 16384  323 # Cyl 655360 -737279
>   f:      20971520     754974720  4.2BSD   2048 16384  323 # Cyl 737280 -757759

Looks fine to me.
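
The check behind that is simple enough to script. A rough Python sketch,
using the sizes and offsets from the raid0 label above (the function
name and the dictionary layout are made up for illustration):

```python
# (size, offset) in sectors, copied from the raid0 disklabel in this thread.
RAID0_TOTAL = 781417472
PARTITIONS = {
    "a": (629145537, 63),
    "b": (20971520, 629145600),
    "d": (20971520, 650117120),
    "e": (83886080, 671088640),
    "f": (20971520, 754974720),
}


def find_problems(partitions, total):
    """Return messages for partitions that overrun the device or overlap a neighbour."""
    problems = []
    # Sort by start sector so that any overlap shows up between neighbours.
    spans = sorted((offset, offset + size, name)
                   for name, (size, offset) in partitions.items())
    for start, end, name in spans:
        if end > total:
            problems.append(f"{name} ends at {end}, past the device end {total}")
    for (s1, e1, n1), (s2, e2, n2) in zip(spans, spans[1:]):
        if s2 < e1:
            problems.append(f"{n1} and {n2} overlap")
    return problems


if __name__ == "__main__":
    issues = find_problems(PARTITIONS, RAID0_TOTAL)
    print(issues if issues else "no overlaps, everything inside the device")
```

For the label above this returns an empty list: the partitions sit back
to back and f ends at sector 775946240, inside the 781417472-sector
array.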

> > > $ dmesg
> > > OpenBSD 3.7 (ORBITAL.SP) #1: Wed Jul 27 01:29:04 PDT 2005
> > >     [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/ORBITAL.SP
> > 
> > > Kernelized RAIDframe activated
> > > raid0 (root): (RAID Level 1) total number of sectors is 781417472 (381551 MB)
> > > dkcsum: wd0 matched BIOS disk 80
> > > dkcsum: wd1 had no matching BIOS disk
> > > dkcsum: wd2 had no matching BIOS disk
> > > root on wd0a
> > > rootdev=0x0 rrootdev=0x300 rawdev=0x302
> > > raid0: Device already configured!
> > 
> > Looks a little fishy here, what are you trying to do? In any case,
> > /etc/raidX.conf is only required with kernels that do not autoconfigure
> > RAID sets, or with RAID sets that are not set to autoconfigure. I usually
> > rename them to /etc/raidX.autoconf or similar to keep the configuration
> > in a logical place while not interfering with normal boot.
> 
> As far as I can tell, that's because I set the array to autoconfigure and 
> left raid0.conf lying around.  I've moved that file to a safer name but I 
> don't think it's the cause of the problem.

It shouldn't be, I agree.

> > Anyway, the most likely issue is some size mismatch - say, a swap space
> > configured in the same place on the disk as your RAID array. I'd check
> > all disklabels, including that on raid0, with extreme care.

Hmm, the disklabels are fine. I suppose you didn't do anything strange,
like configuring swap on raid0b - and the symptoms you mention don't
seem logical, given this assumption.

In that case, I am afraid I'm out of good advice. Re-checking the disks
would make sense, IMHO - copy a couple of data files around without RAID
in the way. However, that's about all I can say.

Also, if you only noticed the corruption after a reboot, I presume you
see the same thing with a plain mount, cp, umount, fsck cycle?
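
To make the copy test repeatable, something like the following Python
sketch could be used. It copies a file and compares SHA-256 digests of
source and copy. The function names are made up for illustration, and
this is only one way to do the comparison, not a substitute for fsck.

```python
import hashlib
import shutil
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MB chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst and report whether both files hash identically."""
    shutil.copyfile(src, dst)
    return sha256_of(src) == sha256_of(dst)


# Only runs when called as: python script.py SOURCE DESTINATION
if __name__ == "__main__" and len(sys.argv) == 3:
    ok = copy_and_verify(sys.argv[1], sys.argv[2])
    print("match" if ok else "MISMATCH")
    sys.exit(0 if ok else 1)
```

A digest taken right after the copy may be served from the buffer cache
rather than from disk, so it is more telling to umount and mount the
filesystem again and then run sha256_of on the copy a second time.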

                Joachim
