Linux-Misc Digest #7, Volume #25                  Sat, 1 Jul 00 06:13:02 EDT

Contents:
  RAID synchronization problem (Max TenEyck Woodbury)

----------------------------------------------------------------------------

From: Max TenEyck Woodbury <[EMAIL PROTECTED]>
Subject: RAID synchronization problem
Date: Sat, 01 Jul 2000 06:14:37 -0400

It's late and my locution may suffer as a result. Please
excuse me if I am using terms in a non-standard manner.

I recently set up a 3-drive software RAID 5 set on
an APX 164(SX?). The machine is running the Red Hat 6.2
Alpha distribution right out of the box.

I added the following hardware:

NCR53C8xx SCSI controller - LVD
External LVD drive mounting box with 8 5 1/4" bays
3' LVD external interconnect cable
LVD terminator (indicator shows LVD mode is working).
3 Quantum Atlas 10K 36 GB drives
LVD swap trays and frames for each drive (NOT hot swap)
Extra fans.

I partitioned each drive to 35000 cylinders and built the following raidtab:

# /etc/raidtab - RAID configuration table
#
#
raiddev /dev/md0
    raid-level                  5
    nr-raid-disks               3
    nr-spare-disks              0
    persistent-superblock       1
    parity-algorithm            left-symmetric
    chunk-size                  128

    device                      /dev/sdc1
    raid-disk                   0
    device                      /dev/sdd1
    raid-disk                   1
    device                      /dev/sde1
    raid-disk                   2
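
As a sanity check on this layout: a RAID 5 set of n members holds
n-1 members' worth of data, with one member's worth going to parity.
A minimal Python sketch of that arithmetic follows. It uses the
per-member figure the md driver prints in the log further down
(sb offset 35839872, taken here to be in 1 KB units, which is an
assumption) and the 4096-byte block size from the mke2fs command.
The function name is made up for illustration.

```python
def raid5_usable_kb(member_kb: int, n_disks: int) -> int:
    """Usable size of a RAID 5 set: one member's worth is parity."""
    if n_disks < 2:
        raise ValueError("need at least two members")
    return member_kb * (n_disks - 1)


# Per-member size from the md log lines (assumed to be 1 KB units).
member_kb = 35839872
usable_kb = raid5_usable_kb(member_kb, 3)

# Express the result in 4096-byte filesystem blocks.
fs_blocks = usable_kb // 4
print(usable_kb, fs_blocks)
```

Under these assumptions the block count comes out to 17919936, which
agrees with the block total fsck reports for /dev/md0 in the log.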

I did a 'mkraid /dev/md0' with no problems.
Similarly, a 'mke2fs /dev/md0 -b 4096 -R stride=128'
took its time and generated a working file system.
The initial sync completed about 5 AM that morning.
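
A note on the stride value: as far as I can tell from the
Software-RAID HOWTO, the ext2 stride is counted in filesystem blocks
per RAID chunk, not in KB. With chunk-size 128 (KB) and 4096-byte
blocks that would give 32 rather than 128. A mismatched stride should
only affect performance, so it is probably unrelated to the sync
failures described below. A small sketch of the arithmetic (the
function name is hypothetical):

```python
def ext2_stride(chunk_kb: int, block_bytes: int) -> int:
    """Filesystem blocks per RAID chunk, for mke2fs -R stride=N."""
    chunk_bytes = chunk_kb * 1024
    if chunk_bytes % block_bytes:
        raise ValueError("chunk size is not a multiple of the block size")
    return chunk_bytes // block_bytes


# chunk-size 128 from the raidtab, -b 4096 from the mke2fs command
print(ext2_stride(128, 4096))
```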

I started having problems after an unplanned reboot
(i.e., after the UPS got knocked off line). Since then,
I've not been able to get all the drives resynced.
I even replaced the drive that seemed to be the source
of the problem.
The relevant /var/log/messages sections are:

...
Jun 30 20:28:33 oscar kernel: md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12 
Jun 30 20:28:33 oscar kernel: linear personality registered 
Jun 30 20:28:33 oscar kernel: raid0 personality registered 
Jun 30 20:28:33 oscar kernel: raid1 personality registered 
Jun 30 20:28:33 oscar kernel: raid5 personality registered 
Jun 30 20:28:33 oscar kernel: raid5: measuring checksumming speed 
Jun 30 20:28:33 oscar kernel:    8regs     :   552.000 MB/sec 
Jun 30 20:28:33 oscar kernel:    32regs    :   608.000 MB/sec 
Jun 30 20:28:33 oscar kernel: using fastest function: 32regs (608.000 MB/sec) 
Jun 30 20:28:33 oscar kernel: (scsi0) <Adaptec AHA-294X SCSI host adapter> found at 
PCI 0/6/0 
Jun 30 20:28:33 oscar kernel: (scsi0) Wide Channel, SCSI ID=7, 16/255 SCBs 
Jun 30 20:28:33 oscar kernel: (scsi0) Downloading sequencer code... 416 instructions 
downloaded 
Jun 30 20:28:33 oscar kernel: sym53c8xx: at PCI bus 0, device 9, function 0 
Jun 30 20:28:33 oscar kernel: sym53c8xx: 53c895 detected with Symbios NVRAM 
Jun 30 20:28:33 oscar kernel: sym53c895-0: rev=0x02, base=0xa002000, io_port=0x9000, 
irq=19 
Jun 30 20:28:33 oscar kernel: sym53c895-0: Symbios format NVRAM, ID 7, Fast-40, Parity 
Checking 
Jun 30 20:28:33 oscar kernel: sym53c895-0: initial SCNTL3/DMODE/DCNTL/CTEST3/4/5 = 
(hex) 07/8e/a0/00/00/24 
Jun 30 20:28:33 oscar kernel: sym53c895-0: final   SCNTL3/DMODE/DCNTL/CTEST3/4/5 = 
(hex) 07/4e/80/00/08/24 
Jun 30 20:28:33 oscar kernel: sym53c895-0: on-chip RAM at 0xa003000 
Jun 30 20:28:33 oscar kernel: sym53c895-0: resetting, command processing suspended for 
2 seconds 
Jun 30 20:25:23 oscar rc.sysinit: Mounting proc filesystem succeeded 
...
Jun 30 20:25:23 oscar fsck: /dev/sda2: clean, 83815/328704 files, 1574557/2622464 
blocks 
Jun 30 20:25:23 oscar rc.sysinit: Checking root filesystem succeeded 
Jun 30 20:25:23 oscar rc.sysinit: Remounting root filesystem in read-write mode 
succeeded 
Jun 30 20:25:25 oscar rc.sysinit: Finding module dependencies succeeded 
Jun 30 20:25:25 oscar fsck: /dev/sda5: clean, 6856/761024 files, 66322/1519356 blocks 
Jun 30 20:25:25 oscar fsck: /dev/sdb2: clean, 2172/204800 files, 114775/819100 blocks 
Jun 30 20:25:25 oscar fsck: /dev/md0 contains a file system with errors, check forced. 
Jun 30 20:25:25 oscar fsck: Logical sector size is zero. 
Jun 30 20:25:25 oscar fsck: dosfsck 2.2, 06 Jul 1999, FAT32, LFN 
Jun 30 20:27:53 oscar fsck: /dev/md0: 5168/8962048 files (0.1% non-contiguous), 
374307/17919936 blocks 
Jun 30 20:27:53 oscar fsck: Logical sector size is zero. 
Jun 30 20:27:53 oscar fsck: dosfsck 2.2, 06 Jul 1999, FAT32, LFN 
Jun 30 20:27:53 oscar fsck: /dev/sdb5: clean, 44664/263160 files, 870899/1052226 
blocks 
Jun 30 20:27:54 oscar fsck: /dev/sdb6: clean, 7383/505856 files, 744338/2016126 blocks 
Jun 30 20:27:54 oscar rc.sysinit: Checking filesystems succeeded 
Jun 30 20:28:09 oscar rc.sysinit: Mounting local filesystems succeeded 
Jun 30 20:28:09 oscar rc.sysinit: Turning on user and group quotas for local 
filesystems succeeded 
Jun 30 20:28:10 oscar rc.sysinit: Enabling swap space succeeded 
...
Jun 30 20:28:33 oscar kernel: sym53c895-0: restart (scsi reset). 
Jun 30 20:28:34 oscar kernel: sym53c895-0: enabling clock multiplier 
Jun 30 20:28:34 oscar kernel: sym53c895-0: Downloading SCSI SCRIPTS. 
Jun 30 20:28:34 oscar kernel: ncr53c8xx: at PCI bus 0, device 9, function 0 
Jun 30 20:28:34 oscar kernel: ncr53c8xx: IO region 0x9000 to 0x907f is in use 
Jun 30 20:28:34 oscar kernel: DC390: 0 adapters found 
Jun 30 20:28:35 oscar kernel: scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast 
SCSI) 5.1.28/3.2.4 
Jun 30 20:28:35 oscar kernel:        <Adaptec AHA-294X SCSI host adapter> 
Jun 30 20:28:35 oscar kernel: scsi1 : sym53c8xx - version 1.3g 
Jun 30 20:28:35 oscar kernel: scsi : 2 hosts. 
Jun 30 20:28:35 oscar kernel: sym53c895-0: command processing resumed 
Jun 30 20:28:35 oscar kernel: (scsi0:0:0:0) Synchronous at 10.0 Mbyte/sec, offset 15. 
Jun 30 20:28:35 oscar kernel:   Vendor: QUANTUM   Model: ATLAS_V__9_WLS    Rev: 0200 
Jun 30 20:28:35 oscar kernel:   Type:   Direct-Access                      ANSI SCSI 
revision: 03 
Jun 30 20:28:35 oscar kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0 
Jun 30 20:28:35 oscar kernel: (scsi0:0:1:0) Synchronous at 10.0 Mbyte/sec, offset 15. 
Jun 30 20:28:35 oscar kernel:   Vendor: SEAGATE   Model: ST15230W          Rev: 0638 
Jun 30 20:28:35 oscar kernel:   Type:   Direct-Access                      ANSI SCSI 
revision: 02 
Jun 30 20:28:35 oscar kernel: Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0 
Jun 30 20:28:35 oscar kernel: (scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15. 
Jun 30 20:28:35 oscar kernel:   Vendor: NEC       Model: CD-ROM DRIVE:462  Rev: 1.15 
Jun 30 20:28:35 oscar kernel:   Type:   CD-ROM                             ANSI SCSI 
revision: 02 
Jun 30 20:28:35 oscar kernel: Detected scsi CD-ROM sr0 at scsi0, channel 0, id 3, lun 
0 
Jun 30 20:28:35 oscar kernel: (scsi0:0:6:0) Synchronous at 10.0 Mbyte/sec, offset 8. 
Jun 30 20:28:35 oscar kernel:   Vendor: PLEXTOR   Model: CD-R   PX-W4220T  Rev: 1.01 
Jun 30 20:28:35 oscar kernel:   Type:   CD-ROM                             ANSI SCSI 
revision: 02 
Jun 30 20:28:35 oscar kernel: Detected scsi CD-ROM sr1 at scsi0, channel 0, id 6, lun 
0 
Jun 30 20:28:35 oscar kernel:   Vendor: QUANTUM   Model: ATLAS 10K 36WLS   Rev: UCP0 
Jun 30 20:28:35 oscar kernel:   Type:   Direct-Access                      ANSI SCSI 
revision: 03 
Jun 30 20:28:35 oscar kernel: Detected scsi disk sdc at scsi1, channel 0, id 0, lun 0 
Jun 30 20:28:35 oscar kernel:   Vendor: QUANTUM   Model: ATLAS 10K 36WLS   Rev: UCP0 
Jun 30 20:28:35 oscar kernel:   Type:   Direct-Access                      ANSI SCSI 
revision: 03 
Jun 30 20:28:35 oscar kernel: Detected scsi disk sdd at scsi1, channel 0, id 1, lun 0 
Jun 30 20:28:35 oscar kernel:   Vendor: QUANTUM   Model: ATLAS 10K 36WLS   Rev: UCP0 
Jun 30 20:28:35 oscar kernel:   Type:   Direct-Access                      ANSI SCSI 
revision: 03 
Jun 30 20:28:35 oscar kernel: Detected scsi disk sde at scsi1, channel 0, id 2, lun 0 
Jun 30 20:28:35 oscar kernel: sym53c895-0-<0,0>: tagged command queue depth set to 8 
Jun 30 20:28:35 oscar kernel: sym53c895-0-<1,0>: tagged command queue depth set to 8 
Jun 30 20:28:35 oscar kernel: sym53c895-0-<2,0>: tagged command queue depth set to 8 
Jun 30 20:28:35 oscar kernel: scsi : detected 2 SCSI cdroms 5 SCSI disks total. 
Jun 30 20:28:35 oscar kernel: Uniform CDROM driver Revision: 2.56 
Jun 30 20:28:35 oscar kernel: sr1: scsi3-mmc drive: 20x/20x writer cd/rw xa/form2 cdda 
tray 
Jun 30 20:28:35 oscar kernel: SCSI device sda: hdwr sector= 512 bytes. Sectors= 
17930694 [8755 MB] [8.8 GB] 
Jun 30 20:28:35 oscar kernel: SCSI device sdb: hdwr sector= 512 bytes. Sectors= 
8386733 [4095 MB] [4.1 GB] 
Jun 30 20:28:35 oscar kernel: sym53c895-0-<0,*>: WIDE SCSI (16 bit) enabled. 
Jun 30 20:28:35 oscar kernel: sym53c895-0-<0,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, 
offset 31) 
Jun 30 20:28:35 oscar kernel: SCSI device sdc: hdwr sector= 512 bytes. Sectors= 
71755944 [35037 MB] [35.0 GB] 
Jun 30 20:28:35 oscar kernel: sym53c895-0-<1,*>: WIDE SCSI (16 bit) enabled. 
Jun 30 20:28:35 oscar kernel: sym53c895-0-<1,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, 
offset 31) 
Jun 30 20:28:35 oscar kernel: SCSI device sdd: hdwr sector= 512 bytes. Sectors= 
71755944 [35037 MB] [35.0 GB] 
Jun 30 20:28:35 oscar kernel: sym53c895-0-<2,*>: WIDE SCSI (16 bit) enabled. 
Jun 30 20:28:35 oscar kernel: sym53c895-0-<2,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, 
offset 31) 
Jun 30 20:28:35 oscar kernel: SCSI device sde: hdwr sector= 512 bytes. Sectors= 
71755944 [35037 MB] [35.0 GB] 
Jun 30 20:28:35 oscar kernel: Partition check: 
Jun 30 20:28:35 oscar kernel:  sda: sda1 sda2 sda3 sda4 < sda5 > 
Jun 30 20:28:35 oscar kernel:  sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 > 
Jun 30 20:28:35 oscar kernel:  sdc: unknown partition table 
   <This drive had just been replaced>
Jun 30 20:28:35 oscar kernel:  sdd: sdd1 
Jun 30 20:28:35 oscar kernel:  sde: sde1 
Jun 30 20:28:35 oscar kernel: md.c: sizeof(mdp_super_t) = 4104 
Jun 30 20:28:35 oscar kernel: autodetecting RAID arrays 
Jun 30 20:28:35 oscar kernel: (read) sdd1's sb offset: 35839872 [events: 00000008] 
Jun 30 20:28:35 oscar kernel: (read) sde1's sb offset: 35839872 [events: 00000008] 
Jun 30 20:28:35 oscar kernel: autorun ... 
Jun 30 20:28:36 oscar kernel: considering sde1 ... 
Jun 30 20:28:36 oscar kernel:   adding sde1 ... 
Jun 30 20:28:36 oscar kernel:   adding sdd1 ... 
Jun 30 20:28:36 oscar kernel: created md0 
Jun 30 20:28:36 oscar kernel: bind<sdd1,1> 
Jun 30 20:28:36 oscar kernel: bind<sde1,2> 
Jun 30 20:28:36 oscar kernel: running: <sde1><sdd1> 
Jun 30 20:28:36 oscar kernel: now! 
Jun 30 20:28:36 oscar kernel: sde1's event counter: 00000008 
Jun 30 20:28:36 oscar kernel: sdd1's event counter: 00000008 
Jun 30 20:28:36 oscar kernel: md0: removing former faulty sdc1! 
Jun 30 20:28:36 oscar kernel: md0: max total readahead window set to 1024k 
Jun 30 20:28:36 oscar kernel: md0: 2 data-disks, max readahead per data-disk: 512k 
Jun 30 20:28:36 oscar kernel: raid5: device sde1 operational as raid disk 2 
Jun 30 20:28:36 oscar kernel: raid5: device sdd1 operational as raid disk 1 
Jun 30 20:28:36 oscar kernel: raid5: md0, not all disks are operational -- trying to 
recover array 
Jun 30 20:28:36 oscar kernel: raid5: allocated 6379kB for md0 
Jun 30 20:28:36 oscar kernel: raid5: raid level 5 set md0 active with 2 out of 3 
devices, algorithm 2 
Jun 30 20:28:36 oscar kernel: RAID5 conf printout: 
Jun 30 20:28:36 oscar kernel:  --- rd:3 wd:2 fd:1 
Jun 30 20:28:36 oscar kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1 
Jun 30 20:28:36 oscar kernel:  disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1 
Jun 30 20:28:36 oscar kernel:  disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel: RAID5 conf printout: 
Jun 30 20:28:36 oscar kernel:  --- rd:3 wd:2 fd:1 
Jun 30 20:28:36 oscar kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1 
Jun 30 20:28:36 oscar kernel:  disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1 
Jun 30 20:28:36 oscar kernel:  disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel:  disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:28:36 oscar kernel: md: updating md0 RAID superblock on device 
Jun 30 20:28:36 oscar kernel: sde1 [events: 00000009](write) sde1's sb offset: 
35839872 
Jun 30 20:28:36 oscar kernel: md: recovery thread got woken up ... 
Jun 30 20:28:36 oscar kernel: md0: no spare disk to reconstruct array! -- continuing 
in degraded mode 
Jun 30 20:28:36 oscar kernel: md: recovery thread finished ... 
Jun 30 20:28:36 oscar kernel: sdd1 [events: 00000009](write) sdd1's sb offset: 
35839872 
Jun 30 20:28:36 oscar kernel: . 
Jun 30 20:28:36 oscar kernel: ... autorun DONE. 
Jun 30 20:28:36 oscar kernel: VFS: Mounted root (ext2 filesystem) readonly. 
Jun 30 20:28:36 oscar kernel: Freeing unused kernel memory: 200k freed 
Jun 30 20:28:36 oscar kernel: Adding Swap: 263152k swap-space (priority -1) 
Jun 30 20:28:36 oscar kernel: Adding Swap: 104400k swap-space (priority -2) 
...
   Build the partition table on the new drive, set the type and check it.
...
Jun 30 20:31:20 oscar kernel: SCSI device sdc: hdwr sector= 512 bytes. Sectors= 
71755944 [35037 MB] [35.0 GB] 
Jun 30 20:31:20 oscar kernel:  sdc: sdc1 
Jun 30 20:31:22 oscar kernel: SCSI device sdc: hdwr sector= 512 bytes. Sectors= 
71755944 [35037 MB] [35.0 GB] 
Jun 30 20:31:22 oscar kernel:  sdc: sdc1 
...
   Try to add it back into the RAID
...
Jun 30 20:32:00 oscar kernel: trying to hot-add sdc1 to md0 ...  
Jun 30 20:32:00 oscar kernel: bind<sdc1,3> 
Jun 30 20:32:00 oscar kernel: RAID5 conf printout: 
Jun 30 20:32:00 oscar kernel:  --- rd:3 wd:2 fd:1 
Jun 30 20:32:00 oscar kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1 
Jun 30 20:32:00 oscar kernel:  disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1 
Jun 30 20:32:00 oscar kernel:  disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel: RAID5 conf printout: 
Jun 30 20:32:00 oscar kernel:  --- rd:3 wd:2 fd:1 
Jun 30 20:32:00 oscar kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1 
Jun 30 20:32:00 oscar kernel:  disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1 
Jun 30 20:32:00 oscar kernel:  disk 3, s:1, o:0, n:3 rd:3 us:1 dev:sdc1 
Jun 30 20:32:00 oscar kernel:  disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel: md: updating md0 RAID superblock on device 
Jun 30 20:32:00 oscar kernel: sdc1 [events: 0000000a](write) sdc1's sb offset: 
35839872 
Jun 30 20:32:00 oscar kernel: sde1 [events: 0000000a](write) sde1's sb offset: 
35839872 
Jun 30 20:32:00 oscar kernel: sdd1 [events: 0000000a](write) sdd1's sb offset: 
35839872 
Jun 30 20:32:00 oscar kernel: . 
Jun 30 20:32:00 oscar kernel: md: recovery thread got woken up ... 
Jun 30 20:32:00 oscar kernel: md0: resyncing spare disk sdc1 to replace failed disk 
Jun 30 20:32:00 oscar kernel: RAID5 conf printout: 
Jun 30 20:32:00 oscar kernel:  --- rd:3 wd:2 fd:1 
Jun 30 20:32:00 oscar kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1 
Jun 30 20:32:00 oscar kernel:  disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1 
Jun 30 20:32:00 oscar kernel:  disk 3, s:1, o:0, n:3 rd:3 us:1 dev:sdc1 
Jun 30 20:32:00 oscar kernel:  disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel: RAID5 conf printout: 
Jun 30 20:32:00 oscar kernel:  --- rd:3 wd:2 fd:1 
Jun 30 20:32:00 oscar kernel:  disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00] 
Jun 30 20:32:00 oscar kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1 
Jun 30 20:32:01 oscar kernel:  disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1 
Jun 30 20:32:01 oscar kernel:  disk 3, s:1, o:1, n:3 rd:3 us:1 dev:sdc1 
Jun 30 20:32:01 oscar kernel:  disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:01 oscar kernel:  disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:01 oscar kernel:  disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:01 oscar kernel:  disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:01 oscar kernel:  disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:01 oscar kernel:  disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:01 oscar kernel:  disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:01 oscar kernel:  disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00] 
Jun 30 20:32:01 oscar kernel: md: syncing RAID array md0 
Jun 30 20:32:01 oscar kernel: md: minimum _guaranteed_ reconstruction speed: 100 
KB/sec. 
Jun 30 20:32:01 oscar kernel: md: using maximum available idle IO bandwith for 
reconstruction. 
Jun 30 20:32:01 oscar kernel: md: using 1024k window. 
Jun 30 20:32:01 oscar kernel: md: updating md0 RAID superblock on device 
Jun 30 20:32:01 oscar kernel: sdc1 [events: 0000000b](write) sdc1's sb offset: 
35839872 
Jun 30 20:32:01 oscar kernel: sde1 [events: 0000000b](write) sde1's sb offset: 
35839872 
Jun 30 20:32:01 oscar kernel: sdd1 [events: 0000000b](write) sdd1's sb offset: 
35839872 
Jun 30 20:32:01 oscar kernel: . 
...
   And things look OK until
...
Jun 30 20:32:17 oscar kernel: scsi1 channel 0 : resetting for second half of retries.
Jun 30 20:32:17 oscar kernel: SCSI bus is being reset for host 1 channel 0.
Jun 30 20:32:17 oscar kernel: sym53c8xx_reset: pid=46767 reset_flags=1 serial_number=0 serial_number_at_timeout=0
Jun 30 20:32:17 oscar kernel: sym53c895-0: resetting, command processing suspended for 2 seconds
Jun 30 20:32:17 oscar kernel: scsi1: device driver called scsi_done() for a syncronous reset.
Jun 30 20:32:17 oscar kernel: sym53c895-0: restart (scsi reset).
Jun 30 20:32:17 oscar kernel: sym53c895-0: enabling clock multiplier
Jun 30 20:32:17 oscar kernel: sym53c895-0: Downloading SCSI SCRIPTS.
Jun 30 20:32:17 oscar kernel: sym53c895-0: command processing resumed
Jun 30 20:32:17 oscar kernel: sym53c895-0-<0,*>: WIDE SCSI (16 bit) enabled.
Jun 30 20:32:17 oscar kernel: sym53c895-0-<0,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, offset 31)
Jun 30 20:32:17 oscar kernel: SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 18000002
Jun 30 20:32:17 oscar kernel: Info fld=0x20920, Current sd08:21: sense key Aborted Command
Jun 30 20:32:17 oscar kernel: Additional sense indicates Scsi parity error
Jun 30 20:32:17 oscar kernel: scsidisk I/O error: dev 08:21, sector 133376
Jun 30 20:32:17 oscar kernel: interrupting MD-thread pid 6
Jun 30 20:32:17 oscar kernel: raid5: Disk failure on spare sdc1
Jun 30 20:32:17 oscar kernel:  <SPARE FAILED!>
Jun 30 20:32:17 oscar kernel:  <6>md0: spare disk sdc1 failed, skipping to next spare.
Jun 30 20:32:17 oscar kernel: md: updating md0 RAID superblock on device
Jun 30 20:32:17 oscar kernel: (skipping faulty sdc1 )
Jun 30 20:32:17 oscar kernel: sde1 [events: 0000000c](write) sde1's sb offset: 35839872
Jun 30 20:32:17 oscar kernel: sym53c895-0-<2,*>: WIDE SCSI (16 bit) enabled.
Jun 30 20:32:17 oscar kernel: sym53c895-0-<2,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, offset 31)
Jun 30 20:32:17 oscar kernel: sdd1 [events: 0000000c](write) sdd1's sb offset: 35839872
Jun 30 20:32:17 oscar kernel: sym53c895-0-<1,*>: WIDE SCSI (16 bit) enabled.
Jun 30 20:32:17 oscar kernel: sym53c895-0-<1,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, offset 31)
Jun 30 20:32:17 oscar kernel: .
Jun 30 20:32:17 oscar kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Jun 30 20:32:17 oscar kernel: md: recovery thread finished ...
Jun 30 20:32:17 oscar kernel: mdrecoveryd(6) flushing signals.
Jun 30 20:32:17 oscar kernel: md: recovery thread got woken up ...
Jun 30 20:32:17 oscar kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Jun 30 20:32:17 oscar kernel: md: recovery thread finished ...
...
   ARRRG!!!!!!!!
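
For anyone reading the error lines above, here is a rough Python
sketch that decodes two of the raw values: the "dev 08:21" pair and
the "return code = 18000002" word. It assumes the usual Linux
conventions (SCSI disks on major 8 with 16 minors per disk, and a
result word laid out as driver/host/message/status bytes from high
to low). Those layouts are assumptions stated from memory, and both
function names are invented for illustration.

```python
def decode_dev(dev: str) -> str:
    """Map a hex 'major:minor' pair such as '08:21' to an sd name.

    Assumes SCSI disk major 8 with 16 minors per disk.
    """
    major, minor = (int(x, 16) for x in dev.split(":"))
    if major != 8:
        raise ValueError("only SCSI disk major 8 is handled here")
    disk, part = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk)
    return name + (str(part) if part else "")


def split_result(code: int) -> dict:
    """Split a 32-bit SCSI result word into its four bytes."""
    return {
        "status": code & 0xFF,
        "msg": (code >> 8) & 0xFF,
        "host": (code >> 16) & 0xFF,
        "driver": (code >> 24) & 0xFF,
    }


print(decode_dev("08:21"))
print(split_result(0x18000002))
```

If those assumptions hold, 08:21 is sdc1, which agrees with the raid5
failure message. Status byte 0x02 would be CHECK CONDITION and the
host byte is zero, so the parity error appears to have been reported
by the drive through its sense data rather than by the host adapter.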

Additional observations -
  It's not always the same sector, but it's usually in the same range
    on two different disks!
  I tried an Atlas V (7200 RPM) drive as well and it hung the SCSI bus
    after innumerable resets.
  I tried zeroing the drive (dd of=/dev/sdc if=/dev/zero) after
    removing it from the array, then re-added it - same thing.
  Sometimes it resets and continues, but eventually it fails before
    syncing.
  I also changed swap trays at least once.
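
To check the "same range" observation less by eye, a short Python
sketch like the following could collect the failing sectors per
device from the log. It relies only on the "scsidisk I/O error"
line format shown above; the function name and the summary layout
are my own choices, not anything from a standard tool.

```python
import re
from collections import defaultdict

# Matches lines like: "scsidisk I/O error: dev 08:21, sector 133376"
ERR = re.compile(
    r"scsidisk I/O error: dev ([0-9a-fA-F]{2}:[0-9a-fA-F]{2}), sector (\d+)"
)


def failing_sectors(lines):
    """Return {dev: (lowest sector, highest sector, count)}."""
    by_dev = defaultdict(list)
    for line in lines:
        m = ERR.search(line)
        if m:
            by_dev[m.group(1)].append(int(m.group(2)))
    return {dev: (min(s), max(s), len(s)) for dev, s in by_dev.items()}


sample = [
    "Jun 30 20:32:17 oscar kernel: scsidisk I/O error: dev 08:21, sector 133376",
]
print(failing_sectors(sample))
```

Passing an open file instead of the sample list, for example
failing_sectors(open("/var/log/messages")), should work the same
way, since each message sits on a single line in the real log.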

I could use some help on this...

[EMAIL PROTECTED]

------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and comp.os.linux.misc) via:

    Internet: [EMAIL PROTECTED]

Linux may be obtained via one of these FTP sites:
    ftp.funet.fi                                pub/Linux
    tsx-11.mit.edu                              pub/linux
    sunsite.unc.edu                             pub/Linux

End of Linux-Misc Digest
******************************
