Linux-Misc Digest #7, Volume #25 Sat, 1 Jul 00 06:13:02 EDT
Contents:
RAID synchronization problem (Max TenEyck Woodbury)
----------------------------------------------------------------------------
From: Max TenEyck Woodbury <[EMAIL PROTECTED]>
Subject: RAID synchronization problem
Date: Sat, 01 Jul 2000 06:14:37 -0400
It's late and my locution may suffer as a result. Please
excuse me if I am using terms in a non-standard manner.
I recently set up a 3-drive software RAID 5 set on
an APX 164(SX?). The machine is running the Red Hat 6.2
Alpha distribution right out of the box.
I added the following hardware:
NCR53C8xx SCSI controller - LVD
External LVD drive mounting box with eight 5 1/4" bays
3' LVD external interconnect cable
LVD terminator (indicator shows LVD mode is working).
3 Quantum Atlas 10K 36 GB drives
LVD swap trays and frames for each drive (NOT hot swap)
Extra fans.
I partitioned each drive to 35000 cylinders and built the following raidtab:
# /etc/raidtab - RAID configuration table
#
#
raiddev /dev/md0
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
persistent-superblock 1
parity-algorithm left-symmetric
chunk-size 128
device /dev/sdc1
raid-disk 0
device /dev/sdd1
raid-disk 1
device /dev/sde1
raid-disk 2
I did a 'mkraid /dev/md0' with no problems.
Similarly a 'mke2fs /dev/md0 -b 4096 -R stride=128'
took its time and generated a working file system.
Sync completed about 5 AM that morning.
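A note on the stride value: as far as I understand the ext2 tools, stride is meant to be the number of filesystem blocks per RAID chunk, not the chunk size itself. The sketch below works that out for the raidtab above, assuming chunk-size is given in KB (as raidtab does) and the block size in bytes (as 'mke2fs -b' does). The function name is made up for illustration.

```python
def ext2_stride(chunk_size_kib: int, block_size_bytes: int) -> int:
    """Return the number of filesystem blocks that fit in one RAID chunk."""
    chunk_bytes = chunk_size_kib * 1024
    if chunk_bytes % block_size_bytes != 0:
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_bytes // block_size_bytes


# Values from the raidtab (chunk-size 128) and the mke2fs call (-b 4096).
print(ext2_stride(chunk_size_kib=128, block_size_bytes=4096))  # prints 32
```

If that reading is right, stride=32 would have matched this array better than stride=128. A mismatched stride should only affect how the filesystem lays out its metadata, so it is unlikely to be related to the resync failures.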
I started having problems after an unplanned reboot
(i.e., after the UPS got knocked offline). Since then,
I've not been able to get all the drives resynced.
I even replaced the drive that seemed to be the source
of the problem.
The relevant /var/log/messages sections are:
...
Jun 30 20:28:33 oscar kernel: md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12
Jun 30 20:28:33 oscar kernel: linear personality registered
Jun 30 20:28:33 oscar kernel: raid0 personality registered
Jun 30 20:28:33 oscar kernel: raid1 personality registered
Jun 30 20:28:33 oscar kernel: raid5 personality registered
Jun 30 20:28:33 oscar kernel: raid5: measuring checksumming speed
Jun 30 20:28:33 oscar kernel: 8regs : 552.000 MB/sec
Jun 30 20:28:33 oscar kernel: 32regs : 608.000 MB/sec
Jun 30 20:28:33 oscar kernel: using fastest function: 32regs (608.000 MB/sec)
Jun 30 20:28:33 oscar kernel: (scsi0) <Adaptec AHA-294X SCSI host adapter> found at PCI 0/6/0
Jun 30 20:28:33 oscar kernel: (scsi0) Wide Channel, SCSI ID=7, 16/255 SCBs
Jun 30 20:28:33 oscar kernel: (scsi0) Downloading sequencer code... 416 instructions downloaded
Jun 30 20:28:33 oscar kernel: sym53c8xx: at PCI bus 0, device 9, function 0
Jun 30 20:28:33 oscar kernel: sym53c8xx: 53c895 detected with Symbios NVRAM
Jun 30 20:28:33 oscar kernel: sym53c895-0: rev=0x02, base=0xa002000, io_port=0x9000, irq=19
Jun 30 20:28:33 oscar kernel: sym53c895-0: Symbios format NVRAM, ID 7, Fast-40, Parity Checking
Jun 30 20:28:33 oscar kernel: sym53c895-0: initial SCNTL3/DMODE/DCNTL/CTEST3/4/5 = (hex) 07/8e/a0/00/00/24
Jun 30 20:28:33 oscar kernel: sym53c895-0: final SCNTL3/DMODE/DCNTL/CTEST3/4/5 = (hex) 07/4e/80/00/08/24
Jun 30 20:28:33 oscar kernel: sym53c895-0: on-chip RAM at 0xa003000
Jun 30 20:28:33 oscar kernel: sym53c895-0: resetting, command processing suspended for 2 seconds
Jun 30 20:25:23 oscar rc.sysinit: Mounting proc filesystem succeeded
...
Jun 30 20:25:23 oscar fsck: /dev/sda2: clean, 83815/328704 files, 1574557/2622464 blocks
Jun 30 20:25:23 oscar rc.sysinit: Checking root filesystem succeeded
Jun 30 20:25:23 oscar rc.sysinit: Remounting root filesystem in read-write mode succeeded
Jun 30 20:25:25 oscar rc.sysinit: Finding module dependencies succeeded
Jun 30 20:25:25 oscar fsck: /dev/sda5: clean, 6856/761024 files, 66322/1519356 blocks
Jun 30 20:25:25 oscar fsck: /dev/sdb2: clean, 2172/204800 files, 114775/819100 blocks
Jun 30 20:25:25 oscar fsck: /dev/md0 contains a file system with errors, check forced.
Jun 30 20:25:25 oscar fsck: Logical sector size is zero.
Jun 30 20:25:25 oscar fsck: dosfsck 2.2, 06 Jul 1999, FAT32, LFN
Jun 30 20:27:53 oscar fsck: /dev/md0: 5168/8962048 files (0.1% non-contiguous), 374307/17919936 blocks
Jun 30 20:27:53 oscar fsck: Logical sector size is zero.
Jun 30 20:27:53 oscar fsck: dosfsck 2.2, 06 Jul 1999, FAT32, LFN
Jun 30 20:27:53 oscar fsck: /dev/sdb5: clean, 44664/263160 files, 870899/1052226 blocks
Jun 30 20:27:54 oscar fsck: /dev/sdb6: clean, 7383/505856 files, 744338/2016126 blocks
Jun 30 20:27:54 oscar rc.sysinit: Checking filesystems succeeded
Jun 30 20:28:09 oscar rc.sysinit: Mounting local filesystems succeeded
Jun 30 20:28:09 oscar rc.sysinit: Turning on user and group quotas for local filesystems succeeded
Jun 30 20:28:10 oscar rc.sysinit: Enabling swap space succeeded
...
Jun 30 20:28:33 oscar kernel: sym53c895-0: restart (scsi reset).
Jun 30 20:28:34 oscar kernel: sym53c895-0: enabling clock multiplier
Jun 30 20:28:34 oscar kernel: sym53c895-0: Downloading SCSI SCRIPTS.
Jun 30 20:28:34 oscar kernel: ncr53c8xx: at PCI bus 0, device 9, function 0
Jun 30 20:28:34 oscar kernel: ncr53c8xx: IO region 0x9000 to 0x907f is in use
Jun 30 20:28:34 oscar kernel: DC390: 0 adapters found
Jun 30 20:28:35 oscar kernel: scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4
Jun 30 20:28:35 oscar kernel: <Adaptec AHA-294X SCSI host adapter>
Jun 30 20:28:35 oscar kernel: scsi1 : sym53c8xx - version 1.3g
Jun 30 20:28:35 oscar kernel: scsi : 2 hosts.
Jun 30 20:28:35 oscar kernel: sym53c895-0: command processing resumed
Jun 30 20:28:35 oscar kernel: (scsi0:0:0:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Jun 30 20:28:35 oscar kernel: Vendor: QUANTUM Model: ATLAS_V__9_WLS Rev: 0200
Jun 30 20:28:35 oscar kernel: Type: Direct-Access ANSI SCSI revision: 03
Jun 30 20:28:35 oscar kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
Jun 30 20:28:35 oscar kernel: (scsi0:0:1:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Jun 30 20:28:35 oscar kernel: Vendor: SEAGATE Model: ST15230W Rev: 0638
Jun 30 20:28:35 oscar kernel: Type: Direct-Access ANSI SCSI revision: 02
Jun 30 20:28:35 oscar kernel: Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
Jun 30 20:28:35 oscar kernel: (scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Jun 30 20:28:35 oscar kernel: Vendor: NEC Model: CD-ROM DRIVE:462 Rev: 1.15
Jun 30 20:28:35 oscar kernel: Type: CD-ROM ANSI SCSI revision: 02
Jun 30 20:28:35 oscar kernel: Detected scsi CD-ROM sr0 at scsi0, channel 0, id 3, lun 0
Jun 30 20:28:35 oscar kernel: (scsi0:0:6:0) Synchronous at 10.0 Mbyte/sec, offset 8.
Jun 30 20:28:35 oscar kernel: Vendor: PLEXTOR Model: CD-R PX-W4220T Rev: 1.01
Jun 30 20:28:35 oscar kernel: Type: CD-ROM ANSI SCSI revision: 02
Jun 30 20:28:35 oscar kernel: Detected scsi CD-ROM sr1 at scsi0, channel 0, id 6, lun 0
Jun 30 20:28:35 oscar kernel: Vendor: QUANTUM Model: ATLAS 10K 36WLS Rev: UCP0
Jun 30 20:28:35 oscar kernel: Type: Direct-Access ANSI SCSI revision: 03
Jun 30 20:28:35 oscar kernel: Detected scsi disk sdc at scsi1, channel 0, id 0, lun 0
Jun 30 20:28:35 oscar kernel: Vendor: QUANTUM Model: ATLAS 10K 36WLS Rev: UCP0
Jun 30 20:28:35 oscar kernel: Type: Direct-Access ANSI SCSI revision: 03
Jun 30 20:28:35 oscar kernel: Detected scsi disk sdd at scsi1, channel 0, id 1, lun 0
Jun 30 20:28:35 oscar kernel: Vendor: QUANTUM Model: ATLAS 10K 36WLS Rev: UCP0
Jun 30 20:28:35 oscar kernel: Type: Direct-Access ANSI SCSI revision: 03
Jun 30 20:28:35 oscar kernel: Detected scsi disk sde at scsi1, channel 0, id 2, lun 0
Jun 30 20:28:35 oscar kernel: sym53c895-0-<0,0>: tagged command queue depth set to 8
Jun 30 20:28:35 oscar kernel: sym53c895-0-<1,0>: tagged command queue depth set to 8
Jun 30 20:28:35 oscar kernel: sym53c895-0-<2,0>: tagged command queue depth set to 8
Jun 30 20:28:35 oscar kernel: scsi : detected 2 SCSI cdroms 5 SCSI disks total.
Jun 30 20:28:35 oscar kernel: Uniform CDROM driver Revision: 2.56
Jun 30 20:28:35 oscar kernel: sr1: scsi3-mmc drive: 20x/20x writer cd/rw xa/form2 cdda tray
Jun 30 20:28:35 oscar kernel: SCSI device sda: hdwr sector= 512 bytes. Sectors= 17930694 [8755 MB] [8.8 GB]
Jun 30 20:28:35 oscar kernel: SCSI device sdb: hdwr sector= 512 bytes. Sectors= 8386733 [4095 MB] [4.1 GB]
Jun 30 20:28:35 oscar kernel: sym53c895-0-<0,*>: WIDE SCSI (16 bit) enabled.
Jun 30 20:28:35 oscar kernel: sym53c895-0-<0,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, offset 31)
Jun 30 20:28:35 oscar kernel: SCSI device sdc: hdwr sector= 512 bytes. Sectors= 71755944 [35037 MB] [35.0 GB]
Jun 30 20:28:35 oscar kernel: sym53c895-0-<1,*>: WIDE SCSI (16 bit) enabled.
Jun 30 20:28:35 oscar kernel: sym53c895-0-<1,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, offset 31)
Jun 30 20:28:35 oscar kernel: SCSI device sdd: hdwr sector= 512 bytes. Sectors= 71755944 [35037 MB] [35.0 GB]
Jun 30 20:28:35 oscar kernel: sym53c895-0-<2,*>: WIDE SCSI (16 bit) enabled.
Jun 30 20:28:35 oscar kernel: sym53c895-0-<2,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, offset 31)
Jun 30 20:28:35 oscar kernel: SCSI device sde: hdwr sector= 512 bytes. Sectors= 71755944 [35037 MB] [35.0 GB]
Jun 30 20:28:35 oscar kernel: Partition check:
Jun 30 20:28:35 oscar kernel: sda: sda1 sda2 sda3 sda4 < sda5 >
Jun 30 20:28:35 oscar kernel: sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 >
Jun 30 20:28:35 oscar kernel: sdc: unknown partition table
<This drive had just been replaced>
Jun 30 20:28:35 oscar kernel: sdd: sdd1
Jun 30 20:28:35 oscar kernel: sde: sde1
Jun 30 20:28:35 oscar kernel: md.c: sizeof(mdp_super_t) = 4104
Jun 30 20:28:35 oscar kernel: autodetecting RAID arrays
Jun 30 20:28:35 oscar kernel: (read) sdd1's sb offset: 35839872 [events: 00000008]
Jun 30 20:28:35 oscar kernel: (read) sde1's sb offset: 35839872 [events: 00000008]
Jun 30 20:28:35 oscar kernel: autorun ...
Jun 30 20:28:36 oscar kernel: considering sde1 ...
Jun 30 20:28:36 oscar kernel: adding sde1 ...
Jun 30 20:28:36 oscar kernel: adding sdd1 ...
Jun 30 20:28:36 oscar kernel: created md0
Jun 30 20:28:36 oscar kernel: bind<sdd1,1>
Jun 30 20:28:36 oscar kernel: bind<sde1,2>
Jun 30 20:28:36 oscar kernel: running: <sde1><sdd1>
Jun 30 20:28:36 oscar kernel: now!
Jun 30 20:28:36 oscar kernel: sde1's event counter: 00000008
Jun 30 20:28:36 oscar kernel: sdd1's event counter: 00000008
Jun 30 20:28:36 oscar kernel: md0: removing former faulty sdc1!
Jun 30 20:28:36 oscar kernel: md0: max total readahead window set to 1024k
Jun 30 20:28:36 oscar kernel: md0: 2 data-disks, max readahead per data-disk: 512k
Jun 30 20:28:36 oscar kernel: raid5: device sde1 operational as raid disk 2
Jun 30 20:28:36 oscar kernel: raid5: device sdd1 operational as raid disk 1
Jun 30 20:28:36 oscar kernel: raid5: md0, not all disks are operational -- trying to recover array
Jun 30 20:28:36 oscar kernel: raid5: allocated 6379kB for md0
Jun 30 20:28:36 oscar kernel: raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
Jun 30 20:28:36 oscar kernel: RAID5 conf printout:
Jun 30 20:28:36 oscar kernel: --- rd:3 wd:2 fd:1
Jun 30 20:28:36 oscar kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1
Jun 30 20:28:36 oscar kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1
Jun 30 20:28:36 oscar kernel: disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: RAID5 conf printout:
Jun 30 20:28:36 oscar kernel: --- rd:3 wd:2 fd:1
Jun 30 20:28:36 oscar kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1
Jun 30 20:28:36 oscar kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1
Jun 30 20:28:36 oscar kernel: disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:28:36 oscar kernel: md: updating md0 RAID superblock on device
Jun 30 20:28:36 oscar kernel: sde1 [events: 00000009](write) sde1's sb offset: 35839872
Jun 30 20:28:36 oscar kernel: md: recovery thread got woken up ...
Jun 30 20:28:36 oscar kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Jun 30 20:28:36 oscar kernel: md: recovery thread finished ...
Jun 30 20:28:36 oscar kernel: sdd1 [events: 00000009](write) sdd1's sb offset: 35839872
Jun 30 20:28:36 oscar kernel: .
Jun 30 20:28:36 oscar kernel: ... autorun DONE.
Jun 30 20:28:36 oscar kernel: VFS: Mounted root (ext2 filesystem) readonly.
Jun 30 20:28:36 oscar kernel: Freeing unused kernel memory: 200k freed
Jun 30 20:28:36 oscar kernel: Adding Swap: 263152k swap-space (priority -1)
Jun 30 20:28:36 oscar kernel: Adding Swap: 104400k swap-space (priority -2)
...
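As a sanity check on the array geometry, the block count fsck reports for /dev/md0 can be reproduced from the superblock offset in the log. The sketch below assumes that the "sb offset" figure is in 1 KB units and also equals the usable size of each component, which is my reading of the md 0.90 messages rather than something the log states. The function name is made up for illustration.

```python
def raid5_fs_blocks(component_kib: int, n_disks: int, fs_block_bytes: int) -> int:
    """Filesystem blocks available on a RAID 5 set of n_disks equal components."""
    # One component's worth of space is consumed by parity.
    usable_kib = component_kib * (n_disks - 1)
    return usable_kib * 1024 // fs_block_bytes


# sb offset 35839872 from the log, 3 disks from the raidtab, 4096-byte blocks from mke2fs.
print(raid5_fs_blocks(35839872, n_disks=3, fs_block_bytes=4096))  # prints 17919936
```

The result matches the 17919936 blocks fsck shows for /dev/md0, so the array size itself looks consistent with two data disks plus parity.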
Build the partition table on the new drive, set the type and check it.
...
Jun 30 20:31:20 oscar kernel: SCSI device sdc: hdwr sector= 512 bytes. Sectors= 71755944 [35037 MB] [35.0 GB]
Jun 30 20:31:20 oscar kernel: sdc: sdc1
Jun 30 20:31:22 oscar kernel: SCSI device sdc: hdwr sector= 512 bytes. Sectors= 71755944 [35037 MB] [35.0 GB]
Jun 30 20:31:22 oscar kernel: sdc: sdc1
...
Try to add it back into the RAID
...
Jun 30 20:32:00 oscar kernel: trying to hot-add sdc1 to md0 ...
Jun 30 20:32:00 oscar kernel: bind<sdc1,3>
Jun 30 20:32:00 oscar kernel: RAID5 conf printout:
Jun 30 20:32:00 oscar kernel: --- rd:3 wd:2 fd:1
Jun 30 20:32:00 oscar kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1
Jun 30 20:32:00 oscar kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1
Jun 30 20:32:00 oscar kernel: disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: RAID5 conf printout:
Jun 30 20:32:00 oscar kernel: --- rd:3 wd:2 fd:1
Jun 30 20:32:00 oscar kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1
Jun 30 20:32:00 oscar kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1
Jun 30 20:32:00 oscar kernel: disk 3, s:1, o:0, n:3 rd:3 us:1 dev:sdc1
Jun 30 20:32:00 oscar kernel: disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: md: updating md0 RAID superblock on device
Jun 30 20:32:00 oscar kernel: sdc1 [events: 0000000a](write) sdc1's sb offset: 35839872
Jun 30 20:32:00 oscar kernel: sde1 [events: 0000000a](write) sde1's sb offset: 35839872
Jun 30 20:32:00 oscar kernel: sdd1 [events: 0000000a](write) sdd1's sb offset: 35839872
Jun 30 20:32:00 oscar kernel: .
Jun 30 20:32:00 oscar kernel: md: recovery thread got woken up ...
Jun 30 20:32:00 oscar kernel: md0: resyncing spare disk sdc1 to replace failed disk
Jun 30 20:32:00 oscar kernel: RAID5 conf printout:
Jun 30 20:32:00 oscar kernel: --- rd:3 wd:2 fd:1
Jun 30 20:32:00 oscar kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1
Jun 30 20:32:00 oscar kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1
Jun 30 20:32:00 oscar kernel: disk 3, s:1, o:0, n:3 rd:3 us:1 dev:sdc1
Jun 30 20:32:00 oscar kernel: disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: RAID5 conf printout:
Jun 30 20:32:00 oscar kernel: --- rd:3 wd:2 fd:1
Jun 30 20:32:00 oscar kernel: disk 0, s:0, o:0, n:0 rd:0 us:1 dev:[dev 00:00]
Jun 30 20:32:00 oscar kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdd1
Jun 30 20:32:01 oscar kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sde1
Jun 30 20:32:01 oscar kernel: disk 3, s:1, o:1, n:3 rd:3 us:1 dev:sdc1
Jun 30 20:32:01 oscar kernel: disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:01 oscar kernel: disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:01 oscar kernel: disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:01 oscar kernel: disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:01 oscar kernel: disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:01 oscar kernel: disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:01 oscar kernel: disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:01 oscar kernel: disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
Jun 30 20:32:01 oscar kernel: md: syncing RAID array md0
Jun 30 20:32:01 oscar kernel: md: minimum _guaranteed_ reconstruction speed: 100 KB/sec.
Jun 30 20:32:01 oscar kernel: md: using maximum available idle IO bandwith for reconstruction.
Jun 30 20:32:01 oscar kernel: md: using 1024k window.
Jun 30 20:32:01 oscar kernel: md: updating md0 RAID superblock on device
Jun 30 20:32:01 oscar kernel: sdc1 [events: 0000000b](write) sdc1's sb offset: 35839872
Jun 30 20:32:01 oscar kernel: sde1 [events: 0000000b](write) sde1's sb offset: 35839872
Jun 30 20:32:01 oscar kernel: sdd1 [events: 0000000b](write) sdd1's sb offset: 35839872
Jun 30 20:32:01 oscar kernel: .
...
And things look OK until
...
Jun 30 20:32:17 oscar kernel: scsi1 channel 0 : resetting for second half of retries.
Jun 30 20:32:17 oscar kernel: SCSI bus is being reset for host 1 channel 0.
Jun 30 20:32:17 oscar kernel: sym53c8xx_reset: pid=46767 reset_flags=1 serial_number=0 serial_number_at_timeout=0
Jun 30 20:32:17 oscar kernel: sym53c895-0: resetting, command processing suspended for 2 seconds
Jun 30 20:32:17 oscar kernel: scsi1: device driver called scsi_done() for a syncronous reset.
Jun 30 20:32:17 oscar kernel: sym53c895-0: restart (scsi reset).
Jun 30 20:32:17 oscar kernel: sym53c895-0: enabling clock multiplier
Jun 30 20:32:17 oscar kernel: sym53c895-0: Downloading SCSI SCRIPTS.
Jun 30 20:32:17 oscar kernel: sym53c895-0: command processing resumed
Jun 30 20:32:17 oscar kernel: sym53c895-0-<0,*>: WIDE SCSI (16 bit) enabled.
Jun 30 20:32:17 oscar kernel: sym53c895-0-<0,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, offset 31)
Jun 30 20:32:17 oscar kernel: SCSI disk error : host 1 channel 0 id 0 lun 0 return code = 18000002
Jun 30 20:32:17 oscar kernel: Info fld=0x20920, Current sd08:21: sense key Aborted Command
Jun 30 20:32:17 oscar kernel: Additional sense indicates Scsi parity error
Jun 30 20:32:17 oscar kernel: scsidisk I/O error: dev 08:21, sector 133376
Jun 30 20:32:17 oscar kernel: interrupting MD-thread pid 6
Jun 30 20:32:17 oscar kernel: raid5: Disk failure on spare sdc1
Jun 30 20:32:17 oscar kernel: <SPARE FAILED!>
Jun 30 20:32:17 oscar kernel: <6>md0: spare disk sdc1 failed, skipping to next spare.
Jun 30 20:32:17 oscar kernel: md: updating md0 RAID superblock on device
Jun 30 20:32:17 oscar kernel: (skipping faulty sdc1 )
Jun 30 20:32:17 oscar kernel: sde1 [events: 0000000c](write) sde1's sb offset: 35839872
Jun 30 20:32:17 oscar kernel: sym53c895-0-<2,*>: WIDE SCSI (16 bit) enabled.
Jun 30 20:32:17 oscar kernel: sym53c895-0-<2,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, offset 31)
Jun 30 20:32:17 oscar kernel: sdd1 [events: 0000000c](write) sdd1's sb offset: 35839872
Jun 30 20:32:17 oscar kernel: sym53c895-0-<1,*>: WIDE SCSI (16 bit) enabled.
Jun 30 20:32:17 oscar kernel: sym53c895-0-<1,*>: FAST-40 WIDE SCSI 80.0 MB/s (25 ns, offset 31)
Jun 30 20:32:17 oscar kernel: .
Jun 30 20:32:17 oscar kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Jun 30 20:32:17 oscar kernel: md: recovery thread finished ...
Jun 30 20:32:17 oscar kernel: mdrecoveryd(6) flushing signals.
Jun 30 20:32:17 oscar kernel: md: recovery thread got woken up ...
Jun 30 20:32:17 oscar kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
Jun 30 20:32:17 oscar kernel: md: recovery thread finished ...
...
ARRRG!!!!!!!!
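For anyone reading the error lines, "dev 08:21" is a major:minor pair in hex. The sketch below decodes it, assuming the usual SCSI disk numbering on kernels of this vintage (major 8, sixteen minors per disk, minor 0 of each group being the whole disk). The function name is made up for illustration.

```python
def decode_sd_dev(dev: str) -> str:
    """Map a hex 'major:minor' string such as '08:21' to an sdXN device name."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != 8:
        raise ValueError("only SCSI disk major 8 is handled in this sketch")
    # Each disk owns 16 consecutive minors; the remainder is the partition number.
    disk, part = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk)
    return name if part == 0 else f"{name}{part}"


print(decode_sd_dev("08:21"))  # prints sdc1
```

So the I/O error is on sdc1, which agrees with the "host 1 channel 0 id 0 lun 0" line and with the drive the md driver marks as a failed spare.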
Additional observations:
It's not always the same sector, but it's usually in the same range
on two different disks!
I tried an Atlas V (7200 RPM) drive as well and it hung the SCSI bus
after innumerable resets.
I tried zeroing the drive (dd of=/dev/sdc if=/dev/zero) after
removing it from the array and then re-adding it; same thing.
Sometimes it resets and continues, but eventually it fails before
the sync completes.
I also changed swap trays at least once.
I could use some help on this...
[EMAIL PROTECTED]
------------------------------
End of Linux-Misc Digest
******************************