Re: Problems with raid1 -- failed disk is not removed from md

Daniel Seiler Tue, 17 Aug 1999 04:23:33 -0700
Hi again,

I ran further experiments with the configuration described below. I added 
another SCSI controller (Adaptec 3940 UW) and put one SCSI disk on the 
second controller. After pulling the power from one disk, the system 
fails in the same way as it does with a single controller: SCSI bus resets,
the cp command hangs, and it is not possible to reboot the system.
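
For anyone repeating this test, one way to see whether md ever marks the
pulled disk as failed is to compare the device counts in /proc/mdstat before
and after the failure. The following is only a minimal sketch: it assumes the
single-line array layout shown in the quoted /proc/mdstat output further down,
and it assumes that [n/m] means n configured and m working disks. The names
MD_LINE, degraded_arrays and SAMPLE are made up for this illustration.

```python
import re

# Matches the single-line layout quoted in this thread, e.g.
# "md0 : active raid1 sdb1[0] sda1[1] 2048192 blocks [2/2] [UU]"
# Assumption: [n/m] is configured/working disks, "_" marks a missing disk.
MD_LINE = re.compile(
    r"^(md\d+) : active (\S+) (.*?) (\d+) blocks \[(\d+)/(\d+)\] \[([U_]+)\]"
)


def degraded_arrays(mdstat_text):
    """Return {array name: status string} for arrays that are not fully up."""
    result = {}
    for line in mdstat_text.splitlines():
        match = MD_LINE.match(line.strip())
        if not match:
            continue  # header lines such as "Personalities : [raid1]"
        name, _level, _members, _blocks, total, working, status = match.groups()
        if int(working) < int(total) or "_" in status:
            result[name] = status
    return result


# Sample text copied from the quoted message; on a live system the text
# would be read from /proc/mdstat instead.
SAMPLE = """\
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 sdb1[0] sda1[1] 2048192 blocks [2/2] [UU]
md1 : active raid1 sdb3[0] sda3[1] 144512 blocks [2/2] [UU]
md2 : active raid1 sdb5[0] sda5[1] 1028032 blocks [2/2] [UU]
md3 : active raid1 sdb6[0] sda6[1] 979840 blocks [2/2] [UU]
unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

If the arrays still show [2/2] [UU] after the disk has lost power, md has not
removed the failed disk, which is the behavior this thread is about.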

Perhaps somebody can tell me about a raid1 hardware configuration that is 
proven to work, so I can change my hardware to something more functional. 

Thanks again,
                daniel


On Mon, 9 Aug 1999, Daniel Seiler wrote:

> Dear friends,
> 
> while evaluating the Linux RAID code for our site I had some trouble with a 
> failing disk and raid1 devices. I used kernel 2.2.10 with the
> raid0145-19990724-2.2.10.gz patch applied and raidtools-19990724-0.90.tar.gz
> on a Red Hat 6.0 system.
> 
> My setup, described through the output of some commands:
> 
> --------------------------8<-----------------8<---------------------------
> 
> [root@test /root]# fdisk /dev/sda
> 
> Command (m for help): p
> 
> Disk /dev/sda: 255 heads, 63 sectors, 527 cylinders       
> Units = cylinders of 16065 * 512 bytes
> 
>    Device Boot    Start       End    Blocks   Id  System
> /dev/sda1             1       255   2048256   fd  Unknown
> /dev/sda2           256       259     32130   83  Linux
> /dev/sda3           260       277    144585   fd  Unknown
> /dev/sda4           278       527   2008125    5  Extended
> /dev/sda5           278       405   1028128+  fd  Unknown
> /dev/sda6           406       527    979933+  fd  Unknown
> 
> Command (m for help): q
> 
> [root@test /root]# fdisk /dev/sdb
> 
> Command (m for help): p
> 
> Disk /dev/sdb: 255 heads, 63 sectors, 527 cylinders
> Units = cylinders of 16065 * 512 bytes
> 
>    Device Boot    Start       End    Blocks   Id  System
> /dev/sdb1             1       255   2048256   fd  Unknown
> /dev/sdb2           256       259     32130   83  Linux
> /dev/sdb3           260       277    144585   fd  Unknown
> /dev/sdb4           278       527   2008125    5  Extended
> /dev/sdb5           278       405   1028128+  fd  Unknown
> /dev/sdb6           406       527    979933+  fd  Unknown
> 
> Command (m for help): q
> 
> [root@test /root]# cat /proc/mdstat
> Personalities : [raid1]
> read_ahead 1024 sectors
> md0 : active raid1 sdb1[0] sda1[1] 2048192 blocks [2/2] [UU]
> md1 : active raid1 sdb3[0] sda3[1] 144512 blocks [2/2] [UU]
> md2 : active raid1 sdb5[0] sda5[1] 1028032 blocks [2/2] [UU]
> md3 : active raid1 sdb6[0] sda6[1] 979840 blocks [2/2] [UU]
> unused devices: <none>
> [root@test /root]#  mount
> /dev/md0 on / type ext2 (rw)
> none on /proc type proc (rw)
> /dev/sdb2 on /boot type ext2 (rw)
> /dev/sda2 on /bootb type ext2 (rw)
> /dev/md3 on /home type ext2 (rw)
> /dev/md2 on /var type ext2 (rw)
> none on /dev/pts type devpts (rw,mode=0622)
> [root@test /root]#
> 
> --------------------------8<-----------------8<---------------------------
> 
> On startup, the kernel says something like this:
> 
> --------------------------8<-----------------8<---------------------------
> 
> (scsi0) <Adaptec AHA-294X Ultra SCSI host adapter> found at PCI 10/0
> (scsi0) Wide Channel, SCSI ID=7, 16/255 SCBs
> (scsi0) Downloading sequencer code... 413 instructions downloaded
> scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.17/3.2.4
>        <Adaptec AHA-294X Ultra SCSI host adapter>
> scsi : 1 host.
> (scsi0:0:0:0) Synchronous at 40.0 Mbyte/sec, offset 8.
>   Vendor: IBM       Model: DCAS-34330W       Rev: S61A
>   Type:   Direct-Access                      ANSI SCSI revision: 02
> Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
> (scsi0:0:1:0) Synchronous at 40.0 Mbyte/sec, offset 8.
>   Vendor: IBM       Model: DCAS-34330W       Rev: S61A
>   Type:   Direct-Access                      ANSI SCSI revision: 02
> Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
> (scsi0:0:6:0) Synchronous at 10.0 Mbyte/sec, offset 8.
>   Vendor: TOSHIBA   Model: CD-ROM XM-5701TA  Rev: 3136
>   Type:   CD-ROM                             ANSI SCSI revision: 02
> Detected scsi CD-ROM sr0 at scsi0, channel 0, id 6, lun 0
> scsi : detected 1 SCSI cdrom 2 SCSI disks total.
> 
> --------------------------8<-----------------8<---------------------------
> No problem so far, the mirrors seem to run.
> Now I start a "cp -a /usr/lib /home/" or something similar, so the disks 
> are busy, and then I pull the power from one of the disks (not the one 
> terminating the bus), because I would like to test whether my system will
> survive a disk failure. Then I get the following errors (many of them, and
> they don't stop):
> -------------------8<---------------8<----------------------------------
> SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 26030000
> scsidisk I/O error: dev 08:16, sector 1654798
> md: recovery thread got woken up ...
> md2: no spare disk to reconstruct array! -- continuing in degraded mode
> md3: no spare disk to reconstruct array! -- continuing in degraded mode
> md: recovery thread finished ...
> scsi0 channel 0 : resetting for second half of retries.
> SCSI bus is being reset for host 0 channel 0.
> (scsi0:0:0:0) Synchronous at 40.0 Mbyte/sec, offset 8.
> -------------------8<---------------8<----------------------------------
> 
> The shell with the cp-command hangs after that, and from other consoles I
> can do "dmesg", but I can't do "ps aux".
> 
> Perhaps someone on this list can tell me what I am doing wrong. Is there some
> conceptual error in my setup? Are there known bugs in the mirror code 
> which would cause the behavior seen above?
> 
> Thanks a lot in advance,
>                         Daniel Seiler
> 
> 
>
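
The two error lines in the quoted log ("return code = 26030000" and
"dev 08:16") can be decoded by hand. The sketch below shows how. It rests on
assumptions not stated in the thread: that the kernel prints both values in
hexadecimal, that the return code is laid out as driver, host, message and
status bytes (high to low), that the constant values match the kernel's SCSI
headers, and that SCSI disks use major 8 with 16 minors per disk. The function
names are made up for this illustration.

```python
# Assumed layout of the SCSI result word, high byte to low byte:
# driver_byte | host_byte | msg_byte | status_byte
HOST_NAMES = {0x00: "DID_OK", 0x01: "DID_NO_CONNECT", 0x03: "DID_TIME_OUT"}
DRIVER_NAMES = {0x00: "DRIVER_OK", 0x06: "DRIVER_TIMEOUT"}
SUGGEST_NAMES = {0x00: "", 0x10: "SUGGEST_RETRY", 0x20: "SUGGEST_ABORT"}


def decode_return_code(hex_text):
    """Split a hexadecimal SCSI return code into its four bytes."""
    code = int(hex_text, 16)
    driver = (code >> 24) & 0xFF
    host = (code >> 16) & 0xFF
    return {
        "driver": DRIVER_NAMES.get(driver & 0x0F, hex(driver & 0x0F)),
        "suggest": SUGGEST_NAMES.get(driver & 0xF0, hex(driver & 0xF0)),
        "host": HOST_NAMES.get(host, hex(host)),
        "msg": (code >> 8) & 0xFF,
        "status": code & 0xFF,
    }


def decode_scsi_dev(dev_text):
    """Turn a hex 'major:minor' pair into a SCSI disk partition name."""
    major_hex, minor_hex = dev_text.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != 8:
        raise ValueError("only major 8 (first 16 SCSI disks) is handled here")
    disk_index, partition = divmod(minor, 16)  # assumed: 16 minors per disk
    name = "sd" + chr(ord("a") + disk_index)
    return name + (str(partition) if partition else "")


print(decode_return_code("26030000"))
print(decode_scsi_dev("08:16"))  # prints "sdb6"
```

Under these assumptions the return code is a timeout (host byte DID_TIME_OUT,
driver byte DRIVER_TIMEOUT with SUGGEST_ABORT), and the failing device is
sdb6, the md3 member behind /home where the cp command was writing. That is
consistent with "id 1" in the error line, since sdb was detected at id 1.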