Trouble replacing drive in array / hot standby
Hi, I have a few questions about an array I have here, running on a Red Hat 6.0 distribution with a 2.2.5-22 kernel and raidtools 0.90. The array has 4 SCSI disks, one of which has failed:

# cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 sda1[0](F) sdd1[3] sdc1[2] sde1[1] 26627328 blocks level 5, 128k chunk, algorithm 2 [4/3] [_UUU]
unused devices: <none>

I tried to replace sda1 (the drive with SCSI ID 0) with another physical drive I have outside the machine (which was once part of this RAID array). The drive that went in had ID 0 too. I played around trying to get the array working, but for some reason it would not work (sorry, I do not have any output from this time). I suspect that the old RAID information still on that drive may have caused a problem, since putting the faulty drive back in, time inconsistencies and all, allowed the array to be started up again.

I have a spare disk in this machine which has been added to the array with raidhotadd. I wanted this spare disk to be added automatically as a hot-spare drive, but I have been unable to get this working (the relevant lines are now commented out in the /etc/raidtab file below).

Can anyone give me some insight into what is going on here? Should I format the partition on the drive that did not work so that the superblock and other metadata are no longer present? Should I seriously consider compiling a 2.2.14 kernel with the latest raidtools patch? Should I pull my last hair out of my head?

Kind regards, Stuart.
# raidstart --version
raidstart v0.3d compiled for md raidtools-0.90

# cat /etc/raidtab
raiddev /dev/md0
        raid-level              5
        nr-raid-disks           4
        chunk-size              128
        persistent-superblock   1
        parity-algorithm        left-symmetric
        # Spare disks for hot reconstruction
        #nr-spare-disks         1
        device                  /dev/sda1
        raid-disk               0
        device                  /dev/sde1
        raid-disk               1
        device                  /dev/sdc1
        raid-disk               2
        device                  /dev/sdd1
        raid-disk               3
        #device                 /dev/sdb1
        #spare-disk             0

# cat /var/log/dmesg
Swansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0 for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
Initializing RT netlink socket
Starting kswapd v 1.5
Detected PS/2 Mouse Port.
Serial driver version 4.27 with MANY_PORTS MULTIPORT SHARE_IRQ enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
pty: 256 Unix98 ptys configured
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.9)
Real Time Clock Driver v1.09
RAM disk driver initialized: 16 RAM disks of 4096K size
PIIX: IDE controller on PCI bus 00 dev 38
PIIX: not 100% native mode: will probe irqs later
PIIX: neither IDE port enabled (BIOS)
PIIX: IDE controller on PCI bus 00 dev 39
PIIX: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xe800-0xe807, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xe808-0xe80f, BIOS settings: hdc:pio, hdd:pio
hda: ST31720A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ST31720A, 1626MB w/0kB Cache, CHS=826/64/63
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12
raid5: measuring checksumming speed
8regs  : 169.545 MB/sec
32regs : 149.352 MB/sec
using fastest function: 8regs (169.545 MB/sec)
scsi : 0 hosts.
scsi : detected total.
md.c: sizeof(mdp_super_t) = 4096
Partition check:
 hda: hda1 hda2 hda3
RAMDISK: Compressed image found at block 0
autodetecting RAID arrays
autorun ...
... autorun DONE.
VFS: Mounted root (ext2 filesystem).
(scsi0) Adaptec AHA-294X SCSI host adapter found at PCI 10/0
(scsi0) Narrow Channel, SCSI ID=7, 16/255 SCBs
(scsi0) Downloading sequencer code... 406 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.16/3.2.4
        Adaptec AHA-294X SCSI host adapter
scsi : 1 host.
(scsi0:0:0:0) Synchronous at 10.0 Mbyte/sec, offset 15.
  Vendor: SEAGATE   Model: ST19171N   Rev: 0023
  Type:   Direct-Access              ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
(scsi0:0:1:0) Synchronous at 10.0 Mbyte/sec, offset 15.
  Vendor: SEAGATE   Model: ST19171N   Rev: 0024
  Type:   Direct-Access              ANSI SCSI revision: 02
Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
(scsi0:0:2:0) Synchronous at 10.0 Mbyte/sec, offset 15.
  Vendor: SEAGATE   Model: ST19171N   Rev: 0023
  Type:   Direct-Access              ANSI SCSI revision: 02
Detected scsi disk sdc at scsi0, channel 0, id 2, lun 0
(scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
  Vendor: SEAGATE   Model: ST39140N   Rev: 1498
  Type:   Direct-Access              ANSI SCSI revision: 02
Detected scsi disk sdd at scsi0, channel 0, id 3, lun 0
(scsi0:0:4:0) Synchronous at
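If the goal is to have sdb1 picked up automatically as a hot spare, the raidtab stanza would presumably look something like this (an untested sketch based on the raidtab directives already shown above; note that spare-disk indices count separately from raid-disk indices):

```
raiddev /dev/md0
        raid-level              5
        nr-raid-disks           4
        nr-spare-disks          1
        chunk-size              128
        persistent-superblock   1
        parity-algorithm        left-symmetric
        device                  /dev/sda1
        raid-disk               0
        device                  /dev/sde1
        raid-disk               1
        device                  /dev/sdc1
        raid-disk               2
        device                  /dev/sdd1
        raid-disk               3
        device                  /dev/sdb1
        spare-disk              0
```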
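On the question of wiping the old drive: the 0.90 persistent superblock is stored near the end of the partition, so zeroing the tail of the partition (or the whole partition) before re-adding it should keep stale metadata from confusing autodetection. A toy sketch of the idea, run against a scratch file instead of a real /dev/sdX (the marker string and sizes are invented; a real run against a device is destructive):

```shell
# Simulate a 4 MiB "partition" in a plain file.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=4096 2>/dev/null
# Plant a fake stale-superblock marker near the end of the device.
printf 'OLD-MD-SB' | dd of="$img" bs=1 seek=$((4096*1024 - 9)) conv=notrunc 2>/dev/null
# Zero the final 128 KiB, which covers where the old metadata sits.
dd if=/dev/zero of="$img" bs=1024 seek=$((4096 - 128)) count=128 conv=notrunc 2>/dev/null
if tail -c 9 "$img" | grep -q 'OLD-MD-SB'; then wiped=no; else wiped=yes; fi
rm -f "$img"
echo "stale superblock wiped: $wiped"
```

The same dd-over-the-tail idea (or a full mke2fs/zero of the partition) applied to the real partition should leave nothing for the md autorun code to misinterpret.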
Raid1 - dangerous resync after power-failure?
I'm setting up a web server with RAID-1, using raidtools 0.90-5 and Linux kernel 2.2.12 (this is the Red Hat 6.1 distribution). I want to mirror all my data across two disks (hda and hdc). The problem I've noticed from testing is that if I shut off the power and then reboot, the RAID software will start re-syncing the mirrors, even though there was no write activity at all when the power went off and even though both halves of the mirror have the exact same event counter. The problem I see with this is as follows:

- Assume a power outage hits and wipes out some sectors on the hda disk, but leaves the superblock alone. I think this scenario is a fairly likely one.
- After the power outage, the system boots up and starts a resync, copying data from hda to hdc.
- The system tries to access the bad sectors on hda.

What would happen at this point? I assume the data would be lost, since hdc is undergoing a re-sync and the sectors on hda are already bad. Even though at boot time hdc contained good copies of these sectors, the RAID software started re-syncing onto hdc and lost that data. If the RAID code had just left hdc alone, it could have recovered these sectors.

I looked at the raidtools code, and it looks to me like what is happening is this: there is an SB_CLEAN flag in the superblock that is set to false when RAID is started on an md device, and it is only set back to true when a clean shutdown is performed. So if a power outage hits, this flag is always going to be false, since no clean shutdown happened. At boot time the md code then checks the SB_CLEAN flag, and if it is false a resync is performed.

It seems to me that a resync should only be required if the system was in the middle of a write where some data had been sent to one disk but not yet to the other. I think the event counter already performs this function, so I don't see why the SB_CLEAN flag is even needed. What do you think?
Could this SB_CLEAN flag be eliminated to reduce the risk of a resync damaging good data?
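The proposed rule can be sketched in a few lines of shell (the counter values here are made up; this is just the decision logic being argued for, not the actual md code):

```shell
# Two mirror halves after a power cut with no writes in flight:
# the clean flags are unset, but the event counters agree.
hda_events=42; hda_clean=0
hdc_events=42; hdc_clean=0

# Current behaviour as described: any unset clean flag forces a resync.
if [ "$hda_clean" -eq 0 ] || [ "$hdc_clean" -eq 0 ]; then
    clean_rule=resync
else
    clean_rule=skip
fi

# Proposed behaviour: matching event counters mean no resync is needed.
if [ "$hda_events" -ne "$hdc_events" ]; then
    event_rule=resync
else
    event_rule=skip
fi

echo "clean-flag rule: $clean_rule; event-counter rule: $event_rule"
```

With these inputs the clean-flag rule triggers a resync while the event-counter rule would leave hdc untouched, which is exactly the scenario the poster is worried about.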
raid-2.2.14-B1 reconstruction bug and problems
I've been using RAID 0.90 with the 2.0 kernel on a bunch of production boxes (RAID-5), and the disk failure handling and reconstruction have worked fine, both in tests and (once) in real life when a disk failed. I'm now trying 2.2.14 + raid-2.2.14-B1 (as shipped in the Red Hat 6.x kernel) and have come across both a problem with testing disk failure and an apparent bug in RAID error handling:

-- cut here --
SCSI disk error : host 0 channel 0 id 8 lun 0 return code = 2802
[valid=0] Info fld=0x0, Current sd08:61: sense key Not Ready
Additional sense indicates Logical unit not ready, initializing command required
scsidisk I/O error: dev 08:61, sector 265176
md: bug in file raid5.c, line 659
 * COMPLETE RAID STATE PRINTOUT *
-- cut here --

followed by a detailed dump of the RAID superblock information. After that, any commands (including raidhotremove/raidhotadd) which try to touch the RAID array hang in uninterruptible sleep, and so do any processes which were accessing the RAID filesystem at the time of the failure.

The above was triggered by my simulation of a disk failure, which I did by spinning the disk down with the SCSI_IOCTL_STOP_UNIT ioctl. That leads to the second problem: the reason I used that method of simulating a disk failure was that the old method:

echo "scsi remove-single-device 0 0 3 0" > /proc/scsi/scsi

has stopped working with kernel 2.2. strace shows that the write() returns with errno EBUSY. linux/drivers/scsi/scsi.c shows that this is because the access_count of the Scsi_Device structure is non-zero. Looking at the equivalent 2.0 source doesn't seem to show any semantic changes, and yet the same command under 2.0 works fine.

Please can anyone help? Otherwise this server is going to have to run without the added reliability of RAID-5, which would be disappointing. As an act of desperation I even wrote a little kernel module to change the access_count back to zero and then ran the "...remove-single-device..." command again.
This time, the device did get removed properly; RAID noticed the removal and went properly into degraded mode. Unfortunately, once again, all processes accessing the RAID filesystem, and then any raidhotadd/raidhotremove/umount commands, hung in uninterruptible state. Nothing in this mailing list or anywhere else I can find with web searches seems to have had this problem, so I'm at a loss what to do. Any help would be gratefully received. In case it matters, this is on an SMP system (2 CPUs) and the disks are all SCSI disks on a bus with an Adaptec 7899 adapter, using the aic7xxx driver 5.1.72/3.2.4.

In case anyone wants the kernel module to alter a SCSI device access_count, here it is:

-- cut here --
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/blk.h>
#include "/usr/src/linux/drivers/scsi/scsi.h"
#include "/usr/src/linux/drivers/scsi/hosts.h"

static int host = 0;
static int channel = 0;
static int id = 0;
static int lun = 0;
static int delta = 0;

MODULE_PARM(host, "i");
MODULE_PARM(channel, "i");
MODULE_PARM(id, "i");
MODULE_PARM(lun, "i");
MODULE_PARM(delta, "i");

int init_module(void)
{
    struct Scsi_Host *hba;
    Scsi_Device *scd;

    printk("scsiaccesscount starting\n");

    /* find the HBA with the requested host number */
    for (hba = scsi_hostlist; hba; hba = hba->next)
        if (hba->host_no == host)
            break;
    if (!hba)
        return -ENODEV;

    /* find the matching device on that host */
    for (scd = hba->host_queue; scd; scd = scd->next)
        if (scd->channel == channel && scd->id == id && scd->lun == lun)
            break;
    if (!scd)
        return -ENODEV;

    printk("access_count is %d\n", scd->access_count);
    if (delta) {
        scd->access_count += delta;
        printk("changed access_count to %d\n", scd->access_count);
    }

    /* fail the load deliberately so the module never stays resident */
    return -EIO;
}
-- cut here --

Use it as

insmod scsiaccesscount.o host=0 channel=0 id=3 lun=0

to show the access count for ID 3 on bus 0 channel 0, and

insmod scsiaccesscount.o host=0 channel=0 id=3 lun=0 delta=-1

to subtract one from the access_count. Obviously this is just for debugging and may not be safe to do at all (and indeed wasn't in my case).
--Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Unix Systems Programmer Oxford University Computing Services
What happened?
I have been configuring a RAID-5 system: a 3-disk array on a Promise MAX-II controller, with each disk on its own controller port. I had a boot disk on /dev/hda and a disk I was using for restores on /dev/hdb; the partitions on the latter disk were only mounted when needed. I shut down, removed /dev/hdb, and reconnected the CD-ROM as the slave on the first IDE channel. Now when I reboot, I get a "corrupt superblock" message suggesting that I try e2fsck -b 8193 to recover it. These messages are for my /dev/mdX RAID drives. When I go into maintenance mode, I can mount and e2fsck all my /dev/mdX drives and they check clean; I can mount the file systems and they all work, and mdstat indicates all is well. Anyone have any ideas what happened and how to fix it? Thanks, Doug Egan
Re: RAID5 array not coming up after repaired disk
On Sat, 25 Mar 2000 13:10:13 GMT, you wrote:
>On Fri, 24 Mar 2000 19:36:18 -0500, you wrote:
>>Ok, maybe I'm on crack and need to lay off the pipe a little while,
>>but it appears that sdf7 doesn't have a partition type of "fd" and as
>>such isn't getting considered for inclusion in md0.
>
>Nope, all partitions /dev/sd{a,b,c,d,e,f}7 have type fd.

After moving sdf7 to the top in /etc/raidtab, the array came up in degraded mode and I was able to raidhotadd the new disk. I feel that the RAID should have recovered from this failure without requiring manual intervention. Or maybe I did something wrong?

Greetings
Marc

-- 
!! No courtesy copies, please !!
Marc Haber         | "Questions are the            | Mailadresse im Header
Karlsruhe, Germany | Beginning of Wisdom"          | Fon: *49 721 966 32 15
Nordisch by Nature | Lt. Worf, TNG "Rightful Heir" | Fax: *49 721 966 31 29
Status of Raid-0.9
Hello, sorry for taking a minute of your valuable time, but since all other attempts to get information have failed, I hope that someone from this list will be able to answer 3 quick questions for me:

1) What is the status of the RAID development? From the archives on ftp.kernel.org and the mailing list archives, it appears that all traffic concerning the development stopped at the end of Aug 99. Is that true? Why did the development get abandoned? Are there major bugs in the code?

2) Can the distributed raidtools 0.90 and the raid0145-19992408-2.2.11 patch be considered stable for a RAID-1 application on a 2.2.x kernel?

3) Are there still efforts to include a 'new' RAID implementation in the new standard kernels (e.g. 2.5)?

Thank you again for taking the time to answer these questions! Yours, Nikolaus.
Re: superblock or the partition table is corrupt?
h. Looks like I forgot this step. I have a RAID-1 setup under Red Hat 6.1's stock 2.2.12 kernel with raidtools already installed. It works fine; a disk failed last weekend and I was able to recover that disk while the array continued to function. My question is this: is it too late to run mke2fs on /dev/md0 now that /dev/md0 contains data? Also, I don't see fd as an option in fdisk. Thanks.

On Sat, 25 Mar 2000, m. allan noah wrote:
>so, you need to run mke2fs on /dev/md0, rather than the individual
>partitions, then you should be fine. allan
Re: superblock or the partition table is corrupt?
At 12:25 PM 3/27/2000, root wrote:
>h. Looks like I forgot this step. I have a RAID-1 setup under Red Hat
>6.1's stock 2.2.12 kernel with raidtools already installed. It works
>fine; a disk failed last weekend and I was able to recover that disk
>while the array continued to function. My question is this: is it too
>late to run mke2fs on /dev/md0 now that /dev/md0 contains data? Also,
>I don't see fd as an option in fdisk.

If you needed to run mke2fs, you wouldn't be able to access the RAID as a file system right now... that is the equivalent of a DOS format. Type fd doesn't show in the list in fdisk... you just have to select the type command ("t"), then enter fd and return.

===
David Cooley N5XMT
Internet: [EMAIL PROTECTED]
Packet: N5XMT@KQ4LO.#INT.NC.USA.NA
T.A.P.R. Member #7068
We are Borg... Prepare to be assimilated!
===
Re: superblock or the partition table is corrupt?
Thanks, so I should be able to do the following without loss of data, right?

1. umount /dev/md0
2. raidstop /dev/md0
3. change partition types on /dev/sda5 and /dev/sdb5 to fd (was Linux)
4. raidstart /dev/md0
5. mount -t ext2 /dev/md0 /mirrored_databases

My /etc/raidtab:

# persistent RAID1 array with no spare disk.
raiddev /dev/md0
        nr-raid-disks           2
        nr-spare-disks          0
        persistent-superblock   1
        chunk-size              4
        device                  /dev/sda5
        raid-disk               0
        device                  /dev/sdb5
        raid-disk               1

One last note: I'm running this machine as a backend database for a busy website; is the chunk size too small? Everything seems to be running OK. If it ain't broke?

/proc/mdstat:
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 sdb5[1] sda5[0] 136 blocks [2/2] [UU]
unused devices: <none>

On Mon, 27 Mar 2000, David Cooley wrote:
>If you needed to run mke2fs, you wouldn't be able to access the RAID as
>a file system right now... that is the equivalent of a DOS format. Type
>fd doesn't show in the list in fdisk... you just have to select the
>type command ("t"), then enter fd and return.
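For what it's worth, the partition type is just the one-byte system id in the partition-table entry, so rewriting it should not touch the partition's contents (whether the md autodetect code then does anything further on start is a separate question). A toy demonstration on a scratch MBR image rather than a real disk (the offsets follow the standard MBR layout; the image itself is invented):

```shell
# Scratch 512-byte "MBR": the type of primary partition 1 is the
# system-id byte at offset 446 + 4 = 450.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
printf '\x83' | dd of="$img" bs=1 seek=450 conv=notrunc 2>/dev/null  # 0x83 = Linux
# fdisk's "t" command (or sfdisk --change-id) effectively rewrites
# just this one byte:
printf '\xfd' | dd of="$img" bs=1 seek=450 conv=notrunc 2>/dev/null  # 0xfd = Linux raid autodetect
type_byte=$(od -An -tx1 -j450 -N1 "$img" | tr -d ' ')
rm -f "$img"
echo "partition 1 type is now: $type_byte"
```

Only the table entry changes; the data blocks of the partition are never written. Still, as always with partition-table surgery, a backup first costs nothing.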
Re: superblock or the partition table is corrupt?
At 01:57 PM 3/27/2000, root wrote:
>Thanks, so I should be able to do the following without loss of data,
>right?
>1. umount /dev/md0
>2. raidstop /dev/md0
>3. change partition types on /dev/sda5 and /dev/sdb5 to fd (was Linux)
>4. raidstart /dev/md0
>5. mount -t ext2 /dev/md0 /mirrored_databases

I don't know if re-writing the partition-table entry with type FD will preserve data or make the disk look empty... better answered by someone with a little more experience than myself.

===
David Cooley N5XMT
Internet: [EMAIL PROTECTED]
Packet: N5XMT@KQ4LO.#INT.NC.USA.NA
T.A.P.R. Member #7068
We are Borg... Prepare to be assimilated!
===
Re: System Hangs -- Which Is Most Stable Kernel?
Thanks to everyone for the assistance. I did recompile the kernel with Translucent disabled (I don't know why it is enabled by default?). Unfortunately, this has not affected the problem.

As for the Adaptec, I had checked on a hardware discussion list and understood that, while some Adaptecs were problematic, the unit I purchased (the 2940U2W with matching factory cables) was working well for several Linux users. However, it seems to me from my limited experience that the Adaptec may be the problem, as it would fit the type of hanging that occurs (no error messages, everything just freezes, possibly waiting for the Adaptec to send the data through). I am still unable to find any log or any way of tracing the system hangs.

I may try debugging the SCSI (haven't a clue how) for a few days before trying to turn off RAID. Before buying another card (I have no others), I'll hope some reconfiguration of the Adaptec will do the trick. I hate to dump all that money down the drain.

Thanks again to everyone for the assistance.

Jeff Hill

"m. allan noah" wrote:
>jeff- i am using 2.2.14 with mingo patch, and it is great. i have a
>dozen or so boxes, 512meg, SMP pIII 450, ncr scsi, etc in this config.
>all are fine. it would be interesting to see if raid is the issue, or
>your adaptec (i am inclined to think the latter).
--snip--
Re: Status of Raid-0.9
On Mon, 27 Mar 2000, Nikolaus Froehlich wrote:
>1) What is the status of the RAID development? From the archives on
>ftp.kernel.org and the mailing list archives, it appears that all
>traffic concerning the development stopped at the end of Aug 99. Is
>that true? Why did the development get abandoned? Are there major bugs
>in the code?

RAID 0.90 development has continued, and the most current patch is available for 2.2.14 at http://people.redhat.com/mingo

>2) Can the distributed raidtools 0.90 and the raid0145-19992408-2.2.11
>patch be considered stable for a RAID-1 application on a 2.2.x kernel?

Go with 2.2.14, or even better, 2.2.15pre15 (you'll have to fix a reject in raid1.c if you need RAID-1 though - but it's an easy one).

>3) Are there still efforts to include a 'new' RAID implementation in
>the new standard kernels (e.g. 2.5)?

It's currently being merged into 2.3.X and will be in 2.4 when it comes out.

-- 
: [EMAIL PROTECTED]      : And I see the elder races,          :
:.......................: putrid forms of man:                :
: Jakob Østergaard      : See him rise and claim the earth,   :
: OZ9ABN                : his downfall is at hand.            :
:.......................:...............{Konkhra}.............:
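The "fix a reject" step mentioned above is the usual patch workflow: dry-run, apply, and hand-merge any hunks that land in a .rej file. A toy walk-through on a scratch file standing in for the kernel tree (the file name and patch content are invented for illustration):

```shell
# Throwaway directory standing in for /usr/src/linux.
dir=$(mktemp -d)
cd "$dir"
printf 'old line\n' > raid1.c

# A stand-in for the raid patch hunk touching raid1.c.
cat > raid.patch <<'EOF'
--- raid1.c
+++ raid1.c
@@ -1 +1 @@
-old line
+new line
EOF

# Dry-run first to see whether any hunks would be rejected, then apply.
patch -p0 --dry-run < raid.patch >/dev/null
patch -p0 < raid.patch >/dev/null

content=$(cat raid1.c)
echo "raid1.c now contains: $content"
```

If a hunk fails against 2.2.15pre15, patch writes it to raid1.c.rej, and merging that one hunk by hand is the "easy one" referred to above.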
Re: System Hangs -- Which Is Most Stable Kernel?
Is the PC overclocked in any way? I had troubles with my 2940U2W in both Windows and Linux when I overclocked the front-side bus from 100 MHz to 103 MHz. It seems the Adaptec cards can't handle *ANYTHING* over 33.3 MHz on the PCI bus.

At 02:53 PM 3/27/2000, Jeff Hill wrote:
>Thanks to everyone for the assistance. I did recompile the kernel with
>Translucent disabled (I don't know why it is enabled by default?).
>Unfortunately, this has not affected the problem.
--snip--

===
David Cooley N5XMT
Internet: [EMAIL PROTECTED]
Packet: N5XMT@KQ4LO.#INT.NC.USA.NA
T.A.P.R. Member #7068
We are Borg... Prepare to be assimilated!
===
Re: System Hangs -- Which Is Most Stable Kernel?
have you tried the folks on the [EMAIL PROTECTED] list? this is the list recommended in linux/drivers/scsi/README.aic7xxx ...

-s

Jeff Hill wrote:
>Thanks to everyone for the assistance. I did recompile the kernel with
>Translucent disabled (I don't know why it is enabled by default?).
>Unfortunately, this has not affected the problem.
--snip--
Re: Swapping onto RAID: Good idea?
At 23:51 27.03.00, Godfrey Livingstone wrote:
>I have raid 1 working and am swapping onto /dev/md1. I am interested to
>know what modifications you made to the startup scripts. We run Red Hat
>6.1 with a 2.2.14 kernel patched for raid and the Promise IDE
>controller. Anyway, what scripts do I need to change, and how?
>Presumably by altering rc.sysinit and checking that the
>resynchronisation of the swap partition is complete (how?) before
>issuing the command "swapon -a"?

Let's see:

1) rc.sysinit:
* remove "swapon -a"
* add "/sbin/raidswapon" (# Start up swapping.) after the mount of the /proc filesystem
* possibly comment out the "add raid devices" section; if you use raid autostart, chances are that the code in this section won't help you, but it WILL keep your system from coming up if there's anything bad in your raidtab (this happened to me when I did some testing and forgot to remove the test entries from raidtab).

My /sbin/raidswapon (adapted from a post to this list - sorry, I don't remember the original author):

#!/bin/sh
#
RAIDDEVS=`grep swap /etc/fstab | grep /dev/md | cut -f1 | cut -d/ -f3`

for raiddev in $RAIDDEVS
do
    # echo "testing $raiddev"
    while grep $raiddev /proc/mdstat | grep -q "resync="
    do
        # echo "`date`: $raiddev resyncing" >> /var/log/raidswap-status
        sleep 20
    done
    /sbin/swapon /dev/$raiddev
done
exit 0

This won't turn on swap on raid devices until resync has finished.

Bye, Martin

"you have moved your mouse, please reboot to make this change take effect"
--
Martin Bene               vox: +43-316-813824
simon media               fax: +43-316-813824-6
Andreas-Hofer-Platz 9     e-mail: [EMAIL PROTECTED]
8010 Graz, Austria
--
finger [EMAIL PROTECTED] for PGP public key