Re: mdadm --grow failed
Ok, I understand the risks, which is why I did a full backup before doing this. I have subsequently recreated the array and restored my data from backup.

Just for information, the e2fsck -n on the drive hung (unresponsive, with no I/O), so I assume the filesystem was hosed. I suspect resyncing the array after the grow failed was a bad idea. I'm not sure how the grow operation is performed, but to me it seems that there is no fault tolerance during the operation, so any failure will cause a corrupt array. My 2c would be that if any drive fails during a grow operation, the operation should be aborted in such a way as to allow a restart later (if possible) - as in my case a retry would probably have worked.

Anyway, if you need more info to help improve growing arrays, let me know. As a side note, either my hardware (Promise TX4000 card) is acting up or there are still some unresolved issues with libata in general and/or sata_promise itself.

Regards,
Marc

On Sat, 17 Feb 2007 19:40:17 +1100, Neil Brown wrote:
> On Saturday February 17, [EMAIL PROTECTED] wrote:
> > Is my array destroyed? Seeing as the sda disk wasn't completely synced,
> > I wonder how it was being used to resync the array when sdc went offline.
> > I've got a bad feeling about this :|
>
> I can understand your bad feeling... What happened there shouldn't happen,
> but obviously it did. There is evidence that all is not lost, but obviously
> I cannot be sure yet.
>
> Can you fsck -n the array? Does the data still seem to be intact?
>
> Can you report exactly what version of the Linux kernel, and of mdadm, you
> are using, and give the output of mdadm -E on each drive?
>
> I'll try to work out what happened and how to go forward, but am unlikely
> to get back to you for 24-48 hours (I have a busy weekend :-).
>
> NeilBrown
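[For anyone in a similar failed-grow situation, the read-only checks Neil asks for above can be gathered along these lines. This is a minimal sketch, assuming a four-member array at /dev/md0 built from /dev/sd[a-d]1 - substitute your own array and member device names; nothing here writes to the array:]

    # Kernel and mdadm versions
    uname -r
    mdadm --version

    # Examine the md superblock on each member device (read-only)
    for dev in /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1; do
        mdadm -E "$dev"
    done

    # Current array state as the kernel sees it
    cat /proc/mdstat

    # Read-only filesystem check; -n answers "no" to every repair prompt
    fsck -n /dev/md0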
Re: 2.6.20: reproducible hard lockup with RAID-5 resync
Neil Brown wrote:
> > Ok, so the difference is CONFIG_SYSFS_DEPRECATED. If that is not defined,
> > the kernel locks up. There's not a lot of code under #ifdef/#ifndef
> > CONFIG_SYSFS_DEPRECATED, but since I'm not familiar with any of it, I don't
> > expect trying to locate the bug on my own would be very productive.
> >
> > Neil, do you have CONFIG_SYSFS_DEPRECATED enabled? If so, does disabling it
> > reproduce my problem? If you can't reproduce it, should I take the problem
> > over to linux-kernel?
>
> # CONFIG_SYSFS_DEPRECATED is not set
>
> No, it is not set, and yet it all still works for me.

Dang, again. :)

> It is very hard to see how this CONFIG option can make a difference. Have
> you double checked that setting it removes the problem and clearing it
> causes the problem?

Yes, it seems odd to me too, but I have double-checked. If I build a kernel with CONFIG_SYSFS_DEPRECATED enabled, it works; if I disable that option and rebuild the kernel, it locks up.

I just tried running 'make defconfig' and then enabling only RAID, RAID-0, RAID-1, and RAID-4/5/6. If I then disable CONFIG_SYSFS_DEPRECATED, there aren't any problems.

...so, I'll try to isolate the problem some more later.

Thanks,
Corey
Re: mdadm --grow failed
On Sunday February 18, [EMAIL PROTECTED] wrote:
> I'm not sure how the grow operation is performed but to me it seems that
> there is no fault tolerance during the operation so any failure will cause
> a corrupt array. My 2c would be that if any drive fails during a grow
> operation, the operation should be aborted in such a way as to allow a
> restart later (if possible) - as in my case a retry would've probably worked.

For what it's worth, the code does exactly what you suggest. It does fail gracefully. The problem is that it doesn't restart quite the way you would like. Had you stopped the array and re-assembled it, it would have resumed the reshape process (at least it did in my testing).

The following patch makes it retry a reshape straight away if it was aborted due to a device failure (of course, if too many devices have failed, the retry won't get anywhere, but you would expect that).

Thanks for the valuable feedback.

NeilBrown

Restart a (raid5) reshape that has been aborted due to a read/write error.

An error always aborts any resync/recovery/reshape on the understanding that it will immediately be restarted if that still makes sense. However a reshape currently doesn't get restarted. With this patch it does.

To avoid restarting when it is not possible to do work, we call in to the personality to check that a reshape is ok, and strengthen raid5_check_reshape to fail if there are too many failed devices.

We also break some code out into a separate function, remove_and_add_spares, as the indent level for that code was getting crazy.

### Diffstat output
 ./drivers/md/md.c    |   74 ++++++++++++++++++++++++++++-----------------
 ./drivers/md/raid5.c |    2 +
 2 files changed, 47 insertions(+), 29 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2007-02-19 11:44:51.000000000 +1100
+++ ./drivers/md/md.c	2007-02-19 11:44:54.000000000 +1100
@@ -5343,6 +5343,44 @@ void md_do_sync(mddev_t *mddev)
 EXPORT_SYMBOL_GPL(md_do_sync);
 
+static int remove_and_add_spares(mddev_t *mddev)
+{
+	mdk_rdev_t *rdev;
+	struct list_head *rtmp;
+	int spares = 0;
+
+	ITERATE_RDEV(mddev,rdev,rtmp)
+		if (rdev->raid_disk >= 0 &&
+		    (test_bit(Faulty, &rdev->flags) ||
+		     !test_bit(In_sync, &rdev->flags)) &&
+		    atomic_read(&rdev->nr_pending)==0) {
+			if (mddev->pers->hot_remove_disk(
+				    mddev, rdev->raid_disk)==0) {
+				char nm[20];
+				sprintf(nm, "rd%d", rdev->raid_disk);
+				sysfs_remove_link(&mddev->kobj, nm);
+				rdev->raid_disk = -1;
+			}
+		}
+
+	if (mddev->degraded) {
+		ITERATE_RDEV(mddev,rdev,rtmp)
+			if (rdev->raid_disk < 0 &&
+			    !test_bit(Faulty, &rdev->flags)) {
+				rdev->recovery_offset = 0;
+				if (mddev->pers->hot_add_disk(mddev, rdev)) {
+					char nm[20];
+					sprintf(nm, "rd%d", rdev->raid_disk);
+					sysfs_create_link(&mddev->kobj,
+							  &rdev->kobj, nm);
+					spares++;
+					md_new_event(mddev);
+				} else
+					break;
+			}
+	}
+	return spares;
+}
+
 /*
  * This routine is regularly called by all per-raid-array threads to
  * deal with generic issues like resync and super-block update.
@@ -5397,7 +5435,7 @@ void md_check_recovery(mddev_t *mddev)
 		return;
 
 	if (mddev_trylock(mddev)) {
-		int spares =0;
+		int spares = 0;
 
 		spin_lock_irq(&mddev->write_lock);
 		if (mddev->safemode && !atomic_read(&mddev->writes_pending) &&
@@ -5460,35 +5498,13 @@ void md_check_recovery(mddev_t *mddev)
 		 * Spare are also removed and re-added, to allow
 		 * the personality to fail the re-add.
 		 */
-		ITERATE_RDEV(mddev,rdev,rtmp)
-			if (rdev->raid_disk >= 0 &&
-			    (test_bit(Faulty, &rdev->flags) ||
-			     !test_bit(In_sync, &rdev->flags)) &&
-			    atomic_read(&rdev->nr_pending)==0) {
-				if (mddev->pers->hot_remove_disk(mddev, rdev->raid_disk)==0) {
-					char nm[20];
-					sprintf(nm, "rd%d", rdev->raid_disk);
-					sysfs_remove_link(&mddev->kobj, nm);
-					rdev->raid_disk = -1;
-				}
-			}
-
-		if
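[As a practical footnote to Neil's remark above that an interrupted reshape can be resumed by stopping and re-assembling the array (the patch just makes the retry happen automatically), here is a hedged sketch of that manual restart. Device names and the mount point are placeholders, it assumes enough working members remain for the reshape to continue, and it worked this way only "in my testing" per Neil's message:]

    # Unmount anything on the array, then stop it cleanly
    umount /mnt/array          # placeholder mount point
    mdadm --stop /dev/md0

    # Re-assemble from the member devices; the reshape position is read
    # back from the superblocks and the reshape picks up where it left off
    mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

    # Watch progress
    cat /proc/mdstat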
Re: mdadm --grow failed
On Sun, 18 Feb 2007 07:13:28 -0500 (EST), Justin Piszcz wrote:
> On Sun, 18 Feb 2007, Marc Marais wrote:
> > On Sun, 18 Feb 2007 20:39:09 +1100, Neil Brown wrote:
> > > On Sunday February 18, [EMAIL PROTECTED] wrote:
> > > > Ok, I understand the risks which is why I did a full backup before
> > > > doing this. I have subsequently recreated the array and restored my
> > > > data from backup.
> > >
> > > Could you still please tell me exactly what kernel/mdadm version you
> > > were using?
> > >
> > > Thanks,
> > > NeilBrown
> >
> > 2.6.20 with the patch you supplied in response to the md6_raid5 crash
> > email I posted in linux-raid a few days ago. Just as background, I
> > replaced the failing drive and at the same time bought an additional
> > drive in order to increase the array size.
> >
> > mdadm -V = v2.6 - 21 December 2006. Compiled under Debian (stable).
> >
> > Also, I've just noticed another drive failure with the new array, with a
> > similar error to what happened during the grow operation (although on a
> > different drive) - I wonder if I should post this to linux-ide?
> >
> > Feb 18 00:58:10 xerces kernel: ata4: command timeout
> > Feb 18 00:58:10 xerces kernel: ata4: no sense translation for status: 0x40
> > Feb 18 00:58:10 xerces kernel: ata4: translated ATA stat/err 0x40/00 to SCSI SK/ASC/ASCQ 0xb/00/00
> > Feb 18 00:58:10 xerces kernel: ata4: status=0x40 { DriveReady }
> > Feb 18 00:58:10 xerces kernel: sd 4:0:0:0: SCSI error: return code = 0x0802
> > Feb 18 00:58:10 xerces kernel: sdd: Current [descriptor]: sense key: Aborted Command
> > Feb 18 00:58:10 xerces kernel: Additional sense: No additional sense information
> > Feb 18 00:58:10 xerces kernel: Descriptor sense data with sense descriptors (in hex):
> > Feb 18 00:58:10 xerces kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> > Feb 18 00:58:10 xerces kernel: 00 00 00 00
> > Feb 18 00:58:10 xerces kernel: end_request: I/O error, dev sdd, sector 35666775
> > Feb 18 00:58:10 xerces kernel: raid5: Disk failure on sdd1, disabling device. Operation continuing on 3 devices
> >
> > Regards,
> > Marc
>
> Just out of curiosity:
>
> Feb 18 00:58:10 xerces kernel: end_request: I/O error, dev sdd, sector 35666775
>
> Can you run:
>
> smartctl -d ata -t short /dev/sdd
> wait 5 min
> smartctl -d ata -t long /dev/sdd
> wait 2-3 hr
> smartctl -d ata -a /dev/sdd
>
> And then e-mail that output to the list?
>
> Justin.

Ok, here we go:

/dev/sdd:
smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1600JB-00EVA0
Serial Number:    WD-WMAEK2751794
Firmware Version: 15.05R15
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Mon Feb 19 14:38:16 2007 GMT-9
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (5073) seconds.
Offline data collection
capabilities:                    (0x79) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  67) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART