Re: raid5 stuck in degraded, inactive and dirty mode
On Wed, Jan 09, 2008 at 07:16:34PM +1100, CaT wrote:
> > But I suspect that --assemble --force would do the right thing.
> > Without more details, it is hard to say for sure.
>
> I suspect so as well, but throwing caution to the wind irks me wrt
> this raid array. :)

Sorry. Not to be a pain, but considering the previous email with all
the examine dumps, etc., would the above be the way to go? I just don't
want to have missed something and bugger the array up totally.
Re: raid5 stuck in degraded, inactive and dirty mode
On Fri, Jan 11, 2008 at 07:21:42AM +1100, Neil Brown wrote:
> On Thursday January 10, [EMAIL PROTECTED] wrote:
> > On Wed, Jan 09, 2008 at 07:16:34PM +1100, CaT wrote:
> > > > But I suspect that --assemble --force would do the right thing.
> > > > Without more details, it is hard to say for sure.
> > >
> > > I suspect so as well, but throwing caution to the wind irks me wrt
> > > this raid array. :)
> >
> > Sorry. Not to be a pain, but considering the previous email with all
> > the examine dumps, etc., would the above be the way to go? I just
> > don't want to have missed something and bugger the array up totally.
>
> Yes, definitely.

Cool.

> The superblocks look perfectly normal for a single drive failure
> followed by a crash. So --assemble --force is the way to go.
>
> Technically you could have some data corruption if a write was under
> way at the time of the crash.

I'd expect so, as I think the crash situation was one of rather severe
abruptness.

> In that case the parity block of that stripe could be wrong, so the
> recovered data for the missing device could be wrong. This is why you
> are required to use --force - to confirm that you are aware that there
> could be a problem.

Right.

> It would be worth running fsck just to be sure that nothing critical
> has been corrupted. Also if you have a recent backup, I wouldn't
> recycle it until I was fairly sure that all your data was really safe.

I'll be doing a fsck and checking what data I can over the weekend to
see what was fragged. I suspect it'll just be something rsynced due to
the time of the crash.

> But in my experience the chance of actual data corruption in this
> situation is fairly low.

Yaay. :)

Thanks. I'll now go and put Humpty together again. For some reason
Johnny Cash's 'Ring of Fire' is playing in my head.
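In concrete terms, the recipe being endorsed here comes down to
something like the following sketch (the device names are the ones from
the --examine dump elsewhere in this thread; the read-only fsck
invocation and the replacement partition name are my own guesses):

  # force-assemble the three surviving members; --force lets md ignore
  # the dirty/degraded state left behind by the crash
  mdadm --assemble --force /dev/md3 /dev/sdd1 /dev/sde1 /dev/sdf1

  # read-only filesystem check before trusting the data
  fsck -n /dev/md3

  # once happy, add the replacement disk and let the rebuild run
  mdadm --add /dev/md3 /dev/sdc1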
Re: raid5 stuck in degraded, inactive and dirty mode
On Wed, Jan 09, 2008 at 05:52:57PM +1100, Neil Brown wrote:
> On Wednesday January 9, [EMAIL PROTECTED] wrote:
> > I'd provide data dumps of --examine and friends but I'm in a
> > situation where transferring the data would be a right pain. I'll
> > do it if need be, though.
> >
> > So, what can I do?
>
> Well, providing the output of --examine would help a lot.

Here's the output for the 3 remaining drives, the array and mdstat.

/proc/mdstat:

Personalities : [raid1] [raid6] [raid5] [raid4]
...
md3 : inactive sdf1[0] sde1[2] sdd1[1]
      1465151808 blocks
...
unused devices: <none>

/dev/md3:
        Version : 00.90.03
  Creation Time : Thu Aug 30 15:50:01 2007
     Raid Level : raid5
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Thu Jan  3 08:51:00 2008
          State : active, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : f60a1be0:5a10f35f:164afef4:10240419
         Events : 0.45649

    Number   Major   Minor   RaidDevice State
       0       8       81        0      active sync   /dev/sdf1
       1       8       49        1      active sync   /dev/sdd1
       2       8       65        2      active sync   /dev/sde1
       3       0        0        -      removed

/dev/sdd1:
          Magic : a92b4efc
        Version : 00.90.03
           UUID : f60a1be0:5a10f35f:164afef4:10240419
  Creation Time : Thu Aug 30 15:50:01 2007
     Raid Level : raid5
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 3

    Update Time : Thu Jan  3 08:51:00 2008
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : cb259d08 - correct
         Events : 0.45649

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       49        1      active sync   /dev/sdd1

   0     0       8       81        0      active sync   /dev/sdf1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       8       33        3      active sync   /dev/sdc1

/dev/sde1:
          Magic : a92b4efc
        Version : 00.90.03
           UUID : f60a1be0:5a10f35f:164afef4:10240419
  Creation Time : Thu Aug 30 15:50:01 2007
     Raid Level : raid5
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 3

    Update Time : Thu Jan  3 08:51:00 2008
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : cb259d1a - correct
         Events : 0.45649

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       65        2      active sync   /dev/sde1

   0     0       8       81        0      active sync   /dev/sdf1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       8       33        3      active sync   /dev/sdc1

/dev/sdf1:
          Magic : a92b4efc
        Version : 00.90.03
           UUID : f60a1be0:5a10f35f:164afef4:10240419
  Creation Time : Thu Aug 30 15:50:01 2007
     Raid Level : raid5
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 3

    Update Time : Thu Jan  3 08:51:00 2008
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : cb259d26 - correct
         Events : 0.45649

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       81        0      active sync   /dev/sdf1

   0     0       8       81        0      active sync   /dev/sdf1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8       65        2      active sync   /dev/sde1
   3     3       8       33        3      active sync   /dev/sdc1

> But I suspect that --assemble --force would do the right thing.
> Without more details, it is hard to say for sure.

I suspect so as well, but throwing caution to the wind irks me wrt this
raid array. :)
raid5 stuck in degraded, inactive and dirty mode
Hi,

I've got a 4 disk RAID5 array that had one of the disks die. The hassle
is that the death was not graceful and triggered a bug in the nforce4
chipset that wound up freezing the northbridge and hence the PC. This
has left the array in a degraded state where I cannot add the swanky
new HD to the array and have it back up to its snazzy self. Normally I
would tinker until I got it working, but this being the actual backup
box, I'd rather not lose the data. :)

After a bit of pondering I have come to the conclusion that what may be
biting me is that each individual left-over component of the RAID array
still thinks the failed drive is still around, whilst the array as a
whole knows better. Setting the hd that used to be there to 'failed'
produces a 'device not found' error.

The components all have different checksums (which seems to be the
right thing judging by other, whole arrays) and the checksums are
marked correct. Event numbers are all the same. The status on each
drive is 'active', which I also assume is wrong. Where the components
list the other members of the array, the missing drive is marked
'active sync'.

I'd provide data dumps of --examine and friends but I'm in a situation
where transferring the data would be a right pain. I'll do it if need
be, though.

So, what can I do?
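The state dumps being referred to can be pulled together with the usual
trio of commands, something like this (md3 and the sdX1 names match the
devices that show up elsewhere in this thread; substitute your own):

  # kernel's view of all md arrays
  cat /proc/mdstat

  # array-level view (may complain while the array is inactive)
  mdadm --detail /dev/md3

  # per-component superblocks of the surviving members
  mdadm --examine /dev/sdd1 /dev/sde1 /dev/sdf1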
raid5 resizing
Hi,

I'm thinking of slowly replacing the disks in my raid5 array with
bigger disks and then resizing the array to fill up the new disks. Is
this possible?

Basically I would like to go from 3 x 500GB RAID5 to 3 x 1TB RAID5,
thereby going from 1TB to 2TB of storage.

It seems like it should be, but... :)
Re: raid5 resizing
On Wed, Dec 19, 2007 at 10:59:41PM +1100, Neil Brown wrote:
> On Wednesday December 19, [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > I'm thinking of slowly replacing the disks in my raid5 array with
> > bigger disks and then resizing the array to fill up the new disks.
> > Is this possible?
> >
> > Basically I would like to go from 3 x 500GB RAID5 to 3 x 1TB RAID5,
> > thereby going from 1TB to 2TB of storage.
> >
> > It seems like it should be, but... :)
>
> Yes.
>
>   mdadm --grow /dev/mdX --size=max

Oh -joy-. I love Linux SW RAID. :) The only thing it seems to lack is
battery backed-up cache.

Thank you.
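For completeness, the whole replace-then-grow procedure looks roughly
like this (a sketch only: /dev/md0, the partition names and the ext3
filesystem are assumptions, and each rebuild has to finish before the
next disk is pulled):

  # repeat for each of the three disks in turn
  mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1
  # physically swap in the 1TB disk, partition it, then:
  mdadm /dev/md0 --add /dev/sda1
  # wait for the resync to complete before touching the next disk
  watch cat /proc/mdstat

  # once all members are on the bigger disks, grow the array
  mdadm --grow /dev/md0 --size=max

  # and finally grow the filesystem on top (ext3 assumed here)
  resize2fs /dev/md0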
Re: strange RAID5 problem
On Mon, May 08, 2006 at 11:30:52PM -0600, Maurice Hilarius wrote:
> [EMAIL PROTECTED] ~]# mdadm --assemble /dev/md3 /dev/sdq1 /dev/sdr1
> /dev/sds1 /dev/sdt1 /dev/sdu1 /dev/sdv1 /dev/sdw1 /dev/sdx1 /dev/sdy1
> /dev/sdz1 /dev/sdaa1 /dev/sdab1 /dev/sdac1 /dev/sdad1 /dev/sdae1
> /dev/sdaf1
> mdadm: superblock on /dev/sdw1 doesn't match others - assembly aborted

Have you tried zeroing the superblock with

  mdadm --misc --zero-superblock /dev/sdw1

and then adding it in?

> [EMAIL PROTECTED] ~]# mount /dev/md3 /all/boxw16/
> /dev/md3: Invalid argument
> mount: /dev/md3: can't read superblock

Wow, that looks messy. Ummm, about the only thing I can think of is
failing /dev/sdw1 and removing it (I know it says it's not there,
but...). Also, not the biggest expert on raid around here. ;)
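Spelled out, the suggestion above amounts to something like this (same
device names as the quoted command; note that --zero-superblock is
destructive, so only wipe /dev/sdw1 once the other fifteen members
assemble cleanly without it):

  # wipe the out-of-sync superblock on the odd drive out
  mdadm --misc --zero-superblock /dev/sdw1

  # assemble from the remaining members; a degraded start may also
  # need --run or --force depending on what the superblocks say
  mdadm --assemble /dev/md3 /dev/sdq1 /dev/sdr1 /dev/sds1 /dev/sdt1 \
      /dev/sdu1 /dev/sdv1 /dev/sdx1 /dev/sdy1 /dev/sdz1 /dev/sdaa1 \
      /dev/sdab1 /dev/sdac1 /dev/sdad1 /dev/sdae1 /dev/sdaf1

  # re-add the wiped drive as a fresh member and let it resync
  mdadm --add /dev/md3 /dev/sdw1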
Re: help wanted - 6-disk raid5 borked: _ _ U U U U
On Sun, Apr 16, 2006 at 08:46:52PM -0300, Carlos Carvalho wrote:
> Neil Brown ([EMAIL PROTECTED]) wrote on 17 April 2006 09:30:
> > The easiest thing to do when you get an error on a drive is to kick
> > the drive from the array, so that is what the code always did, and
> > still does in many cases. It is arguable that for a read error on a
> > degraded raid5, that may not be the best thing to do, but I'm not
> > completely convinced.
>
> I don't see how it could be different. If the array is degraded and
> one more disk fails there's no way to obtain the information, so the
> md device just fails like a single disk.

Not necessarily. You probably have something like (say) 200GB of data
striped across that disk. That one read error may affect just one or a
few stripes, which means there's a whole buttload of data that could
still be retrieved.

Perhaps setting the entire raid array read-only on such an error would
be better? That makes it a choice between potentially losing everything
and having writes and some reads fail while you have a mild stroke
trying to get another drive in on things. Put the drive in, let the
array do the best it can to restore things, fail the bad drive, put
another disk in, have it come up fully and then fsck it good.

At least this way you probably have less of a chance of losing the
entire array of data, and who knows, only the 'less important' files
might be lost. :)

Anyway, my 2c. :)
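md doesn't do any of this automatically, but an admin who catches the
situation can already force the read-only behaviour by hand; a rough
sketch, assuming the array is /dev/md0 and its filesystem is mounted at
/mnt/array:

  # stop new writes from the filesystem
  mount -o remount,ro /mnt/array

  # mark the md device itself read-only so md stops writing to it
  mdadm --readonly /dev/md0

  # then copy off whatever still reads cleanly before swapping disks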
Re: help wanted - 6-disk raid5 borked: _ _ U U U U
On Sun, Apr 16, 2006 at 09:42:34PM -0300, Carlos Carvalho wrote:
> CaT ([EMAIL PROTECTED]) wrote on 17 April 2006 10:25:
> > Not necessarily. You probably have something like (say) 200GB of
> > data striped across that disk. That one read error may affect just
> > one or a few stripes, which means there's a whole buttload of data
> > that could still be retrieved.
> >
> > Perhaps setting the entire raid array read-only on such an error
> > would be better? That makes it a choice between potentially losing
> > everything and having writes and some reads fail while you have a
> > mild stroke trying to get another drive in on things. Put the drive
> > in, let the array do the best it can to restore things, fail the
> > bad drive, put another disk in, have it come up fully and then fsck
> > it good.
>
> You want the array to stay on and jump here and there getting the
> stripes from wherever it can, each time from a different set of
> disks. That's surely nice but I think it's too much to ask...

That would be nice, but even just setting it read-only would do: if a
read fails, the array fails that read as it normally would and moves
on. Nothing special, but it might let you recover a vast chunk of your
data. Then you can decide whether what was lost is worth crying over.
That's still better than complete data loss.

Hope that makes sense. :)
Re: 2+ raid sets, sata and a missing hd question
On Wed, Feb 15, 2006 at 07:50:28AM +0100, Luca Berra wrote:
> On Wed, Feb 15, 2006 at 01:45:21PM +1100, CaT wrote:
> > Seeing as how SATA drives can move around if one removes one from a
> > set (ie given sda, sdb, sdc, if sdb were removed, sdc drops to sdb),
> > would md6 come back up without problems if I were to remove either
> > sda or sdb?
>
> if you configured mdadm correctly, you will have no problem :)
>
> hint:
>   echo DEVICE partitions > /etc/mdadm.conf
>   mdadm -Esc partitions | grep ARRAY >> /etc/mdadm.conf

So the md5 array will reconstruct itself after the initial bootup,
where the kernel reconstructs the raid1 (as well as it can) for
booting?

> > All md partitions are of type fd (Linux raid autodetect).
>
> this is surprisingly not at all relevant

Awww. But I like it when the kernel just, well, does it all and makes
it all ready. :)
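For reference, the config those two lines generate ends up looking
something like the following (the raid levels, device counts and UUIDs
here are placeholders, not the poster's real values); because each
ARRAY line identifies an array by UUID rather than by member device
names, assembly still works when sdc slides down to sdb after a drive
is pulled:

  DEVICE partitions
  ARRAY /dev/md5 level=raid1 num-devices=2 UUID=00000000:11111111:22222222:33333333
  ARRAY /dev/md6 level=raid5 num-devices=3 UUID=44444444:55555555:66666666:77777777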