safe mode for grow?
Hi,

After having been through a few ups and downs with md and mdadm, I think it would be a good idea if mdadm had a safe mode for growing raid5. Safe mode would entail several things:

1. It would kick off a resync.
2. It would write to the spare drive to ensure it is ok (we have to wait for the resync anyway).
3. It would make a backup file mandatory.
4. It would only allow growing the raid by one device (because if more than one of the new drives goes bad, we have a big problem).

I know it would probably result in a bit of a deviation from the usual mdadm behaviour, but I think it's worth it.

Kind regards,
Alex.
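Something close to this "safe mode" can already be approximated by hand with existing mdadm and sysfs interfaces. The following is only a rough sketch of those four steps: the device names (/dev/md0, /dev/sde1) and the backup-file path are illustrative assumptions, not taken from the mail, and the badblocks write test destroys whatever is on the spare partition.

  # 1. Exercise the new spare before trusting it (destructive write test,
  #    run only on the not-yet-used spare partition).
  badblocks -wsv /dev/sde1

  # 2. Add the tested spare to the array.
  mdadm /dev/md0 --add /dev/sde1

  # 3. Kick off a full check of the existing array and wait for it to
  #    finish (watch /proc/mdstat for progress).
  echo check > /sys/block/md0/md/sync_action
  cat /proc/mdstat

  # 4. Grow by exactly one device, with a backup file covering the
  #    critical section of the reshape.
  mdadm --grow /dev/md0 --raid-devices=4 --backup-file=/root/md0-grow.backup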
write-intent bitmaps
http://lists.debian.org/debian-devel/2008/01/msg00921.html

Are they regarded as a stable feature? If so, I'd like to see distributions supporting them by default. I've started a discussion in Debian on this topic; see the above URL for details.

--
[EMAIL PROTECTED]
http://etbe.coker.com.au/ My Blog
http://www.coker.com.au/sponsorship.html Sponsoring Free Software development
Re: write-intent bitmaps
On Sunday January 27, [EMAIL PROTECTED] wrote:
> http://lists.debian.org/debian-devel/2008/01/msg00921.html
>
> Are they regarded as a stable feature? If so, I'd like to see
> distributions supporting them by default. I've started a discussion in
> Debian on this topic; see the above URL for details.

Yes, it is regarded as stable.

However, it can be expected to reduce write throughput. A reduction of several percent would not be surprising, and depending on workload it could probably be much higher.

It is quite easy to add or remove a bitmap on an active array, so making it a default would probably be fine, providing it was easy for an admin to find out about it and remove the bitmap if they wanted the extra performance.

NeilBrown
Re: write-intent bitmaps
On Sunday 27 January 2008 22:21, Neil Brown [EMAIL PROTECTED] wrote:
> On Sunday January 27, [EMAIL PROTECTED] wrote:
> > http://lists.debian.org/debian-devel/2008/01/msg00921.html
> >
> > Are they regarded as a stable feature? If so, I'd like to see
> > distributions supporting them by default. I've started a discussion in
> > Debian on this topic; see the above URL for details.
>
> Yes, it is regarded as stable.

Thanks for that information.

> However, it can be expected to reduce write throughput. A reduction of
> several percent would not be surprising, and depending on workload it
> could probably be much higher.

It seems to me that losing a few percent of performance all the time is better than a dramatic performance loss for an hour or two when things go wrong.

> It is quite easy to add or remove a bitmap on an active array, so making
> it a default would probably be fine, providing it was easy for an admin
> to find out about it and remove the bitmap if they wanted the extra
> performance.

I hadn't realised that. So having this in the installer is not as important as I previously thought.

--
[EMAIL PROTECTED]
http://etbe.coker.com.au/ My Blog
http://www.coker.com.au/sponsorship.html Sponsoring Free Software development
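For reference, the add/remove operation Neil describes is a single mdadm call in each direction. This is a minimal sketch only; the array name /dev/md0 is an assumption, and the exact --detail output line naming the bitmap varies between mdadm versions.

  # Add an internal write-intent bitmap to a running array
  mdadm --grow /dev/md0 --bitmap=internal

  # Remove it again if the write-throughput cost is not wanted
  mdadm --grow /dev/md0 --bitmap=none

  # Check whether a bitmap is currently active
  cat /proc/mdstat
  mdadm --detail /dev/md0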
2 failed disks RAID 5 behavior bug?
Hi! Let me apologize in advance for not having as much information as I'd like to.

I have a RAID 5 array with 3 elements. The kernel is 2.6.23. I had a SATA disk fail. On analysis, its SMART data claimed it had an 'electrical failure'. The drive sounded like an angry buzz-saw, so I'm guessing more was going on with it. Anyway, when the drive failed, /proc/mdstat showed two drives marked as failed [__U]. The other failed drive was on the other channel of the same SATA controller. On inspection, this second drive works fine. I'm guessing the failing drive somehow caused the SATA controller to lock up, which caused the RAID layer to think the second drive had failed.

The problematic behavior is that once two elements were marked as failed, any read or write access resulted in an I/O failure message. Unfortunately, I believe some writes were made to the array, as the event counter did not match on the two functional elements, and there was quite a bit of corruption of the filesystem superblock.

I'm sorry I don't have more specifics, but I hope perhaps Mr. Brown or someone else who knows the RAID code will consider adding some sort of safeguard to prevent writing to a RAID 5 array when more than one element is failed.

PS: Please CC: me. :)

Thank You!
TJ Harrell
[EMAIL PROTECTED]
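When a controller glitch knocks out a second member like this, the usual recovery route discussed on this list is to compare the event counters in the member superblocks and force reassembly from the drives that are actually healthy. A minimal sketch, assuming the drive that really died was sda1 and the two surviving members are sdb1 and sdc1 (the actual device names are not given in the report); note that forced assembly cannot undo writes that were lost, so filesystem damage may remain.

  # Stop the failed array first
  mdadm --stop /dev/md0

  # Compare the event counters recorded in each member's superblock
  mdadm --examine /dev/sdb1 /dev/sdc1 | grep -E '/dev/|Events'

  # Force assembly from the two members that still work (leaving out the
  # drive that really died); the array comes up degraded
  mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdc1

  # Inspect before mounting anything read-write
  cat /proc/mdstat
  fsck -n /dev/md0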
striping of a 4 drive raid10
Hi

I have tried to make a striping raid out of my new 4 x 1 TB SATA-2 disks. I tried raid10,f2 in several ways:

1: md0 = raid10,f2 of sda1+sdb1, md1 = raid10,f2 of sdc1+sdd1, md2 = raid0 of md0+md1
2: md0 = raid0 of sda1+sdb1, md1 = raid0 of sdc1+sdd1, md2 = raid10,f2 of md0+md1
3: md0 = raid10,f2 of sda1+sdb1, md1 = raid10,f2 of sdc1+sdd1, chunksize of md0 = md1 = 128 KB, md2 = raid0 of md0+md1, chunksize = 256 KB
4: md0 = raid0 of sda1+sdb1, md1 = raid0 of sdc1+sdd1, chunksize of md0 = md1 = 128 KB, md2 = raid10,f2 of md0+md1, chunksize = 256 KB
5: md0 = raid10,f4 of sda1+sdb1+sdc1+sdd1

My new disks give a transfer rate of about 80 MB/s, so I expected something like 320 MB/s for the whole raid, but I did not get more than about 180 MB/s.

I think it may be something with the layout; in effect, the drives should be laid out something like:

  sda1  sdb1  sdc1  sdd1
   0     1     2     3
   4     5     6     7

And this was not really doable with those combinations of raids, because they give different block layouts. How can it be done? Do we need a new raid type?

Best regards
keld
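As a concrete example, variant 1 above would be built roughly like this. The mdadm invocations are a sketch reconstructed from the description in the mail, not Keld's actual commands; chunk sizes are left at their defaults here.

  # Variant 1: two raid10,f2 pairs, striped together with raid0 on top
  mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=2 /dev/sda1 /dev/sdb1
  mdadm --create /dev/md1 --level=10 --layout=f2 --raid-devices=2 /dev/sdc1 /dev/sdd1
  mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/md0 /dev/md1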
Re: striping of a 4 drive raid10
On Sun, 27 Jan 2008 20:33:45 +0100, Keld Jørn Simonsen [EMAIL PROTECTED] said:

keld> Hi I have tried to make a striping raid out of my new 4 x
keld> 1 TB SATA-2 disks. I tried raid10,f2 in several ways:

keld> 1: md0 = raid10,f2 of sda1+sdb1, md1 = raid10,f2 of sdc1+sdd1, md2 = raid0
keld> of md0+md1
keld> 2: md0 = raid0 of sda1+sdb1, md1 = raid0 of sdc1+sdd1, md2 = raid10,f2
keld> of md0+md1
keld> 3: md0 = raid10,f2 of sda1+sdb1, md1 = raid10,f2 of sdc1+sdd1, chunksize of
keld> md0 = md1 = 128 KB, md2 = raid0 of md0+md1, chunksize = 256 KB
keld> 4: md0 = raid0 of sda1+sdb1, md1 = raid0 of sdc1+sdd1, chunksize
keld> of md0 = md1 = 128 KB, md2 = raid10,f2 of md0+md1, chunksize = 256 KB

These stacked RAID levels don't make a lot of sense.

keld> 5: md0 = raid10,f4 of sda1+sdb1+sdc1+sdd1

This also does not make a lot of sense. Why have four mirrors instead of two?

Instead, try 'md0 = raid10,f2' for example. The first mirror will be striped across the outer half of all four drives, and the second mirrors will be rotated in the inner half of each drive. Which of course means that reads will be quite quick, but writes and degraded operation will be slower.

Consider this post for more details:

http://www.spinics.net/lists/raid/msg18130.html

[ ... ]
Re: striping of a 4 drive raid10
On Sunday January 27, [EMAIL PROTECTED] wrote:
> Hi
>
> I have tried to make a striping raid out of my new 4 x 1 TB SATA-2 disks.
> I tried raid10,f2 in several ways:
>
> 1: md0 = raid10,f2 of sda1+sdb1, md1 = raid10,f2 of sdc1+sdd1, md2 = raid0 of md0+md1
> 2: md0 = raid0 of sda1+sdb1, md1 = raid0 of sdc1+sdd1, md2 = raid10,f2 of md0+md1
> 3: md0 = raid10,f2 of sda1+sdb1, md1 = raid10,f2 of sdc1+sdd1, chunksize of md0 = md1 = 128 KB, md2 = raid0 of md0+md1, chunksize = 256 KB
> 4: md0 = raid0 of sda1+sdb1, md1 = raid0 of sdc1+sdd1, chunksize of md0 = md1 = 128 KB, md2 = raid10,f2 of md0+md1, chunksize = 256 KB
> 5: md0 = raid10,f4 of sda1+sdb1+sdc1+sdd1

Try

6: md0 = raid10,f2 of sda1+sdb1+sdc1+sdd1

Also try raid10,o2 with a largeish chunksize (256KB is probably big enough).

NeilBrown

> My new disks give a transfer rate of about 80 MB/s, so I expected
> something like 320 MB/s for the whole raid, but I did not get more than
> about 180 MB/s.
>
> I think it may be something with the layout; in effect, the drives should
> be laid out something like:
>
>   sda1  sdb1  sdc1  sdd1
>    0     1     2     3
>    4     5     6     7
>
> And this was not really doable with those combinations of raids, because
> they give different block layouts. How can it be done? Do we need a new
> raid type?
>
> Best regards
> keld
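In mdadm terms, Neil's suggestion 6 and the offset alternative would look roughly like the commands below. This is a sketch only: the device names come from the thread, the chunk size for the f2 case is an assumption, and (as noted later in the thread) the o2 layout needs a newer kernel and mdadm than the 2.6.12 / v1.12.0 combination Keld was running.

  # Suggestion 6: one 4-drive raid10 with the "far 2" layout
  mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=4 \
        /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

  # The "offset 2" alternative, with a largeish chunk size
  mdadm --create /dev/md0 --level=10 --layout=o2 --chunk=256 --raid-devices=4 \
        /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1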
Re: striping of a 4 drive raid10
On Mon, Jan 28, 2008 at 07:13:30AM +1100, Neil Brown wrote:
> On Sunday January 27, [EMAIL PROTECTED] wrote:
> > Hi
> >
> > I have tried to make a striping raid out of my new 4 x 1 TB SATA-2 disks.
> > I tried raid10,f2 in several ways:
> > [...]
>
> Try
>
> 6: md0 = raid10,f2 of sda1+sdb1+sdc1+sdd1

That I already tried (and I wrongly stated that I used f4 instead of f2). I twice had a throughput of about 300 MB/s, but since then I have not been able to reproduce that behaviour. Are there errors in this that have been corrected in newer kernels?

> Also try raid10,o2 with a largeish chunksize (256KB is probably big
> enough).

I tried that too, but my mdadm did not allow me to use the o flag. My kernel is 2.6.12 and mdadm is v1.12.0 - 14 June 2005. Can I upgrade mdadm alone to a newer version, and if so, which one is recommended?

best regards
keld
Re: striping of a 4 drive raid10
On Sun, Jan 27, 2008 at 08:11:35PM +0000, Peter Grandi wrote:
> On Sun, 27 Jan 2008 20:33:45 +0100, Keld Jørn Simonsen
> [EMAIL PROTECTED] said:
>
> keld> Hi I have tried to make a striping raid out of my new 4 x
> keld> 1 TB SATA-2 disks. I tried raid10,f2 in several ways:
> keld> [...]
>
> These stacked RAID levels don't make a lot of sense.
>
> keld> 5: md0 = raid10,f4 of sda1+sdb1+sdc1+sdd1
>
> This also does not make a lot of sense. Why have four mirrors instead
> of two?

My error, I did mean f2. Anyway, four mirrors would make reads two times faster than with two mirrors, and given disk prices these days this could make a lot of sense.

> Instead, try 'md0 = raid10,f2' for example. The first mirror will be
> striped across the outer half of all four drives, and the second
> mirrors will be rotated in the inner half of each drive. Which of
> course means that reads will be quite quick, but writes and degraded
> operation will be slower.
>
> Consider this post for more details:
>
> http://www.spinics.net/lists/raid/msg18130.html

Thanks for the reference. There is also more in the original article on the possible layouts of what is now known as raid10,f2:

http://marc.info/?l=linux-raid&m=107427614604701&w=2

including performance enhancements due to use of the faster outer sectors, and smaller average seek times because you can seek over only half the disk.

best regards
keld
Re: striping of a 4 drive raid10
On Sunday January 27, [EMAIL PROTECTED] wrote:
> On Mon, Jan 28, 2008 at 07:13:30AM +1100, Neil Brown wrote:
> > Try
> >
> > 6: md0 = raid10,f2 of sda1+sdb1+sdc1+sdd1
>
> That I already tried (and I wrongly stated that I used f4 instead of f2).
> I twice had a throughput of about 300 MB/s, but since then I have not
> been able to reproduce that behaviour. Are there errors in this that have
> been corrected in newer kernels?

No, I don't think any performance related changes have been made to raid10 lately.

You could try increasing the read-ahead size. For a 4-drive raid10 it defaults to 4 times the read-ahead setting of a single drive, but increasing it substantially (e.g. 64 times) seems to increase the speed of dd reading a gigabyte. Whether that will actually affect your target workload is a different question.

> > Also try raid10,o2 with a largeish chunksize (256KB is probably big
> > enough).
>
> I tried that too, but my mdadm did not allow me to use the o flag. My
> kernel is 2.6.12 and mdadm is v1.12.0 - 14 June 2005. Can I upgrade
> mdadm alone to a newer version, and if so, which one is recommended?

You would need a newer kernel and a newer mdadm to get the raid10 offset mode.

NeilBrown
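The read-ahead experiment Neil suggests can be done on the live array with blockdev. A minimal sketch; the array name and the specific read-ahead value (given in 512-byte sectors) are illustrative assumptions, not figures from the thread.

  # Show the current read-ahead for the array, in 512-byte sectors
  blockdev --getra /dev/md0

  # Increase it substantially for a sequential-read test
  blockdev --setra 65536 /dev/md0

  # Rough sequential throughput check: read 1 GB off the array
  dd if=/dev/md0 of=/dev/null bs=1M count=1024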