Re: How many drives are bad?
On Thu, 21 Feb 2008 13:12:30 -0500, Norman Elton [EMAIL PROTECTED] said: [ ... ]

normelton> Assuming we go with Guy's layout of 8 arrays of 6
normelton> drives (picking one from each controller),

Guy Watkins proposed another one too:

«Assuming the 6 controllers are equal, I would make 3 16 disk RAID6 arrays using 2 disks from each controller. That way any 1 controller can fail and your system will still be running. 6 disks will be used for redundancy.

Or 6 8 disk RAID6 arrays using 1 disk from each controller. That way any 2 controllers can fail and your system will still be running. 12 disks will be used for redundancy. Might be too excessive!»

So, I would not be overjoyed with either physical configuration, except in a few particular cases. It is very amusing to read such worries about host adapter failures, and somewhat depressing to see "too excessive" used to describe 4+2 parity RAID.

normelton> how would you setup the LVM VolGroups over top of
normelton> these already distributed arrays?

That looks like a trick question, or at least a mistaken one, because I would rather not do anything like that except in a very few cases. However, if one wants to do a bad thing in the least bad way, perhaps a volume group per array would be least bad.

Going back to your original question:

«So... we're curious how Linux will handle such a beast. Has anyone run MD software RAID over so many disks? Then piled LVM/ext3 on top of that?»

I haven't, because it sounds rather inappropriate to me.

«Any suggestions?»

Not easy to respond without a clear statement of what the array will be used for: RAID levels and file systems are very anisotropic in both performance and resilience, so a particular configuration may be very good for something but not for something else. For example a 48-drive RAID0 with 'ext2' on top would be very good for some cases, but perhaps not for archival :-).

In general, I'd use RAID10 (http://WWW.BAARF.com/), RAID5 in very few cases and RAID6 almost never.
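As a quick sanity check of Guy's two quoted layouts, here is a small sketch (the helper name and its arguments are mine, not from the post) tallying data disks, redundancy disks, and how many whole controllers each layout can lose, given 6 controllers and RAID6's tolerance of 2 failed members per array:

```python
# Sketch comparing Guy Watkins' two proposed layouts for 48 drives
# spread across 6 controllers. Helper name and signature are my own.

def raid6_layout(arrays, disks_per_array, disks_per_controller_per_array):
    """Summarize identical RAID6 arrays striped across the controllers."""
    data_disks = arrays * (disks_per_array - 2)   # RAID6: 2 parity per array
    parity_disks = arrays * 2
    # A dead controller costs each array `disks_per_controller_per_array`
    # members; RAID6 survives while that loss is <= 2 per array.
    controllers_survivable = 2 // disks_per_controller_per_array
    return data_disks, parity_disks, controllers_survivable

# 3 x 16-disk RAID6, 2 disks from each controller
print(raid6_layout(3, 16, 2))   # -> (42, 6, 1)
# 6 x 8-disk RAID6, 1 disk from each controller
print(raid6_layout(6, 8, 1))    # -> (36, 12, 2)
```

The results match the quoted figures: 6 versus 12 disks spent on redundancy, in exchange for surviving 1 versus 2 controller failures.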
In general current storage practices do not handle large single-computer storage pools that well (just consider 'fsck' times), and beyond 10TB I reckon that currently only multi-host parallel/cluster file systems are good enough, for example Lustre (for smaller multi-TB filesystems I'd use JFS or XFS). But then Lustre can also be used on a single machine with multiple (say 2TB) block devices, and this may be the best choice here too if a single virtual filesystem is the goal:

http://wiki.Lustre.org/index.php?title=Lustre_Howto

- To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: raid 10 su, sw settings
On Sun, 30 Dec 2007 19:00:39 -0500, Brad Langhorst [EMAIL PROTECTED] said: [ ... VMware virtual disks over RAID ... ]

brad> - 4 disk raid 10
brad> - 64k stripe size

Stripe size or chunk size? If that is the chunk size, try reducing it, if applications in the VM do short reads or writes with intervals. But things seem not to require a lot of tweaking: [ ... ]

brad> Typical blocks/sec from iostat during large file movements
brad> is about 100M/s read and 80M/s write.

That's fine: you are getting more or less the combined speed of 2 drives, which is what standard RAID10 over 4 drives should give you.

brad> - is the partition aligned correctly? i fear not... [ ... ]
brad> - What should the sunit and swidth settings be during
brad> mount? [ ... ]

These really matter for parity RAID, which does read-modify-write of unaligned sector clusters. But they are rather less essential, to say the least, for non-parity RAID, as they only affect speed, with respect to chunk size, if operations are of the same order of size as the chunk size or smaller.

If the applications in your VM do mostly reads, try switching to RAID10 'f2' software RAID, unless they often do concurrent reads, in which case that's a bad idea.
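For what it's worth, the sunit/swidth arithmetic for the setup above is simple. XFS expresses both in 512-byte sectors, and in a 4-disk near-2 RAID10 only 2 of the drives carry distinct data per stripe. A minimal sketch, assuming a 64 KiB chunk (the helper name is mine):

```python
# Rough sunit/swidth arithmetic for a 4-disk RAID10 with 64 KiB chunks.
# XFS takes both values in 512-byte sectors; a near-2 RAID10 over 4
# drives has 2 data-bearing drives per stripe. As noted above, these
# knobs matter far more for parity RAID than for RAID10.

SECTOR = 512

def xfs_sunit_swidth(chunk_bytes, data_drives):
    sunit = chunk_bytes // SECTOR    # stripe unit, in sectors
    swidth = sunit * data_drives     # full stripe width, in sectors
    return sunit, swidth

print(xfs_sunit_swidth(64 * 1024, 2))   # -> (128, 256)
```

That is, sunit=128 and swidth=256 sectors; mkfs.xfs can also take the equivalent su/sw byte-and-multiplier forms.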
Re: Raid over 48 disks
On Wed, 19 Dec 2007 07:28:20 +1100, Neil Brown [EMAIL PROTECTED] said: [ ... what to do with 48 drive Sun Thumpers ... ]

neilb> I wouldn't create a raid5 or raid6 on all 48 devices.
neilb> RAID5 only survives a single device failure and with that
neilb> many devices, the chance of a second failure before you
neilb> recover becomes appreciable.

That's just one of the many problems; others are:

* If a drive fails, rebuild traffic is going to hit hard, with reading 47 blocks in parallel to compute a new 48th.

* With a parity strip length of 48 it will be that much harder to avoid read-modify before write, as it will be avoidable only for writes of at least 48 blocks aligned on 48-block boundaries. And reading 47 blocks to write one is going to be quite painful. [ ... ]

neilb> RAID10 would be a good option if you are happy with 24
neilb> drives worth of space. [ ... ]

That sounds like the only feasible option (except for the 3-drive case in most cases). Parity RAID does not scale much beyond 3-4 drives.

neilb> Alternately, 8 6-drive RAID5s or 6 8-drive RAID6s, and use
neilb> RAID0 to combine them together. This would give you
neilb> adequate reliability and performance and still a large
neilb> amount of storage space.

That sounds optimistic to me: the reason to do a RAID50 of 8x(5+1) can only be to have a single filesystem, else one could have 8 distinct filesystems, each with a subtree of the whole. With a single filesystem the failure of any one of the 8 RAID5 components of the RAID0 will cause the loss of the whole lot.

So in the 47+1 case a loss of any two drives would lead to complete loss; in the 8x(5+1) case only a loss of two drives in the same RAID5 will. It does not sound like a great improvement to me (especially considering the thoroughly inane practice of building arrays out of disks of the same make and model taken out of the same box). There are also modest improvements in the RMW strip size and in the cost of a rebuild after a single drive loss.
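The two-drive-loss comparison above can be counted directly: in the 47+1 case every pair of drives is fatal, while in the 8x(5+1) case only pairs within the same 6-drive RAID5 are. A quick sketch (note this counts combinations only, and says nothing about the correlated same-batch failures the text warns about):

```python
# Count the two-drive failure combinations that destroy the whole pool,
# for a 47+1 RAID5 versus a RAID0 over 8 x (5+1) RAID5s.

from math import comb

total_pairs = comb(48, 2)        # any 2 of the 48 drives
raid5_47_1_fatal = comb(48, 2)   # 47+1: every pair is fatal
raid50_fatal = 8 * comb(6, 2)    # 8x(5+1): both drives in the same RAID5

print(total_pairs, raid5_47_1_fatal, raid50_fatal)   # -> 1128 1128 120
```

So 120 of 1128 pairs remain fatal; a real improvement on paper, but one that same-make, same-batch drives with correlated failure modes can easily erode.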
Probably the reduction in the RMW strip size is the best improvement. Anyhow, let's assume 0.5TB drives: with 47+1 we get a single 23.5TB filesystem, and with 8x(5+1) we get a 20TB one. With current filesystem technology either size is worrying, for example as to the time needed for an 'fsck'.

In practice RAID5 beyond 3-4 drives seems only useful for almost read-only filesystems where restoring from backups is quick and easy, never mind the 47+1 case or the 8x(5+1) one, and I think that giving some credit even to the latter arrangement is not quite right...
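The capacity figures above are simple arithmetic; checking them, with RAID10 over all 48 drives added for comparison:

```python
# Usable capacity of the layouts discussed, assuming 0.5 TB drives.

drive_tb = 0.5

raid5_47_1 = 47 * drive_tb     # one big 47+1 RAID5: 47 data drives
raid50 = 8 * 5 * drive_tb      # RAID0 over 8 x (5+1): 40 data drives
raid10 = 24 * drive_tb         # 24 mirrored pairs

print(raid5_47_1, raid50, raid10)   # -> 23.5 20.0 12.0
```

So the 8x(5+1) arrangement gives up 3.5TB relative to 47+1, and RAID10 half the raw capacity, which is the usual price of its resilience.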