Re: first pre-emptive raid
Derek Ragona wrote:
> Mirroring offers redundancy but uses twice the disk space, AND is slower than striping.

Actually, disk reads of a stripe and a mirror are the same speed. Writes to a mirror are the same speed as a single disk (half the speed of a two-disk stripe). If you use something like gmirror and set the algorithm to 'round robin', reads are done from both disks: if you access a 2MB file, 1MB is read from disk0 and the other MB from disk1.

Rudy

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: first pre-emptive raid
At 02:57 AM 6/28/2008, prad wrote:
> our dual pentium3 1GHz with 2G ram and 8 18G scsi drives (server holds 4) should be arriving in about 1 week. my son and i want to set this up as a proper server rather than as a desktopish installation being used as a server. it will serve primarily websites (static html) and email for virtual domains as well as implement dns.
>
> from the handbook, we are learning
> 1. how to install scsi drives (we have some old 2G) from #18.3
> 2. about software raid #18.4.1 (because we don't have a hardware solution and i guess you really don't have anything to figure out with one)
> 3. about geom #19 and vinum #20
> 4. about raid principles in general from wikipedia
>
> after a first reading, some initial questions about items:
>
> 3. it seems that geom just does striping and mirroring, but vinum offers more configurability and is really the preferred choice?
>
> 4.1 with 4 18G drives one thought is to do a raid1, but we really don't want 3 identical copies. is the only way to have 2 36G mirrors, by using raid0+1 or raid1+0?
>
> 4.2 another possibility is to do raid0, but is that ever wise unless you desperately need the space since in our situation you run a 1/4 chance of going down completely?
>
> 4.3 is striping or mirroring faster as far as i/o goes (or does the difference really matter)? i would have thought the former, but the handbook says "Striping requires somewhat more effort to locate the data, and it can cause additional I/O load where a transfer is spread over multiple disks" #20.3
>
> 4.4 vinum introduces raid5 with striping and data integrity, but exactly what are the parity blocks? furthermore, since the data is striped, how can the parity blocks rebuild anything from a hard drive that has crashed? surely, the data from each drive can't be duplicated somehow over all the drives though #20.5.2 Redundant Data Storage has me scratching my head! if there is complete mirroring, wouldn't the disk space be cut in half as with raid1?
>
> this is all very interesting and very new to us.

Striping alone offers speed but no data protection. Mirroring offers redundancy but uses twice the disk space, AND is slower than striping. Mirroring + striping offers the best of both: speed with redundancy. However, this configuration requires drive arrays of at least 4 drives, and usually drives are added in 4's. For complete safety you should have two drives in the array as hot spares, as you can lose two drives.

RAID 5 attempts to offer data protection by saving parity checksums in another location. However, it is possible to have both the data area and the parity fail, making a rebuild impossible. This can happen if two drives fail. In newer hardware offering RAID 6, the parity is saved to 2 different drives, making that failure less likely.

I would suggest you do either mirrored or mirrored + striped, as these are older drives you are using. Those of us who have lost drives have seen them go in various ways: they die on their own of age, power problems will kill drives (from bad AC power AND/OR bad power supplies), heat of course will kill drives, etc. Since most drives are installed at the same time, often from the same manufacturer's lot, if there is any sensitivity or defect, you can easily lose multiple drives that way as well.

-Derek
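The space trade-offs discussed above can be put into numbers for the 4 x 18G case from the original post. This is only a rough sketch: the function name `usable_gb` is made up for illustration, all disks are assumed identical and in a single array, and metadata overhead is ignored.

```python
def usable_gb(level, disks, disk_gb):
    """Usable capacity for a few common layouts of identical disks."""
    if level == "raid0":
        return disks * disk_gb          # striping only, no redundancy
    if level == "raid1":
        return disk_gb                  # every disk holds the same copy
    if level == "raid10":
        return disks // 2 * disk_gb     # two-way mirrors, then striped
    if level == "raid5":
        return (disks - 1) * disk_gb    # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * disk_gb    # two disks' worth of parity
    raise ValueError(f"unknown level: {level}")


for level in ("raid0", "raid1", "raid10", "raid5", "raid6"):
    print(level, usable_gb(level, disks=4, disk_gb=18))
```

With four 18G drives this gives 72G for a plain stripe, 36G for mirrored + striped, and 54G for RAID 5, which is the "space efficient" argument for parity layouts.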
Re: first pre-emptive raid
prad wrote:
> why in diagram 20-3 of the handbook do they show 2 parity blocks though for disk3 and disk4? why would you ever have more than 1 for any single disk?

The diagram shows a RAID5 made out of a number of disk stripes spread across 4 physical drives. It's the /stripes/ that are the basic building block, not the disks in that implementation.

What RAID5 does is take N-1 data blocks and add an Nth parity block as a stripe across the N drives. ie. reading across the diagram, the first stripe consists of the top row of 3 data blocks on the first three disks plus the 1 parity block on the top row on the 4th disk. The next stripe (the 2nd row down) arranges it as data data parity data, and so forth for the succeeding stripes. The parity blocks are staggered across the drives to even out the traffic levels on each spindle.

As the diagram shows, it's not always possible to have an exactly equal balance of parity and data blocks per drive. The diagram is simplistic though -- in a real system the stripe size would be something like N*128k -- which means that there would be millions of stripes on typical GB sized disks, so the differences between drives are entirely negligible.

You have to have at least three disks to make a RAID5, and while theoretically you can have as many disks as you want, there's a practical limit somewhere around 12-13 drives. [Remember, you have to calculate the XOR of all the data blocks in the stripe on any IO operation, and with so many drives, that rapidly becomes onerous.] The sweet spot seems to be around 7 drives.

There is a variant not supported by vinum called RAID6 -- this is essentially RAID5 with a second parity block per stripe (computed differently from the first, rather than being a plain copy). It's relatively new and available under FreeBSD only by using hardware cards (Areca, 3ware etc.) All this does is allow the file system to survive loss of any *two* disks -- something that is more likely to occur than you might think, particularly as the number of disks per RAID goes up.
Cheers,

Matthew

--
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard, Flat 3, Ramsgate, Kent, CT11 9PW
PGP: http://www.infracaninophile.co.uk/pgpkey
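The staggered parity placement described in this reply can be sketched in a few lines of Python. The rotation rule used here (parity starts on the last disk and moves one disk to the left per stripe) is an assumption chosen to match the "data data data parity" and "data data parity data" rows mentioned above; real implementations may rotate differently. The function names and the choice of 6 stripes are purely illustrative.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one row per stripe; each row lists what every disk holds."""
    rows = []
    for stripe in range(num_stripes):
        # Assumed rotation: parity starts on the last disk and moves one
        # disk to the left with each stripe, wrapping around.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        rows.append(["P" if disk == parity_disk else "D"
                     for disk in range(num_disks)])
    return rows


def parity_blocks_per_disk(rows):
    """Count how many parity blocks ended up on each disk."""
    return [sum(row[disk] == "P" for row in rows)
            for disk in range(len(rows[0]))]


layout = raid5_layout(num_disks=4, num_stripes=6)
for row in layout:
    print(" ".join(row))
print(parity_blocks_per_disk(layout))
```

With 6 stripes over 4 disks the last two disks end up holding 2 parity blocks each and the first two only 1, simply because 6 is not a multiple of 4. Whenever the stripe count is a multiple of the disk count the totals come out equal, and over millions of stripes a leftover of at most one block per disk is negligible.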
Re: first pre-emptive raid
On Sat, 28 Jun 2008 13:02:20 +0200 Pieter de Goeje <[EMAIL PROTECTED]> wrote:
> Parity is calculated using the following formula:
>

pieter, that is absolutely beautiful!! it was really bothering me how you can recover data that really wasn't 'there'. my son and i just worked out the mechanism with some nibbles:

    0110 d0
    0011 d1
    0010 d2
    0111 p

so

    0111 p     0111 p     0111 p
    0011 d1    0110 d0    0110 d0
    0010 d2    0010 d2    0011 d1
    0110 d0    0011 d1    0010 d2

and just extend the concept from nibbles to blocks.

why in diagram 20-3 of the handbook do they show 2 parity blocks though for disk3 and disk4? why would you ever have more than 1 for any single disk?

--
In friendship,
prad
... with you on your journey
Towards Freedom http://www.towardsfreedom.com (website)
Information, Inspiration, Imagination - truly a site for soaring I's
Re: first pre-emptive raid
On Saturday 28 June 2008, prad wrote:
> 3. it seems that geom just does striping and mirroring, but vinum
> offers more configurability and is really the preferred choice?

Geom also does raid 3 and disk concatenation (JBOD) (see the geom(8) manpage). I think geom is preferred because it is better tested (in later versions of FreeBSD) and easier to set up.

> 4.1 with 4 18G drives one thought is to do a raid1, but we really
> don't want 3 identical copies. is the only way to have 2 36G mirrors,
> by using raid0+1 or raid1+0?

If you want one logical "disk" you could also mirror both pairs and use gconcat to add their sizes together.

> 4.2 another possibility is to do raid0, but is that ever wise unless
> you desperately need the space since in our situation you run a 1/4
> chance of going down completely?

Indeed, the chances have quadrupled.

> 4.3 is striping or mirroring faster as far as i/o goes (or does the
> difference really matter)? i would have thought the former, but the
> handbook says "Striping requires somewhat more effort to locate the
> data, and it can cause additional I/O load where a transfer is spread
> over multiple disks" #20.3

Both are faster when reading data. Raid 0 is faster when writing. When data blocks are spread over N disks, it is possible to achieve sequential read speeds N times faster than a simple JBOD configuration would. However, the system also needs N times more bandwidth to the disks to achieve this. If the disks are on a limited speed shared bus, one could imagine that the overhead of the extra I/O commands needed to do raid0 actually impairs performance.

> 4.4 vinum introduces raid5 with striping and data integrity, but
> exactly what are the parity blocks? furthermore, since the data is
> striped, how can the parity blocks rebuild anything from a hard drive
> that has crashed? surely, the data from each drive can't be duplicated
> somehow over all the drives though #20.5.2 Redundant Data Storage has
> me scratching my head! if there is complete mirroring, wouldn't the
> disk space be cut in half as with raid1?

Parity is calculated using the following formula:

    parity = data0 XOR data1 XOR data2

where data0..2 are datablocks striped over the disks; thus we need four disks to hold our data (3 for data, 1 for parity). Now the disk with datablock 0 dies. To get the data back we simply need to solve the previous formula for data0:

    data0 = parity XOR data1 XOR data2

and for data1, 2 (in case the other disks die):

    data1 = parity XOR data0 XOR data2
    data2 = parity XOR data0 XOR data1

This scales easily with bigger numbers of disks.

Another use of parity data is to check data integrity. If for some reason the calculated parity of a "stripe" no longer matches the on-disk parity data, then there must be an error.

Note that it is easy to see the similarity of raid0 and raid5; basically raid5 is raid0 plus extra parity data for redundancy, resulting in being able to recover from 1 disk failure.

--
Pieter de Goeje
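The parity formulas above can be tried out directly. The following is a small sketch, not how vinum or any real RAID driver is implemented: the helper name `xor_blocks` and the two-byte block contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks striped over three disks (arbitrary example contents).
data0, data1, data2 = b"\x06\xa5", b"\x03\x5a", b"\x02\xff"
parity = xor_blocks(data0, data1, data2)

# The disk holding data0 dies: rebuild it from the survivors.
rebuilt = xor_blocks(parity, data1, data2)
print(rebuilt == data0)

# Integrity check: XOR of every block in a healthy stripe is all zeros.
print(xor_blocks(data0, data1, data2, parity) == bytes(len(parity)))
```

The same `xor_blocks` call recovers any one missing block, because XOR-ing a value in twice cancels it out. It cannot recover two missing blocks from one stripe, which is why a single-parity array only tolerates one disk failure.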
Re: first pre-emptive raid
prad wrote:
> 4.1 with 4 18G drives one thought is to do a raid1, but we really don't want 3 identical copies. is the only way to have 2 36G mirrors, by using raid0+1 or raid1+0?

raid10 strongly preferred -- ie. you make a series of raid1 pairs and then stripe across them. This is high performance and resilient to disk failures -- it can conceivably survive loss of half your drives so long as it's only one from each RAID1 pair.

> 4.2 another possibility is to do raid0, but is that ever wise unless you desperately need the space since in our situation you run a 1/4 chance of going down completely?

Anything that involves raid0 over several raw drives is going to be an Achilles heel. Loss of any one disk out of a raid0 disables the whole stripe.

> 4.3 is striping or mirroring faster as far as i/o goes (or does the difference really matter)? i would have thought the former, but the handbook says "Striping requires somewhat more effort to locate the data, and it can cause additional I/O load where a transfer is spread over multiple disks" #20.3

Mirroring tends to make reads a bit faster (because there are two disks to spread the IO between) and writes slightly slower (because the write has to hit both platters). On the whole, however, the performance difference between a mirrored pair and a single drive is probably not noticeable[*].

Striping across drives /generally/ gives you a big performance boost -- it depends really on your traffic patterns. If you're doing lots of small parallel IOs randomly distributed across the whole filesystem then striping is a really good choice. (Most RDBMSes produce this sort of pattern, and so may things like web or mail servers.) If you're streaming large quantities of data sequentially into or out of a file, then striping isn't bad, but you might find it worthwhile to consider more space efficient geometries like RAID5[+], where this traffic pattern minimises the overhead of the extra processing involved.
The big deal with any sort of RAID is not how long it takes to work out which disk the data is on or anything like that. That's an operation that completes on the time scale of CPU events: ie nanoseconds. The big stumbling block is always waiting for the disk to rotate, an operation which occurs on the timescale of milliseconds. The more spindles your IO request can be spread over, the more that delay can be parallelized between them and the faster the ultimate result.

Cheers,

Matthew

[*] Unless you adopt a highly sub-optimal configuration like mirroring the master with the slave on the same IDE bus.

[+] but don't expect any sort of sparkling performance out of RAID5 unless you have a decent hardware controller card with plenty of cache RAM.

--
Dr Matthew J Seaman MA, D.Phil.
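The resilience claims in this reply (raid0 dies with any one disk; raid10 survives as long as no mirror pair loses both members) can be checked by brute force for the 4-disk case. This is an illustrative sketch only: the pairing of disks 0+1 and 2+3 and all function names are assumptions, and every combination of failed disks is treated as equally likely.

```python
from itertools import combinations

DISKS = 4
MIRROR_PAIRS = [(0, 1), (2, 3)]  # assumed pairing for a 4-disk RAID10


def raid0_survives(failed):
    """A stripe over raw disks is lost as soon as any disk fails."""
    return len(failed) == 0


def raid10_survives(failed):
    """RAID10 survives while every mirror pair keeps at least one disk."""
    return all(not (a in failed and b in failed) for a, b in MIRROR_PAIRS)


def surviving_fraction(survives, num_failed):
    """Fraction of all possible num_failed-disk losses the array survives."""
    cases = [set(c) for c in combinations(range(DISKS), num_failed)]
    return sum(survives(c) for c in cases) / len(cases)


for n in (1, 2):
    print(n, surviving_fraction(raid0_survives, n),
          surviving_fraction(raid10_survives, n))
```

For a single failed disk the stripe never survives and the raid10 always does. For two failed disks, raid10 survives 4 of the 6 possible combinations: only the two cases where both halves of the same mirror die are fatal.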
first pre-emptive raid
our dual pentium3 1GHz with 2G ram and 8 18G scsi drives (server holds 4) should be arriving in about 1 week. my son and i want to set this up as a proper server rather than as a desktopish installation being used as a server. it will serve primarily websites (static html) and email for virtual domains as well as implement dns.

from the handbook, we are learning
1. how to install scsi drives (we have some old 2G) from #18.3
2. about software raid #18.4.1 (because we don't have a hardware solution and i guess you really don't have anything to figure out with one)
3. about geom #19 and vinum #20
4. about raid principles in general from wikipedia

after a first reading, some initial questions about items:

3. it seems that geom just does striping and mirroring, but vinum offers more configurability and is really the preferred choice?

4.1 with 4 18G drives one thought is to do a raid1, but we really don't want 3 identical copies. is the only way to have 2 36G mirrors, by using raid0+1 or raid1+0?

4.2 another possibility is to do raid0, but is that ever wise unless you desperately need the space since in our situation you run a 1/4 chance of going down completely?

4.3 is striping or mirroring faster as far as i/o goes (or does the difference really matter)? i would have thought the former, but the handbook says "Striping requires somewhat more effort to locate the data, and it can cause additional I/O load where a transfer is spread over multiple disks" #20.3

4.4 vinum introduces raid5 with striping and data integrity, but exactly what are the parity blocks? furthermore, since the data is striped, how can the parity blocks rebuild anything from a hard drive that has crashed? surely, the data from each drive can't be duplicated somehow over all the drives though #20.5.2 Redundant Data Storage has me scratching my head! if there is complete mirroring, wouldn't the disk space be cut in half as with raid1?

this is all very interesting and very new to us.
--
In friendship,
prad