On Thu, 23 Dec 2010, [email protected] wrote:
On Thu, 23 Dec 2010, Andrew Hume wrote:
we (some folks at work) have built two backblaze boxes
(roughly speaking, a linux box with 45 2TB drives).
I missed what hardware you are using when I made my first reply
a big issue for you to think about with this box is how you want to slice
up the drives.
this hardware has a high fan-out ratio, so you cannot transfer data to all
the drives at anything close to the I/O capacity of the drives.
with 2TB drives, the rebuild time will be quite significant (if you want
to do it in the background, say 5-10% of your I/O bandwidth, you can
easily be talking about a week in rebuild time)
As a result, you need to worry about a second drive dying before you
finish rebuilding the first one.
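a quick back-of-envelope version of that rebuild-time claim (the 100 MB/s
sustained throughput figure is my assumption for a 2010-era 2TB SATA drive,
and real arrays rebuild slower than the raw drive speed):

```python
# rough rebuild time for a 2 TB drive when the rebuild is throttled
# to a fraction of the drive's I/O bandwidth
drive_bytes = 2e12            # 2 TB drive
full_speed_bytes_s = 100e6    # assumed ~100 MB/s sustained throughput

for frac in (0.05, 0.10):
    secs = drive_bytes / (full_speed_bytes_s * frac)
    print(f"at {frac:.0%} of bandwidth: {secs / 86400:.1f} days")
```

at 10% of bandwidth you get a bit over two days, at 5% nearly five; with
array-wide contention on a 45-drive box, a week is easy to hit.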
the big hard drive failure studies from a few years ago (google, etc.) came
up with drive failure rates of ~10% chance of failure per drive per year.
that is ~1% chance of failure per drive per month, so with 45 drives you are
talking about a very significant chance of having a failure every month. if
they are all in one raid set, the chance of a second drive failing before
you finish rebuilding after the prior failure is significant; with raid6 you
would have to lose three drives at once, and that significantly improves
your chances. I calculated the odds a couple of years ago, and I think they
were something on the order of a 2.5% chance of a second failure during a
rebuild of 1TB drives at 10% bandwidth, but only a 0.025% chance of a third
failure.
with raid 10, every block of data is on two drives, but I believe it then
distributes the stripes to balance the load rather than the drives ending
up as exact mirrors of each other. with the large number of blocks on a
2TB drive, I believe the odds of a second drive taking down _some_ blocks
that were the mirrors of the first drive approach certainty (if raid10
doesn't distribute the mirrored stripes and instead has one drive be an
exact duplicate of another, then the odds get better, as you need to lose
both drives in a pair before you can rebuild one)
I've run 45 drive software raid arrays, and I had multiple instances of
double drive failures over a couple years of operation. As a result, with
that size of an array, I won't do anything short of double-redundancy. I
don't know if you can configure raid10 to keep 3 copies of everything or
not.
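on the 3-copy question: linux md's raid10 does take a copy count in its
layout option, so "n3" keeps three near-copies of every block (at the cost
of usable capacity being total/3). a sketch with hypothetical device names,
scaled down to 6 drives for readability:

```shell
# create a raid10 array with three near-copies of each block ("n3" layout);
# /dev/md0 and /dev/sd[b-g] are example names, adjust for your hardware
mdadm --create /dev/md0 --level=10 --layout=n3 --raid-devices=6 /dev/sd[b-g]
```

with n3 the array survives any two drive failures, which puts it in the same
redundancy class as raid6 while keeping mirror-style write behaviour.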
remember that these disk arrays are not high performance systems, they are
high capacity and cheap, but not high performance.
David Lang
i think i know how to deploy such a beast, but wanted to check
my understanding, which is that mdadm is the tool of choice,
and that for performance and reliability, raid10 is the sweet spot
(specifically, not RAID5).
does anyone have anything specific to say about mdadm,
and the raid it produces, either good or bad?
mdadm has by far the longest track record, with the most raid support. there
is also the dm family of drivers and tools which have some additional
features.
which raid mode you want is highly dependent on what your requirements are
and how much space you are willing to sacrifice.
raid6 is significantly more reliable than raid5, raid1, or raid10, but
suffers the same write performance issues that raid5 has. (note that if you
are in something close to a read-only situation, raid6 can be just as fast
as raid10 while still being more reliable; it's what I use for my splunk
datastore, for example)
raid5 and raid6 really suffer in the situation where you have lots of small,
random writes. large sequential writes have much less overhead.
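the reason small random writes hurt is the read-modify-write cycle: parity
raid has to read the old data and parity before it can write the new ones.
a rough I/O-amplification comparison (idealized counts, ignoring caching
and full-stripe writes):

```python
# disk I/Os needed per small random write, per raid level
ios_per_small_write = {
    "raid10": 2,  # write both mirrored copies
    "raid5": 4,   # read old data + old parity, write new data + new parity
    "raid6": 6,   # same, but with two parity blocks to read and rewrite
}

for level, n in ios_per_small_write.items():
    print(f"{level}: {n} disk I/Os per small random write")
```

large sequential writes avoid most of this because the array can compute
parity over a full stripe without reading anything back first.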
David Lang
_______________________________________________
Tech mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/