On 02.01.2014, at 18:06, Justin Dossey <[email protected]> wrote:

> 1) It depends on the number of drives per chassis, your tolerance for risk,
> and the speed of rebuilds. I'd recommend doing a couple of test rebuilds
> with different array sizes to see how fast your controller and drives can
> complete them, and then comparing the rebuild completion times to your SLA--
> if a rebuild takes two days to complete, is that good enough for you
> (especially given the chances of another failure occurring during the
> rebuild)? All other things being equal, the smaller the array, the faster
> the rebuild, but the more "wasted" space in the array. Also note that many
> controllers have tunable rebuild algorithms, so you can divert more resources
> to completing rebuilds faster at the cost of performance. One data point
> from me: my last rebuild of a 16x2T SATA RAID-6 array took about 58 hours.
>
> 2) My understanding is that the way file reads work on GlusterFS, read
> requests are sent to all nodes and the data is used from the first node to
> respond to the request. So if one node is busier than others, it is likely
> to respond more slowly and thus receive a lower portion of the read activity,
> as long as the files being read are larger than a single response.
>
>
> On Wed, Jan 1, 2014 at 12:21 PM, Fredrik Häll <[email protected]> wrote:
>> Thanks for all the input!
>>
>> It sure sounds like RAID-6 for disk failures and Gluster for the spanning
>> and high-level redundancy parts is a good candidate.
>>
>> Some final questions:
>>
>> 1) How big can one comfortably go in terms of RAID-6 array size, given 4TB
>> SATA/SAS drives? On the one hand, much points to keeping as few RAIDs as
>> possible, since disk usage is then maximized. But there are complications
>> in terms of rebuild times and the risk of losing the 2 drives. Hot spares
>> may also be an option. Your reflections?
>>
>> 2) Is there any intelligence or automation in Gluster that makes smart use
>> of dual (or multiple) replicas? Say that I have 2 replicas, and one of them
>> is spending some effort on a RAID rebuild, is there functionality for
>> manually or automatically preferring the other (healthy) replica?
>>
>> Best regards,
>>
>> Fredrik
>>
>>
>> On Tue, Dec 31, 2013 at 10:27 PM, Justin Dossey <[email protected]> wrote:
>>> Yes, RAID-6 is better than RAID-5 in most cases. I agonized over the
>>> decision to deploy 5 for my Gluster cluster, and the reason I went with 5
>>> is that the number of drives in the brick was (IMO) acceptably low. I use
>>> 6 for my 16-drive arrays, which means I have to lose 3 disks out of the 16
>>> to lose my data. With 2x8-drive arrays in 5, I also have to lose 3 disks
>>> to lose data, but if I do lose data, I only lose 50% of the data on the
>>> server, and all these bricks are distribute-replicate anyway, so I wouldn't
>>> actually lose any data at all. That consideration, paired with the fact
>>> that I keep spares on hand and replace failed drives within a day or two,
>>> means that I'm okay with running 2x RAID-5 instead of 1x RAID-6. (2x
>>> RAID-6 would put me below my storage target, forcing additional hardware
>>> purchases.)
>>>
>>> I suppose the short answer is "evaluate your storage needs carefully."
>>>
>>>
>>> On Tue, Dec 31, 2013 at 11:19 AM, James <[email protected]> wrote:
>>>> On Tue, Dec 31, 2013 at 11:33 AM, Justin Dossey <[email protected]> wrote:
>>>> >
>>>> > Yes, I'd recommend sticking with RAID in addition to GlusterFS. The
>>>> > cluster I'm mid-build on (it's a live migration) is 18x RAID-5 bricks on
>>>> > 9 servers. Each RAID-5 brick is 8 2T drives, so about 13T usable.
>>>> > It's better to deal with a RAID when a disk fails than to have to pull
>>>> > and replace the brick, and I believe Red Hat's official recommendation
>>>> > is still to minimize the number of bricks per server (which makes me a
>>>> > rebel for having two, I suppose). 9 (slow-ish, SATA RAID) servers
>>>> > easily saturate 1Gbit on a busy day.
>>>>
>>>>
>>>> I think Red Hat also recommends RAID6 instead of RAID5. In any case, I
>>>> sure do, at least.
>>>>
>>>> James
>>>>
>>>>
>>>>
>>>> On Mon, Dec 30, 2013 at 5:54 AM, bernhard glomm
>>>> <[email protected]> wrote:
>>>> >
>>>> > Some years ago I had a similar task.
>>>> > I did:
>>>> > - We had disk arrays with 24 slots, with optional 4 JBODs (each 24
>>>> > slots) stacked on top, dual LWL (fiber-optic) controller 4GB (costs ;-)
>>>> > - creating RAIDs (6) with not more than 7 disks each
>>>> > - as far as I remember I had one hot spare per each 4 RAIDs
>>>> > - connecting as many of these RAID bricks together with striped
>>>> > glusterfs as needed
>>>> > - as for replication, I was planning for an offsite duplicate of this
>>>> > architecture and,
>>>> > because losing data was REALLY not an option, writing it all off at a
>>>> > second offsite location onto LTFS tapes.
>>>> > As the original version of the LTFS library edition was far too
>>>> > expensive for us,
>>>> > I found an alternative solution that does the same thing
>>>> > but for a much more reasonable price. LTFS is still a big thing in
>>>> > digital archiving.
>>>> > Give me a note if you would like more details on that.
>>>> >
>>>> > - This way I could fsck all (not too big) RAIDs in parallel (sped things
>>>> > up)
>>>> > - proper robustness against disk failure
>>>> > - space that could grow infinite in size (add more and bigger disks) and
>>>> > keep up with access speed (add more servers) at a pretty foreseeable
>>>> > price
>>>> > - LTFS in the vault provided just the finishing touch, having data
>>>> > accessible even if two out of three sites are down,
>>>> > at a reasonable price (for instance, no heat problem at the tape
>>>> > location)
>>>> > Nowadays I would go for the same approach, except with zfs raidz3 bricks
>>>> > (at least do a thorough test on it)
>>>> > instead of (small) hardware RAID bricks.
>>>> > As for simplicity and robustness, I wouldn't like to end up with several
>>>> > hundred glusterfs bricks, each on one individual disk,
>>>> > but would rather leave disk failure prevention either to hardware RAID
>>>> > or zfs and use gluster to connect these bricks into the
>>>> > fs size I need (and for mirroring the whole thing to a second site if
>>>> > needed)
>>>> > hth
>>>> > Bernhard
>>>> >
>>>> >
>>>> >
>>>> > Bernhard Glomm
>>>> > IT Administration
>>>> >
>>>> > Phone: +49 (30) 86880 134
>>>> > Fax: +49 (30) 86880 100
>>>> > Skype: bernhard.glomm.ecologic
>>>> > Ecologic Institut gemeinnützige GmbH | Pfalzburger Str. 43/44 | 10717
>>>> > Berlin | Germany
>>>> > GF: R. Andreas Kraemer | AG: Charlottenburg HRB 57947 | USt/VAT-IdNr.:
>>>> > DE811963464
>>>> > Ecologic™ is a Trade Mark (TM) of Ecologic Institut gemeinnützige GmbH
>>>> > ________________________________
>>>> >
>>>> > On Dec 25, 2013, at 8:47 PM, Fredrik Häll <[email protected]> wrote:
>>>> >
>>>> > I am new to Gluster, but so far it seems very attractive for my needs. I
>>>> > am trying to assess its suitability for a cost-efficient storage problem
>>>> > I am tackling. Hopefully someone can help me find how to best solve my
>>>> > problem.
>>>> >
>>>> > Capacity:
>>>> > Start with around 0.5PB usable
>>>> >
>>>> > Redundancy:
>>>> > 2 replicas with non-RAID is not sufficient. Either 3 replicas with
>>>> > non-RAID or some combination of 2 replicas and RAID?
>>>> >
>>>> > File types:
>>>> > Large files, around 400-1500MB each.
>>>> >
>>>> > Usage pattern:
>>>> > Archive (not sure if this matches nearline or not..) with files being
>>>> > added at around 200-300GB/day (300-400 files/day). Very few reads, on
>>>> > the order of 10 file accesses per day. Concurrent reads highly unlikely.
>>>> >
>>>> > The main two factors for me are cost and redundancy. Losing data is not
>>>> > an option, this being an archive solution. Cost/usable TB is the other
>>>> > key factor, as we see growth estimates of 100-500TB/year.
>>>> >
>>>> > Looking just at $/TB, a RAID-based approach sounds more efficient to me.
>>>> > But RAID rebuild times with large arrays of large-capacity drives sound
>>>> > really scary. Not sure if something smart can be done, since we will
>>>> > still have a replica left during the rebuild?
>>>> >
>>>> > So, any suggestions on what would be possible and cost-efficient
>>>> > solutions?
>>>> >
>>>> > - Any experience on dense servers, what is advisable? 24/36/50/60 slots?
>>>> > - SAS expanders/storage pods?
>>>> > - RAID vs non-RAID?
>>>> > - Number of replicas etc?
>>>> >
>>>> > Best,
>>>> >
>>>> > Fredrik
>>>> > _______________________________________________
>>>> > Gluster-users mailing list
>>>> > [email protected]
>>>> > http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>>
>>>> > --
>>>> > Justin Dossey
>>>> > CTO, PodOmatic
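
To make the capacity and rebuild-time trade-offs discussed above concrete, here is a rough Python sketch. It only redoes the arithmetic from the thread (2T drives, 8- and 16-drive arrays, the 58-hour rebuild data point). The names `Layout`, `tb_to_tib` and `scaled_rebuild_hours` are made up for illustration, and the rebuild estimate assumes that rebuild time scales linearly with drive size at an unchanged rebuild rate, which a real controller may not honor. Test rebuilds on the actual hardware, as suggested above, remain the better guide.

```python
from dataclasses import dataclass

# Drives' worth of capacity spent on parity per array.
PARITY_DRIVES = {"raid5": 1, "raid6": 2}


@dataclass(frozen=True)
class Layout:
    """Hypothetical description of the RAID arrays in one server."""
    level: str             # "raid5" or "raid6"
    drives_per_array: int
    arrays: int
    drive_tb: float        # decimal terabytes per drive

    @property
    def usable_tb(self) -> float:
        # Usable space = data drives per array * arrays * drive size.
        data_drives = self.drives_per_array - PARITY_DRIVES[self.level]
        return self.arrays * data_drives * self.drive_tb

    @property
    def failures_tolerated_per_array(self) -> int:
        # Drive failures a single array survives without data loss.
        return PARITY_DRIVES[self.level]


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes to binary tebibytes."""
    return tb * 1e12 / 2**40


def scaled_rebuild_hours(observed_hours: float,
                         observed_drive_tb: float,
                         target_drive_tb: float) -> float:
    """Extrapolate a rebuild time to another drive size.

    ASSUMPTION: linear scaling with drive capacity at the same rate.
    """
    return observed_hours * target_drive_tb / observed_drive_tb


if __name__ == "__main__":
    options = {
        "1x16 RAID-6": Layout("raid6", 16, 1, 2.0),
        "2x8 RAID-5": Layout("raid5", 8, 2, 2.0),
        "2x8 RAID-6": Layout("raid6", 8, 2, 2.0),
    }
    for name, layout in options.items():
        print(f"{name}: {layout.usable_tb:.0f} TB usable, survives "
              f"{layout.failures_tolerated_per_array} failure(s) per array")

    # One 8x2T RAID-5 brick, in TiB (compare "about 13T usable" above).
    brick = Layout("raid5", 8, 1, 2.0)
    print(f"single brick: {tb_to_tib(brick.usable_tb):.1f} TiB")

    # 58 h was reported for 2T drives; rough figure for 4TB drives.
    print(f"4TB rebuild estimate: {scaled_rebuild_hours(58, 2.0, 4.0):.0f} h")
```

Under these assumptions, 1x16 RAID-6 and 2x8 RAID-5 come out at the same usable capacity, while 2x8 RAID-6 gives up two more drives' worth of space, which matches the remark that 2x RAID-6 would fall below the storage target.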
