On 05/11/2012 11:39 AM, Geoff Galitz wrote:
>> On 05/11/2012 01:58 AM, Geoff Galitz wrote:
>>> Are we more likely to avoid a split brain scenario if we have more
>>> than 2 bricks total? IOW, one brick on at least three servers or more?
>>
>> Yes.
>
> One of my guys asked if using the quorum feature in 3.3 makes any
> difference in this scenario compared to 3.2.x.

Heh. I knew there'd be a follow-up. I seriously couldn't think of anything more useful to say before, and couldn't resist the opportunity to balance my usual verbosity with a dose of brevity, but I'll be glad to address further questions as best I can.

Having more than two bricks will enhance your ability to repair a split brain after it has happened. Using the quorum-enforcement feature will make it less likely that you'll get split brain in the first place. Having both is ideal, because quorum enforcement with R=2 is really a bit of a hack. I'm the guy who implemented it, so I can say that. ;)

The problem is that with R=2 a single failure denies true quorum to either side. Because it's a common configuration we handle it anyway, by deciding ties in favor of the first brick in the configured list, but personally I'd rather not see that become common. In fact, some of us are having a very detailed and somewhat heated discussion about ways to avoid using that tie-breaker, at least in the case where the total number of servers is greater than two.

Getting back to the original question, combining R>2 with quorum enforcement is ideal, because in that case a single failure still leaves one side with a true quorum.

The downside is that writes (or other modifying operations) on a client that can't see a quorum of the bricks will get EROFS errors. If your application isn't prepared to handle that, it might not help much, but in most situations it's better than split brain. Just keep telling yourself that every EROFS you see is a potential split-brain disaster that you've avoided.

The insidious thing about split brain is that it can cause errors to remain latent in your system for a long time - and they usually seem to manifest at the most inconvenient times. Having to deal with EROFS instead is far preferable, and I have the scars to prove it.
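
To make the quorum rule concrete, here's a rough Python sketch of the decision as described above: a strict majority of bricks is a true quorum, an exact tie goes to whichever side can see the first brick in the configured list, and anything else gets EROFS on writes. This is only an illustration of the logic, not the actual GlusterFS code; the function names (has_quorum, guarded_write) and the brick names are made up for the example.

```python
import errno


def has_quorum(visible, bricks):
    """Return True if a client seeing `visible` bricks may modify data.

    visible: set of brick names this client can currently reach
    bricks:  ordered list of all bricks in the replica set (hypothetical names)
    """
    up = sum(1 for b in bricks if b in visible)
    if 2 * up > len(bricks):
        return True                  # strict majority: a true quorum
    if 2 * up == len(bricks):
        return bricks[0] in visible  # exact tie: first configured brick decides
    return False


def guarded_write(visible, bricks, do_write):
    """Call do_write() only with quorum; otherwise fail with EROFS."""
    if not has_quorum(visible, bricks):
        raise OSError(errno.EROFS, "no quorum; refusing modifying operation")
    return do_write()


pair = ["a", "b"]       # R=2: one failure leaves a tie
trio = ["a", "b", "c"]  # R=3: one failure still leaves a majority

has_quorum({"a"}, pair)       # True, but only via the tie-breaker
has_quorum({"b"}, pair)       # False: this side would see EROFS
has_quorum({"b", "c"}, trio)  # True: a real majority, no tie-breaker needed
has_quorum({"a"}, trio)       # False: being first brick doesn't help a minority
```

Note how with R=2 the side holding only brick "b" can never write after a partition, while with R=3 any two bricks that can see each other keep going. On the application side, catching OSError and checking for errno.EROFS is enough to tell "no quorum right now" apart from other write failures.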
