I know you're not looking for fixes, but there are a couple of points I would make:
On 02/07/2013, at 6:31 AM, William Seligman <[email protected]> wrote:

> I'm about to write a transition plan for getting rid of high-availability
> on our lab's cluster. Before I do that, I thought I'd put my reasons before
> this group so that:
>
> a) people can exclaim "You fool!" and point out all the stupid things I did
> wrong;
>
> b) sysadmins who are contemplating the switch to HA have additional points
> to add to the pros and cons.
>
> The basic reason why we want to back out of HA is that, in the three years
> since I've implemented an HA cluster at the lab, we have not had a single
> hardware problem for which HA would have been useful. However, we've had
> many instances of lab-wide downtime due to the HA configuration.

Two of the three cases you list in support of this are the HA software being
unable to mask an external problem, rather than being the root cause itself.
Even cman crashing should have been easily recoverable without human
intervention.

> Description: two-node cluster, Scientific Linux 6.2 (=RHEL6.2),
> cman+clvmd+pacemaker, dedicated Ethernet ports for DRBD traffic. I've had
> both primary/secondary and dual-primary configurations. The resources
> pacemaker manages include DRBD and VMs with configuration files and virtual
> disks on the DRBD partition. Detailed package versions and configurations
> are at the end of this post.
>
> Here are some examples of our difficulties. This is not an exhaustive list.
>
> "Mystery crashes"
>
> I'll mention this one first because it's the most recent, and it was the
> straw that broke the camel's back as far as the users were concerned.
>
> Last week, cman crashed, and the cluster stopped working. There was no
> clear message in the logs indicating why. I had no time for archeology,
> since the crash happened in the middle of our working day; I rebooted
> everything and cman started up again just fine.
>
> Problems under heavy server load:
>
> Let's call the two nodes on the cluster A and B. Node A starts running a
> process that does heavy disk writes to the shared DRBD volume. The load on
> A starts to rise. The load on B rises too, more slowly, because the same
> blocks must be written to node B's disk.
>
> Eventually the load on A grows so great that cman+clvmd+pacemaker does not
> respond promptly, and node B stoniths node A. The problem is that the DRBD
> partition on node B ends up marked "Inconsistent". All the other resources
> in the pacemaker configuration depend on DRBD, so none of them are allowed
> to run.
>
> The cluster stays in this non-working state (node A powered off, node B not
> running any resources) until I manually intervene.
>
> "Poisoned resource"
>
> This is the one you can directly attribute to my stupidity.
>
> I add a new resource to the pacemaker configuration. Even though the
> pacemaker configuration is syntactically correct, and even though I think
> I've tested it, in fact the resource cannot run on either node.
>
> The most recent example: I created a new virtual domain and tested it. It
> worked fine. I created the ocf:heartbeat:VirtualDomain resource, verified
> that crm could parse it, and activated the configuration. However, I had
> not actually created the domain for the virtual machine; I had typed
> "virsh create ..." but not "virsh define ...".
>
> So I had a resource that could not run. What I'd want to happen is for the
> "poisoned" resource to fail, for me to see lots of error messages, and for
> the remaining resources to continue to run.
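(Side note on the virsh step that tripped you up, since it bites a lot of
people: "virsh create" only starts a transient domain from an XML file,
whereas "virsh define" registers a persistent definition that libvirt still
knows about after a shutdown or reboot. A minimal illustration, using a
hypothetical guest name and XML path rather than anything from your setup:

  virsh define /path/to/guest.xml   # persistent: the domain exists even while stopped
  virsh start guest                 # boot the defined domain
  virsh create /path/to/guest.xml   # transient: the definition disappears once it stops

Whether the VirtualDomain agent copes with an undefined domain depends on the
agent version and how you point it at the config file, so take this as a
sketch, not a statement about your exact configuration.)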
> What actually happens is that the resource tries to run on both nodes
> alternately an "infinite" number of times (10000 times or whatever the
> value is). Then one of the nodes stoniths the other. The poisoned resource
> still won't run on the remaining node, so that node tries restarting all
> the other resources in the pacemaker configuration. That still won't work.
>
> By this time, usually one of the other resources has failed (possibly
> because it's not designed to be restarted so frequently), and the cluster
> is in a non-working state until I manually intervene.
>
> In this particular case, had we not been running HA, the only problem would
> have been that the incorrectly-initialized domain would not have come up
> after a system reboot. With HA, my error crashed the cluster.

I guess you could have set on-fail=block, which would have inhibited the
recovery process. But HA isn't a magic wand that can make buggy apps less
buggy or improve an admin's memory. And if you _really_ don't trust yourself,
put the rest of the cluster into maintenance-mode so that we won't do
anything to the existing resources. (Rough examples of both at the bottom of
this mail.)

> Let me be clear: I do not claim that HA is without value. My only point is
> that for our particular combination of hardware, software, and available
> sysadmin support (me), high-availability has not been a good investment.
>
> I also acknowledge that I haven't provided logs for these problems to
> corroborate any of the statements I've made. I'm sharing the problems I've
> had, but at this point I'm not asking for fixes.
>
> Turgid details:
>
> # rpm -q kernel drbd pacemaker cman \
>       lvm2 lvm2-cluster resource-agents
> kernel-2.6.32-220.4.1.el6.x86_64
> drbd-8.4.1-1.el6.x86_64
> pacemaker-1.1.6-3.el6.x86_64
> cman-3.0.12.1-23.el6.x86_64
> lvm2-2.02.87-7.el6.x86_64
> lvm2-cluster-2.02.87-7.el6.x86_64
> resource-agents-3.9.2-7.el6.x86_64
>
> /etc/cluster/cluster.conf: http://pastebin.com/qRAxLpkx
> /etc/lvm/lvm.conf: http://pastebin.com/tLyZd09i
> /etc/drbd.d/global_common.conf: http://pastebin.com/H8Kfi2tM
> /etc/drbd.d/admin.res: http://pastebin.com/1GWupJz8
> output of crm configure show: http://pastebin.com/wJaX3Msn
> output of crm configure show xml: http://pastebin.com/gyUUb2hi
>
> --
> William Seligman | Phone: (914) 591-2823
> Nevis Labs, Columbia Univ |
> PO Box 137 |
> Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
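As promised above, roughly what those two suggestions look like in the crm
shell. The resource name, XML path, and timeouts below are made up for the
sake of illustration, not taken from your configuration:

  # Leave a failed resource where it is instead of kicking off recovery:
  crm configure primitive ExampleVM ocf:heartbeat:VirtualDomain \
        params config="/path/to/ExampleVM.xml" \
        op start timeout=120s on-fail=block \
        op monitor interval=30s timeout=60s on-fail=block

  # Or freeze cluster management entirely while you add and test a new
  # resource, then unfreeze once you're happy with it:
  crm configure property maintenance-mode=true
  #   ... add the new resource, test it by hand ...
  crm configure property maintenance-mode=false

Neither of these fixes the underlying mistake, but either would have kept one
bad resource from taking the rest of the cluster down with it. And when a
resource does get stuck in a failed state, the manual intervention usually
doesn't have to be a reboot; clearing its failure history is enough to make
pacemaker try it again:

  crm_mon -1 -f                    # one-shot status, including per-resource fail counts
  crm resource cleanup ExampleVM   # forget ExampleVM's failures so it will be retried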
