On Wed, 22 Aug 2012, Jon Heese wrote:

> On 21 Aug 2012, at 17:29, David Lang <[email protected]> wrote:
>> On Tue, 21 Aug 2012, Jon Heese wrote:
>>> Feel free to keep discussing alternatives, but I am not at liberty to
>>> change this system from the current Heartbeat/Pacemaker/CRM
>>> architecture.
>>>
>>> Does anyone have any ideas about making my setup do what I'm asking?
>>> Any comments on my idea of switching to OCF and hacking the resource
>>> script to never run the "stop" command (except maybe on shutdown or
>>> some other cases that I haven't thought of yet ;))?
>>
>> The simple answer is to just not have heartbeat manage Apache.
>>
>> Start Apache from the standard system-wide init scripts (or systemd,
>> upstart, etc)
>>
>> Heartbeat doesn't need to know that Apache exists for this to work.
>>
>> What you loose is the ability for heartbeat to notice that Apache is
>> down and failover from that system, but in my experience that's not a
>> very valuable capability in the first place (the number of outages
>> caused by Apache going down is dwarfed by the number of outages caused
>> by the application not working, when Apache is healthy)
>
> Well, see, that's the thing: These are just reverse-proxy Apache servers...
> Apache *is* the "application". If the back-end application servers fail in
> any way, the proxy servers don't (and shouldn't) care. The only things this
> cluster is concerned with is the shared IP and the apache service itself.
>
> We originally built this cluster without the apache resource, but then we had
> a false positive on our service monitoring app that scared us into thinking
> that Apache died on the active node yesterday morning. This made us decide
> to add the apache resource. But now we're in a quandary between here and
> there...
>
> I renew my call for anyone who knows of a way to leave a resource running on
> all nodes at once. Are there any developers on this list that may know of
> more esoteric options for the OCF and/or LSB resource types, or do I have to
> join the developer list for that?

I believe that in Pacemaker you can configure a resource to run on multiple
systems in an active/active configuration. But I don't know how you would
combine that with a separate IP address resource in an active/failover
configuration (you would have to add a dependency to the IP resource to say
that it can only run on a node that has an active Apache resource).

The trouble is that the resource management spec requires that the startup
script be able to run the sequence

  start monitor start monitor stop monitor stop monitor

and get the correct monitor output (started or stopped) each time.

If you change stop to be a no-op, you also need to tinker with monitor so that
it returns a "stopped" status when heartbeat doesn't want the process running,
but you still need to properly detect when Apache has stopped while heartbeat
wants it to be running.

You also have the problem that by doing this, you have eliminated any way for
the system to do a graceful shutdown of Apache when you want to shut the entire
system down (the system will tell heartbeat to stop, heartbeat will tell Apache
to stop, and you have configured Apache to ignore this).

You are probably better off creating a heartbeat resource along the lines of
"iamhealthy" that toggles the state of some status file and reports its status
based on that file. Then you set up other monitoring of the box that overrides
this status file if it thinks that there is anything wrong. This monitoring can
check that Apache has died, that network connectivity has been lost, that the
box is out of disk space or out of memory, or any other health checks that you
find that you want.

David Lang
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
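
P.S. For what it's worth, here is a rough sketch of the "iamhealthy" idea in
Python. It is only an illustration under stated assumptions, not a tested
resource agent: the class name, the flag file names, and the mark_unhealthy()
and mark_healthy() helpers are all made up for this example. The return values
follow the usual OCF convention (0 for running, 7 for cleanly stopped, 1 for a
generic failure). A real agent would normally be a shell script dropped into
the resource agent directory; this just shows the logic and that the
start/monitor/stop sequence behaves.

```python
import os
import tempfile

# Exit codes from the OCF resource agent convention.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


class IAmHealthy:
    """Hypothetical status-file resource; all names here are illustrative."""

    def __init__(self, state_dir):
        # "started" is owned by the cluster manager (start/stop actions).
        self.started_flag = os.path.join(state_dir, "iamhealthy.started")
        # "unhealthy" is owned by whatever external checks you run on the box.
        self.unhealthy_flag = os.path.join(state_dir, "iamhealthy.unhealthy")

    def start(self):
        # Idempotent: starting an already-started resource still succeeds.
        open(self.started_flag, "w").close()
        return OCF_SUCCESS

    def stop(self):
        # Idempotent: stopping an already-stopped resource still succeeds.
        if os.path.exists(self.started_flag):
            os.remove(self.started_flag)
        return OCF_SUCCESS

    def monitor(self):
        if not os.path.exists(self.started_flag):
            return OCF_NOT_RUNNING      # cleanly stopped
        if os.path.exists(self.unhealthy_flag):
            return OCF_ERR_GENERIC      # started, but a health check objected
        return OCF_SUCCESS              # started and healthy

    def mark_unhealthy(self, reason):
        # Called by external monitoring (Apache died, disk full, ...).
        with open(self.unhealthy_flag, "w") as f:
            f.write(reason + "\n")

    def mark_healthy(self):
        if os.path.exists(self.unhealthy_flag):
            os.remove(self.unhealthy_flag)


def run_sequence(resource):
    """Run start, start, stop, stop with a monitor after each action."""
    results = []
    for action in (resource.start, resource.start,
                   resource.stop, resource.stop):
        action()
        results.append(resource.monitor())
    return results


if __name__ == "__main__":
    res = IAmHealthy(tempfile.mkdtemp())
    print(run_sequence(res))
```

Apache itself stays under the normal init scripts in this arrangement, so it
keeps running on every node and still gets a graceful stop at system shutdown.
The cluster only ever starts and stops the flag file, and a failed monitor on
the node holding the shared IP is what would trigger the failover.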
