What you need is a device that will force the host offline (restart it), such as a hardware watchdog timer (the kernel's softdog would not suffice).
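Here is a rough sketch of the watchdog idea, not a tested setup. It assumes the hardware watchdog shows up as the Linux device /dev/watchdog and that squid listens on 127.0.0.1:3128; those values, the test URL, and the function names are all made up for illustration. The script resets the timer only while a full HTTP transaction through the proxy completes within a timeout, so a hung squid (or a hung box) lets the timer expire and the hardware restarts the host.

```python
import http.client
import time


def transaction_ok(host, port, url, timeout=5.0):
    """Return True only if a complete HTTP response comes back in time.

    This is a real transaction through the proxy, not just a port check.
    """
    start = time.monotonic()
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request("GET", url)      # absolute URL, as a proxy expects
            resp = conn.getresponse()
            resp.read()                   # make sure the body arrives too
            ok = resp.status < 500
        finally:
            conn.close()
    except (OSError, http.client.HTTPException):
        return False                      # refused, timed out, or garbage reply
    return ok and (time.monotonic() - start) <= timeout


def feed_watchdog(check, pet, interval=10.0, max_rounds=None, sleep=time.sleep):
    """Call pet() after every passing check; stop at the first failure.

    Returns False when a check fails (the timer is then left to expire),
    True if max_rounds passing rounds were completed.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        if not check():
            return False
        pet()
        rounds += 1
        sleep(interval)
    return True


def main():
    # Assumed values: adjust the device path, proxy address and test URL.
    with open("/dev/watchdog", "wb", buffering=0) as wd:
        feed_watchdog(
            lambda: transaction_ok("127.0.0.1", 3128, "http://example.com/"),
            lambda: wd.write(b"\0"),      # any write resets the timer
        )
        # Keep the device open: with some drivers, closing it disarms the timer.
        while True:
            time.sleep(60)
```

Calling main() as root starts the loop. The interval must be shorter than the watchdog's own timeout, otherwise a healthy host gets reset as well.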
You can also check the health of the host remotely, but don't rely on a port check alone; you should also set your own service transaction timeout. If the service is not responsive enough (in terms of transactions), you can force the host offline via a STONITH device (e.g. a manageable UPS).

Alternatively, you can use ldirectord on Linux Virtual Server (LVS) directors to monitor the services on the nodes of a load-balancing (active-active) cluster instead of taking the active-passive approach. However, you may need an extra machine to host the director, since it is not advisable to run the director and the (real) service on a single machine. You would also need _another_ machine (making it two) to serve as a failover for the director and thereby eliminate the new SPOF. :)

On 3/18/07, Ramer Ortega <[EMAIL PROTECTED]> wrote:
Hi pluggers,

We are running our proxy server on an aging 1U-chassis Dell server installed with RH9.0. Performance, over the years, has been satisfactory. But during our operations review, the management identified it as a single point of failure: should the box go down, some segments of our network will not be able to reach the internet.

We want to implement a highly available web cache solution using squid. There was an initial attempt (the sys admin originally handling this has resigned) to configure 2 x Dell 1850 with RH ES3.0, heartbeat and ipfail, but the behavior is not very satisfactory: they only detect when the network goes down, and no service fail-over occurs when other resources cause the problem.

I'm reading the Linux Virtual Server tutorial from http://www.ultramonkey.org and I think there's a way to expand the cluster's configuration (to include more resources under cluster control). But I would like to ask if anybody has implemented a Linux-based web-cache cluster. Perhaps I can replicate your approach rather than work from the ground up.

* An active-passive solution is acceptable.
* Content filtering is not necessary (the cluster will be deployed in front of the application servers; this is not for human users).
* The cluster solution should monitor at least the following resources: the network interface, the health of the squid daemon, and the disk partitions used by squid.

Thanks in advance.

-ramer-
_________________________________________________
Philippine Linux Users' Group (PLUG) Mailing List
[email protected] (#PLUG @ irc.free.net.ph)
Read the Guidelines: http://linux.org.ph/lists
Searchable Archives: http://archives.free.net.ph

