On 8/27/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> I would like to do these steps:
>
> As soon as the process is declared as not running (misconfiguration etc...),
> launch the "monit heartbeat stop" command in order to change the active node.
> Any idea?

Ok, now I clearly see what you want to do ;)

First I must say you've missed the whole concept: when you want to make a failover, you should never shut down the service that enables communication between the nodes of the server. Use monit to monitor the services that run locally on each node, and heartbeat to let the nodes communicate with each other and ask each other to fail over in case of problems.

Heartbeat can communicate over an ethernet or a serial connection. The usual way is to configure it to communicate over ethernet and to use the serial line as a redundant link.

The failover condition can be defined by many parameters. The basic one is the state of the services under its control (the scripts defined in /etc/ha.d/resource.d). Heartbeat also checks whether the other node of the server is "alive", and it can monitor the node's visibility to the outside world: it has a list of IP addresses that it pings (a ping group), so if the active node receives fewer responses than the standby, it will perform the failover. Heartbeat must always be running so that the nodes can ask each other to fail over.

So here is a small description of how to make monit and heartbeat live together ;) This is the solution that I've used (both nodes should be configured this way).

The idea is to create two service groups: the first enables the basic functionality of the node, and the second contains the services under heartbeat's command.

1. Configure two groups of services in monitrc:

   local: the group that enables the basic functionality of the node (postfix, heartbeat, some mounts if needed). In this group you should have at least postfix (so there is an active MTA and monit can send e-mail alerts) and heartbeat, which monitors the state of the services and the whole node.

   cluster: the applications which are monitored (apache, mysql, ...).

   Put all the start scripts in /etc/ha.d/resource.d, plus one script that starts the group cluster in monit, something like:

       /usr/bin/monit start -g cluster

2. Configure heartbeat to execute the script that starts monit's group cluster when it starts.

3. Configure monit to start in respawn mode at boot; set this in /etc/inittab. Start monit with group "local" in respawn mode, so if it crashes it will be restarted by init.

What do we gain from this setup?

At boot time, when monit starts, it brings up the basic functionality of the server. Then heartbeat takes over control and checks the situation of the nodes (the two heartbeats, one running on each node, communicate with each other to decide who will take the resources). Monit monitors the services that run locally, and heartbeat checks the state of the services and makes a failover if needed.

If you type:

    monit summary

the active node will show you that all services are running, and the passive node will show you that all the services in the local group are running and the services from group cluster are not monitored.

When problems appear on the active node, it will ask the standby to take over the resources, and the failover is made.

I hope you got the picture :)

BR,
Jovan
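
P.S. For the extra script in step 1 (the one heartbeat runs to start monit's cluster group), a minimal sketch could look like the one below. The file name monit-cluster, the MONIT and GROUP variables and the function name are placeholders chosen for this sketch, not anything monit or heartbeat requires. The command is written as "monit -g cluster start all", which is the form I know from the monit manual (group option first, then the action and "all"); please check it against your monit version. Only start and stop are handled here, while a real resource script would usually also answer "status".

```shell
#!/bin/sh
# Sketch of a heartbeat resource script, e.g. /etc/ha.d/resource.d/monit-cluster
# (hypothetical file name). Heartbeat calls it with "start" or "stop" as $1.

# Assumptions: path to the monit binary and the group name used in monitrc.
MONIT="${MONIT:-/usr/bin/monit}"
GROUP="${GROUP:-cluster}"

monit_cluster() {
    case "$1" in
        start)
            # Start every service in the group and enable monitoring for it.
            "$MONIT" -g "$GROUP" start all
            ;;
        stop)
            # Stop every service in the group and disable monitoring for it.
            "$MONIT" -g "$GROUP" stop all
            ;;
        *)
            echo "Usage: monit-cluster {start|stop}" >&2
            return 1
            ;;
    esac
}

# Dispatch only when an action was passed on the command line.
if [ "$#" -gt 0 ]; then
    monit_cluster "$1"
fi
```

Run as "monit-cluster start" on the node that takes the resources and "monit-cluster stop" on the node that gives them up; the local group is untouched either way, so monit keeps watching postfix and heartbeat on both nodes.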
