Re: [monit] Aborting monit on failure

Eric Pailleau Fri, 07 Aug 2009 03:31:05 -0700

Stephan-Frank Henry wrote:

The fail_action.sh is basically just a wrapper around a group of other
scripts I want to execute at runtime:

Stop slony (PostgreSQL replication system), fail over the database and
then stop monit, or at least make sure it does not monitor anything anymore. I
would not like for the master db to come back up and cause my system to go
crazy on me. :)
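
To make the wrapper idea above concrete, here is a minimal sketch of running the three steps in order. It assumes the wrapper should stop at the first step that fails, which the post does not state. The script names are hypothetical, since the post does not say what fail_action.sh actually calls, and the runner is injectable so the sequence can be tried without executing anything.

```python
import subprocess


def run_steps(steps, run=subprocess.call):
    """Run each (name, argv) step in order and stop at the first non-zero exit.

    Returns the names of the steps that completed successfully.
    """
    done = []
    for name, argv in steps:
        if run(argv) != 0:
            # Assumption: a failed step aborts the rest of the sequence.
            break
        done.append(name)
    return done


# Hypothetical script names, in the order described in the post.
STEPS = [
    ("stop_slony", ["./stop_slony.sh"]),
    ("failover_db", ["./failover_db.sh"]),
    ("stop_monit", ["./stop_monit.sh"]),
]
```

Calling `run_steps(STEPS)` would execute the scripts through `subprocess.call`; passing a stand-in such as `run=lambda argv: 0` gives a dry run.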


Eric Pailleau wrote:

Don't use Slony... Prefer DRBD + Heartbeat  :>)))))))))))


I had looked into it, but it looked as if it would not fit the crazy 
'requirements' provided: 'we want high availability but only using two servers 
*for everything*', i.e. 1 server = appserver + db; for HA we will just add 
another server and they should automagically find each other and whatnot ... 
but of course it should not cost anything to add HA.
... and they wonder why I am slowly going mad. ;)

As time is/was short and slony looked like it would work well enough, I chose 
it.

The problem is, slony does not do the automatic failover thing, so I picked 
monit to solve this.


(Sorry, this discussion is not monit-related.)
We tested Slony but finally gave it up because it is too complicated (and 
dangerous) for a server switch.

So we are using DRBD, which synchronises in real time one mounted partition 
(master server) to one unmounted partition (slave server).

Heartbeat detects a system failure on the master and can decide to take over.
In that case the Master becomes Slave: the master processes are stopped and the 
postgresql partition is unmounted.
The Slave becomes Master, mounts the postgresql partition and starts the master 
processes.
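
The takeover order above can be sketched as a small state model. This is only an illustration of the sequence described here, not Heartbeat or DRBD code; the `Node` class and its fields are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str          # "master" or "slave"
    mounted: bool      # is the postgresql partition mounted on this node?
    services_up: bool  # are the master processes running on this node?


def take_over(old_master: Node, old_slave: Node) -> None:
    """Apply the takeover steps in the order given above, in place."""
    # Old master: demote, stop the master processes, then unmount.
    old_master.role = "slave"
    old_master.services_up = False
    old_master.mounted = False
    # Old slave: promote, mount the partition, then start the processes.
    old_slave.role = "master"
    old_slave.mounted = True
    old_slave.services_up = True
```

The point of the ordering is that the partition is released on one side before it is mounted on the other, so it is never mounted on both nodes at once.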

With this you can even reboot the server when you want (or every week by cron).

monit can also be used to monitor postgresql and can decide to reboot the 
server on a fatal error
(Heartbeat on the Slave detects the failure and becomes Master). No transaction 
is lost, thanks to DRBD and to
monit, which gently stops the applications in the right order through its 
'depends' feature!
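
As an illustration of why 'depends' gives a clean shutdown: a dependent service has to be stopped before the service it depends on. The sketch below computes such a stop order from a dependency map. It is not monit's implementation, and the service names are invented for the example.

```python
def stop_order(depends_on, target):
    """Return the services to stop, dependents first, ending with target.

    depends_on maps each service name to the list of services it depends on.
    """
    order = []
    seen = set()

    def visit(service):
        if service in seen:
            return
        seen.add(service)
        # Stop everything that depends on this service before the service itself.
        for other, deps in sorted(depends_on.items()):
            if service in deps:
                visit(other)
        order.append(service)

    visit(target)
    return order


# Hypothetical services: "web" needs "app", and "app" needs "postgresql".
DEPENDS_ON = {"postgresql": [], "app": ["postgresql"], "web": ["app"]}
```

With this map, `stop_order(DEPENDS_ON, "postgresql")` stops "web" and "app" before the database, so no application is still writing when postgresql goes down.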

Very relaxing !!!!!!!!!!


--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general
