At 11:46 AM -0600 2005-09-22, Ivan Fetch wrote:

> i'm looking at this for our setup as well. How are others handling
> the Mailman processes (qrunners) starting on a second server once the
> first server has gone down, and switching back to the first server once
> it has recovered?

You'd need some sort of "heartbeat" monitor application that uses the standard OS start/stop routines once it decides that the other system has gone down, and you might need to put some additional lock cleanup code in there as well.

Moreover, you'd have to make sure that the problem is not in the heartbeat monitor itself, which could leave the Mailman processes running on both machines, each thinking that the other is down. You'd also need to make sure that you don't start Mailman on the second machine if the shared filesystem is not properly available. And there are a bazillion other potential failure modes that you'd need to look out for.

Doing high availability is tough, much tougher than anyone ever gives it credit for. The problem is that Mailman was never designed to be used (or abused) in this way, so you're pretty much in completely uncharted waters.

-- 
Brad Knowles, <[EMAIL PROTECTED]>
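
To make the checks described above concrete, here is a minimal sketch of the decision a standby machine might make on each heartbeat round. It is not part of Mailman and not a tested HA setup; the names `Observation`, `check_shared_fs`, `decide`, and the `max_missed` threshold are all hypothetical, and how the heartbeat and the independent "is my own monitor sane?" check are actually performed is left open.

```python
import os
from dataclasses import dataclass


@dataclass
class Observation:
    """One round of checks made by the standby machine (hypothetical names)."""
    peer_alive: bool       # the other server answered the heartbeat
    monitor_healthy: bool  # an independent check (assumed, e.g. reaching the gateway) succeeded
    shared_fs_ready: bool  # the shared Mailman filesystem is mounted and writable
    missed_beats: int      # consecutive heartbeats the peer has missed


def check_shared_fs(path):
    """Return True only if `path` is a mount point that we can write to."""
    return os.path.ismount(path) and os.access(path, os.W_OK)


def decide(obs, running_locally, max_missed=3):
    """Return 'start', 'stop', or 'hold' for the local Mailman processes."""
    if running_locally:
        # The peer is back and our own view is trustworthy: hand control back.
        if obs.peer_alive and obs.monitor_healthy:
            return "stop"
        return "hold"
    if obs.peer_alive:
        return "hold"   # the peer is fine, nothing to take over
    if not obs.monitor_healthy:
        return "hold"   # the fault may be on our side, so do not risk running on both
    if not obs.shared_fs_ready:
        return "hold"   # never start Mailman without the shared filesystem
    if obs.missed_beats < max_missed:
        return "hold"   # wait for several misses before declaring the peer down
    return "start"
```

The returned action would then be mapped onto whatever start/stop routine the OS uses for Mailman. Lock cleanup and the many other failure modes mentioned above are deliberately not covered by this sketch.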