On Fri, Feb 3, 2012 at 9:31 AM, Andrew Beekhof <and...@beekhof.net> wrote:
> On Thu, Feb 2, 2012 at 9:55 PM, Shyam <shyam.kaus...@gmail.com> wrote:
>> Hi Andrew,
>>
>> Here is more logs covering a larger period that shows multiple of this
>> election cycle. Please note that in the below case I had set dc-deadtime to
>> 5secs & the I_DC_TIMEOUT pops up every 5 secs. I turned this dc-deadtime to
>> 10secs & the long election cycle problem disappeared. It no longer happens.
>> I suspect that before a single election cycle completes, the next
>> I_DC_TIMEOUT kicks-in. Could this be the reason?
>
> Yes. The question is why the cycle is taking so long :-/

Could you reproduce with debug on please?
It would be nice to know what the cluster is doing for the 4 seconds
between these two messages:

Jan 17 12:00:04 vsa-0000003ca-vc-0 crmd: [1120]: WARN: start_subsystem: Client pengine already running as pid 4243
Jan 17 12:00:08 vsa-0000003ca-vc-0 crmd: [1120]: info: do_dc_takeover: Taking over DC status for this partition

What version of pacemaker is this btw?

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
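
For anyone hunting for similar stalls in a longer log, the 4-second gap between the two crmd messages quoted above can be found mechanically. The following is only a rough sketch, not part of Pacemaker: the helper names `parse_stamp` and `gaps`, the 2-second threshold, and the year 2012 (syslog timestamps carry no year; it is taken from the thread dates) are all assumptions made for illustration.

```python
from datetime import datetime

# The two log lines quoted in the message, copied verbatim.
LINES = [
    "Jan 17 12:00:04 vsa-0000003ca-vc-0 crmd: [1120]: WARN: start_subsystem: "
    "Client pengine already running as pid 4243",
    "Jan 17 12:00:08 vsa-0000003ca-vc-0 crmd: [1120]: info: do_dc_takeover: "
    "Taking over DC status for this partition",
]


def parse_stamp(line, year=2012):
    """Parse the leading syslog timestamp; the year is an assumption."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def gaps(lines, threshold=2.0):
    """Yield (seconds, earlier, later) for consecutive lines more than
    `threshold` seconds apart. The threshold is an arbitrary choice."""
    for earlier, later in zip(lines, lines[1:]):
        delta = (parse_stamp(later) - parse_stamp(earlier)).total_seconds()
        if delta > threshold:
            yield delta, earlier, later


for seconds, earlier, later in gaps(LINES):
    print(f"{seconds:.0f}s of silence before: {later}")
```

Run against a full log with debug enabled, the same loop would point at each place where crmd goes quiet for longer than the threshold, which is where the extra debug output is most worth reading.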