On Tue, May 11, 2010 at 01:35:17PM -0700, Mike Sweetser wrote:
> Hello,
> 
> I've set up a DRBD and Heartbeat configuration communicating over an
> Internet connection, rather than internal.  The servers are running CentOS
> 5.4, with DRBD 8.3.2 and Heartbeat 3.0.3, out of the CentOS repository.
> 
> I start seeing these in the ha-log.
> 
> ERROR: Message hist queue is filling up (500 messages in queue)
> 
> Then I see a bunch of these:
> 
> WARN: Gmain_timeout_dispatch: Dispatch function for retransmit request took
> too long to execute: 20 ms (> 10 ms) (GSource: 0x1c3025c0)
> 
> And finally:

What is logged before this?
The messages below say "MCP dead": the Master Control Process died.
It should have logged why it died,
or there should be a core file somewhere below
        find /var/lib/heartbeat/cores/
or both.
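
To dig that out, something along these lines may help. It is only a
sketch: the function names are made up for illustration, the 50 lines of
context are an arbitrary default, and the log path you pass in is
whatever "logfile" points to in your ha.cf.

```shell
# Print the lines logged just before each emergency shutdown.
# $1: log file, $2: lines of context to show (default 50, arbitrary)
before_shutdown() {
    grep -n -B "${2:-50}" 'Emergency Shutdown' "$1"
}

# List core files with size and timestamp.
# $1: cores directory (defaults to the path mentioned above)
list_cores() {
    find "${1:-/var/lib/heartbeat/cores/}" -type f -ls
}
```

For example: before_shutdown /var/log/ha-log, then list_cores.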

> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5533 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5537 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5538 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5539 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5540 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Emergency Shutdown(MCP
> dead): Killing ourselves.

> logfacility     local0
> debug 1
> debugfile /var/log/ha-debug
> logfile /var/log/ha-log

Maybe you should use logd?
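
A minimal sketch of what that could look like, assuming the default
logging daemon config file /etc/logd.cf and the same log targets as
above. With use_logd enabled, the logfile/debugfile/logfacility lines in
ha.cf are no longer meaningful, and the logging daemon has to be running.

```
# ha.cf
use_logd yes

# /etc/logd.cf (assumed default location)
logfacility local0
debugfile /var/log/ha-debug
logfile /var/log/ha-log
```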

> node mysql1
> node mysql2
> keepalive 2
> deadtime 60
> initdead 120
> warntime 15
> udpport 694
> ucast eth1 66.165.231.34
> ucast eth1 67.218.128.19

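You should add an additional communication link.
Really. With a single path, and that one over the Internet,
every network problem hits cluster communication directly.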
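
As an illustration only, a second ucast path could look like this. It
assumes each node has another interface (eth0 here) that reaches the
peer over a different route; the 192.0.2.x addresses are documentation
placeholders, not your real ones.

```
# existing link
ucast eth1 66.165.231.34
ucast eth1 67.218.128.19
# hypothetical second link over a different interface and route
ucast eth0 192.0.2.10
ucast eth0 192.0.2.11
```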

> auto_failback on
> crm yes

Are you short on memory, or under memory pressure?
Are UDP packets dropped?
Packet loss somewhere?
Message corruption?
Firewalled in one direction?
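
A rough way to start answering these, as a sketch rather than a recipe.
The helper name udp_counters is made up for illustration; it only prints
the kernel's UDP counters from /proc/net/snmp, so that rising InErrors
(or RcvbufErrors, where the kernel reports it) between two runs would
hint at dropped datagrams. The commented commands use the interface and
port from the ha.cf above.

```shell
# Print UDP counters as name=value pairs from a /proc/net/snmp style file.
# $1: file to read (defaults to /proc/net/snmp)
udp_counters() {
    awk '$1 == "Udp:" {
        if (!n) { n = split($0, name, " ") }            # first Udp: line holds the names
        else { for (i = 2; i <= n; i++) print name[i] "=" $i }  # second holds the values
    }' "${1:-/proc/net/snmp}"
}

# Other checks, to be run by hand on both nodes:
#   free -m                          # memory and swap usage
#   tcpdump -ni eth1 udp port 694    # do heartbeat packets arrive from the peer?
#   iptables -L -n -v                # firewall rules with per-rule packet counters
```

Run udp_counters twice, a few minutes apart, and compare the numbers.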

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.