On Tue, May 11, 2010 at 01:35:17PM -0700, Mike Sweetser wrote:
> Hello,
>
> I've set up a DRBD and Heartbeat configuration communicating over an
> Internet connection, rather than internal. The servers are running CentOS
> 5.4, with DRBD 8.3.2 and Heartbeat 3.0.3, out of the CentOS repository.
>
> I start seeing these in the ha-log.
>
> ERROR: Message hist queue is filling up (500 messages in queue)
>
> Then I see a bunch of these:
>
> WARN: Gmain_timeout_dispatch: Dispatch function for retransmit request took
> too long to execute: 20 ms (> 10 ms) (GSource: 0x1c3025c0)
>
> And finally:
What is logged before this?
The lines below say "MCP dead" (MCP = Master Control Process).
It should have logged why it died,
or there should be a core file somewhere below the cores directory:
  find /var/lib/heartbeat/cores/
Or both.
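A minimal sketch of both checks, assuming GNU grep and findutils as shipped with CentOS. The function names before_mcp_dead and list_cores are made up for illustration; the log path and cores directory are the ones mentioned in this thread.

```shell
#!/bin/sh
# Print the N lines logged before the first "MCP dead" message (default 50),
# followed by that message itself.
# Usage: before_mcp_dead LOGFILE [N]
before_mcp_dead() {
    # -B N: show N lines of leading context; -m 1: stop after the first match
    grep -B "${2:-50}" -m 1 'MCP dead' "$1"
}

# List regular files (candidate core dumps) below the cores directory.
# Usage: list_cores [DIR]
list_cores() {
    find "${1:-/var/lib/heartbeat/cores}" -type f
}
```

For example: before_mcp_dead /var/log/ha-log 100, then list_cores with no argument.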
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5533 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5537 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5538 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5539 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Killing pid 5540 with
> SIGTERM
> May 08 05:33:19 mysql1 heartbeat: [5536]: CRIT: Emergency Shutdown(MCP
> dead): Killing ourselves.
> logfacility local0
> debug 1
> debugfile /var/log/ha-debug
> logfile /var/log/ha-log
Maybe you should use logd instead of logging directly to files?
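A sketch of what switching to logd could look like, assuming the default file locations /etc/ha.d/ha.cf and /etc/logd.cf (check where your packages put them). With use_logd enabled, the logfacility/debugfile/logfile lines in ha.cf are no longer used, so the same settings move to logd.cf:

```
# /etc/ha.d/ha.cf
use_logd yes

# /etc/logd.cf (same targets as in the ha.cf quoted above)
logfacility local0
debugfile /var/log/ha-debug
logfile /var/log/ha-log
```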
> node mysql1
> node mysql2
> keepalive 2
> deadtime 60
> initdead 120
> warntime 15
> udpport 694
> ucast eth1 66.165.231.34
> ucast eth1 67.218.128.19
You should add an additional communication link between the nodes.
Really.
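A sketch of a second link in ha.cf, for illustration only: the interface name eth2 and the 192.0.2.x addresses are placeholders (documentation address range), not values from this setup. The second path should not share hardware or a provider with the first one if that can be avoided.

```
# existing link, as in the configuration quoted above
ucast eth1 66.165.231.34
ucast eth1 67.218.128.19
# hypothetical second link: eth2 and both addresses are placeholders
ucast eth2 192.0.2.10
ucast eth2 192.0.2.20
```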
> auto_failback on
> crm yes
Are you short on memory, or under memory pressure?
Are UDP packets dropped?
Packet loss somewhere?
Message corruption?
Firewalled in one direction?
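The questions above can be approached with standard tools. This is a rough sketch, not a complete diagnosis: the function names udp_rx_errors and link_checks are made up for illustration, the netstat output format is assumed to be the usual Linux net-tools one, and ICMP loss from ping is only a rough proxy for loss on the heartbeat UDP traffic.

```shell
#!/bin/sh
# Extract the UDP "packet receive errors" counter from `netstat -su` output
# read on stdin. A counter that keeps growing between two runs suggests that
# incoming UDP datagrams (heartbeat packets included) are being dropped.
udp_rx_errors() {
    awk '
        /^Udp:/     { in_udp = 1; next }   # start of the plain UDP section
        /^[A-Za-z]/ { in_udp = 0 }         # any other section header ends it
        in_udp && /packet receive errors/ { print $1 }
    '
}

# Run the basic checks against one peer address (run as root for iptables).
# Usage: link_checks PEER
link_checks() {
    peer=$1
    free -m                        # memory and swap usage
    netstat -su | udp_rx_errors    # UDP receive errors so far
    ping -c 20 -q "$peer"          # rough packet-loss figure (ICMP only)
    iptables -L -n -v              # firewall rules with packet counters
}
```

To see whether traffic is blocked in one direction, it may also help to run tcpdump -n -i eth1 udp port 694 on both nodes at the same time and compare what each side sends with what the other side receives.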
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems