Reporting an issue. Thank you very much for any feedback.
Today the heartbeat process took all the CPU (99%-100%) and the load went up.
It wrote this message to the log about 14 times a second:
heartbeat: [2553]: WARN: Gmain_timeout_dispatch: Dispatch function for
retransmit request took too long to execute: 20 ms (> 10 ms) (GSource:
0x8ade118)
Then it wrote these messages to the log repeatedly:
heartbeat: [2553]: WARN: Gmain_timeout_dispatch: Dispatch function for
retransmit request was delayed 540 ms (> 500 ms) before being called (GSource:
0x9152988)
heartbeat: [2553]: info: Gmain_timeout_dispatch: started at 788890528 should
have started at 788890474
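
As a rough cross-check of the last message, here is a minimal Python sketch that pulls the two counters out of the "started at ... should have started at ..." line and converts their difference to milliseconds. The 10 ms tick length is an assumption (it is the value that makes the 54-tick difference agree with the 540 ms delay reported in the warning), and the function name dispatch_delay_ms is only illustrative, not part of heartbeat.

```python
import re

# Sample line copied from the log above, joined back onto one line.
LINE = ("heartbeat: [2553]: info: Gmain_timeout_dispatch: "
        "started at 788890528 should have started at 788890474")

# Assumption: one clock tick is 10 ms (a 100 Hz clock).
# The log does not state the tick length explicitly.
MS_PER_TICK = 10


def dispatch_delay_ms(line, ms_per_tick=MS_PER_TICK):
    """Return the dispatch delay in ms, or None if the line does not match."""
    m = re.search(r"started at (\d+) should have started at (\d+)", line)
    if m is None:
        return None
    started, expected = int(m.group(1)), int(m.group(2))
    return (started - expected) * ms_per_tick


print(dispatch_delay_ms(LINE))  # 540
```

Under that assumption the two messages agree with each other: the dispatch ran 54 ticks late, which is the 540 ms in the warning.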
I have a very simple setup of two nodes, with a single IP address as the only
heartbeat resource.
Heartbeat communicates over the local network (not over a crossover cable and
not over a serial cable), so my first thought was that the problem was related
to network noise.
However, the nodes could connect to each other on the local network fine, and
no problems have been reported with the local network.
The secondary node was fine, with no errors, and showed both nodes online with
the primary node holding the IP resource.
To fix it I had to reboot the primary node (the one with the problem).
The secondary node took over the IP when I rebooted and released it back when
the primary node came back online, so everything is working fine now.
Please share any ideas about what happened and why. Could it be a bug, and do
I need to upgrade to the latest version? The current version is
heartbeat-2 2.1.3-2.
The heartbeat package is also installed, at version 2.1.3-2.
The OS is Ubuntu 8.04.4.
Thank you very much!
-Sam