Quoting Vsevolod Katkov <[email protected]>:
> Reporting an issue. Thank you very much for any feedback.
>
> Today the heartbeat process took all the CPU (99%-100%) and the load went up.
> It wrote this message to the log, repeating 14 times a second:
> heartbeat: [2553]: WARN: Gmain_timeout_dispatch: Dispatch function for retransmit request took too long to execute: 20 ms (> 10 ms) (GSource: 0x8ade118)
>
> Then this message, repeating:
> heartbeat: [2553]: WARN: Gmain_timeout_dispatch: Dispatch function for retransmit request was delayed 540 ms (> 500 ms) before being called (GSource: 0x9152988)
> heartbeat: [2553]: info: Gmain_timeout_dispatch: started at 788890528 should have started at 788890474
>
> I have a very simple setup of two nodes with only an IP address as the heartbeat resource.
> Heartbeat communicates over the local network (not over a crossover cable or a serial cable), so my first thought was that it is related to network noise.
> The nodes could connect to each other on the local network fine, and no problems have been reported with the local network.
> The secondary node was fine, without any errors, and showed two nodes online with the primary node holding the resource IP.
>
> To fix it I had to reboot the primary node (the one with the problems):
> the second node took control of the IP as I rebooted and released it back when the main node came back online. So it is all working fine now.
>
> Please tell me any ideas why/what happened, whether it can be a bug, and whether I need to upgrade to the latest version. Current version is heartbeat-2 2.1.3-2.
>
> I also have heartbeat 2.1.3-2.
> OS is Ubuntu 8.04.4
>
> thank you very much!!!
> -Sam

If heartbeat had gone into an infinite loop, you would not be able to do anything at all. These messages indicate that something (most likely something else) was consuming a lot of CPU. The heartbeat system has a number of processes. Can you say which one you believe was consuming a lot of CPU?
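
If it happens again, one way to answer the "which process" question is to sample per-process CPU time from /proc before rebooting. The following is only a rough sketch, not part of heartbeat: it assumes Linux with /proc mounted and Python 3, and the function names and the "heartbeat" name filter are illustrative. It reads utime and stime (fields 14 and 15 of /proc/<pid>/stat) twice and reports the share of one CPU each matching process used in between.

```python
import os
import time


def parse_stat(line):
    """Return (pid, comm, cpu_ticks) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the first "(" and the last ")".
    left, right = line.index("("), line.rindex(")")
    pid = int(line[:left])
    comm = line[left + 1:right]
    rest = line[right + 1:].split()
    # rest[0] is the state (field 3), so utime (field 14) and
    # stime (field 15) sit at indexes 11 and 12.
    return pid, comm, int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """Share of one CPU used between two samples, in percent."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def snapshot(name="heartbeat"):
    """Map pid -> (comm, cpu_ticks) for processes whose name contains `name`."""
    result = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/stat" % entry) as f:
                pid, comm, ticks = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # the process exited while we were reading it
        if name in comm:
            result[pid] = (comm, ticks)
    return result


if __name__ == "__main__":
    interval = 5.0  # seconds between the two samples
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    before = snapshot()
    time.sleep(interval)
    after = snapshot()
    for pid, (comm, ticks) in sorted(after.items()):
        if pid in before:
            pct = cpu_percent(before[pid][1], ticks, interval, hz)
            print("%6d %-16s %5.1f%%" % (pid, comm, pct))
```

Passing a different name to snapshot() (or an empty string to match everything) would show whether some other process was the one using the CPU.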
