Hello,

I am using Xen 3.0.4 + DRBD + Heartbeat, with packages from the Debian squeeze repository. Server1 is running 8 domUs and Server2 is running 3 domUs. The hardware of Server1 and Server2 is: quad-core 2.2 GHz CPU, five 15000 rpm SAS drives in RAID 5, 42 and 4 GB RAM, and Intel and Broadcom Gigabit Ethernet. Server1 and Server2 are connected by a crossover cable on ethernet1 for DRBD, and by a second crossover cable on ethernet2 for Heartbeat.

Today the problem returned: Server1 simply concluded that the heartbeat network was dead, although this had not actually happened, and all services were migrated to Server2. One thing I noticed is that the clock on Server1 was 1 hour ahead of the clock on Server2.
Is this what caused the problem? Thanks.

Log below, generated by Server1:

Sep 28 15:26:45 inga heartbeat: [2201]: WARN: Gmain_timeout_dispatch: Dispatch function for send local status was delayed 2989280 ms (> 7510 ms) before being called (GSource: 0xcef350)
Sep 28 15:26:45 inga heartbeat: [2201]: info: Gmain_timeout_dispatch: started at 2761230009 should have started at 2760931081
Sep 28 15:26:45 inga heartbeat: [2201]: WARN: Late heartbeat: Node inga: interval 3004870 ms
Sep 28 15:26:45 inga heartbeat: [2201]: WARN: Gmain_timeout_dispatch: Dispatch function for send local status took too long to execute: 590 ms (> 50 ms) (GSource: 0xcef350)
Sep 28 15:26:45 inga heartbeat: [2201]: WARN: node pitanga: is dead
Sep 28 15:26:45 inga heartbeat: [2201]: WARN: No STONITH device configured.
Sep 28 15:26:45 inga heartbeat: [2201]: WARN: Shared disks are not protected.
Sep 28 15:26:45 inga heartbeat: [2201]: info: Resources being acquired from pitanga.
Sep 28 15:26:46 inga heartbeat: [2201]: info: Link pitanga:eth2 dead.
Sep 28 15:26:46 inga heartbeat: [2201]: WARN: Gmain_timeout_dispatch: Dispatch function for check for signals was delayed 2989760 ms (> 7510 ms) before being called (GSource: 0xcef590)
Sep 28 15:26:46 inga heartbeat: [2201]: info: Gmain_timeout_dispatch: started at 2761230077 should have started at 2760931101
Sep 28 15:26:46 inga heartbeat: [2201]: WARN: Gmain_timeout_dispatch: Dispatch function for update msgfree count was delayed 3000510 ms (> 50000 ms) before being called (GSource: 0xcef680)
Sep 28 15:26:46 inga heartbeat: [2201]: info: Gmain_timeout_dispatch: started at 2761230077 should have started at 2760930026
Sep 28 15:26:46 inga heartbeat: [2201]: WARN: Gmain_timeout_dispatch: Dispatch function for update msgfree count took too long to execute: 110 ms (> 50 ms) (GSource: 0xcef680)
Sep 28 15:26:46 inga heartbeat: [2201]: WARN: Gmain_timeout_dispatch: Dispatch function for client audit was delayed 3000190 ms (> 5000 ms) before being called (GSource: 0xcef4d0)
Sep 28 15:26:46 inga heartbeat: [2201]: info: Gmain_timeout_dispatch: started at 2761230088 should have started at 2760930069
harc[28182]: 2011/09/28_15:26:47 info: Running /etc/ha.d//rc.d/status status
mach_down[28232]: 2011/09/28_15:26:47 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Sep 28 15:26:47 inga heartbeat: [2201]: CRIT: Cluster node pitanga returning after partition.
Sep 28 15:26:47 inga heartbeat: [2201]: info: For information on cluster partitions, See URL: http://linux-ha.org/wiki/Split_Brain
Sep 28 15:26:47 inga heartbeat: [2201]: WARN: Deadtime value may be too small.
Sep 28 15:26:47 inga heartbeat: [2201]: info: See FAQ for information on tuning deadtime.
Sep 28 15:26:47 inga heartbeat: [2201]: info: URL: http://linux-ha.org/wiki/FAQ#Heavy_Load
Sep 28 15:26:47 inga heartbeat: [2201]: info: Link pitanga:eth2 up.
Sep 28 15:26:47 inga heartbeat: [2201]: WARN: Late heartbeat: Node pitanga: interval 3014680 ms
Sep 28 15:26:47 inga heartbeat: [2201]: info: Status update for node pitanga: status active
mach_down[28232]: 2011/09/28_15:26:47 info: mach_down takeover complete for node pitanga.
Sep 28 15:26:47 inga heartbeat: [2201]: info: mach_down takeover complete.
Sep 28 15:26:49 inga heartbeat: [2201]: WARN: Shutdown delayed until current resource activity finishes.
Sep 28 15:26:49 inga heartbeat: [28183]: info: Local Resource acquisition completed.
harc[28293]: 2011/09/28_15:26:49 info: Running /etc/ha.d//rc.d/status status
harc[28308]: 2011/09/28_15:26:49 info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
ip-request-resp[28308]: 2011/09/28_15:26:50 received ip-request-resp xendomains::lobeira.cfg OK yes
ResourceManager[28329]: 2011/09/28_15:26:50 info: Acquiring resource group: inga xendomains::lobeira.cfg xendomains::jequitiba.cfg xendomains::munguba.cfg xendomains::buriti.cfg xendomains::jatoba.cfg xendomains::mangaba.cfg xendomains::cagaita.cfg xendomains::agroval.cfg xendomains::tigui.cfg
ResourceManager[28329]: 2011/09/28_15:26:50 info: Running /etc/ha.d/resource.d/xendomains lobeira.cfg start
ResourceManager[28329]: 2011/09/28_15:26:51 info: Running /etc/ha.d/resource.d/xendomains jequitiba.cfg start
ResourceManager[28329]: 2011/09/28_15:26:52 info: Running /etc/ha.d/resource.d/xendomains munguba.cfg start
ResourceManager[28329]: 2011/09/28_15:26:53 info: Running /etc/ha.d/resource.d/xendomains buriti.cfg start
ResourceManager[28329]: 2011/09/28_15:26:54 info: Running /etc/ha.d/resource.d/xendomains jatoba.cfg start
ResourceManager[28329]: 2011/09/28_15:26:54 info: Running /etc/ha.d/resource.d/xendomains mangaba.cfg start
ResourceManager[28329]: 2011/09/28_15:26:56 info: Running /etc/ha.d/resource.d/xendomains cagaita.cfg start
ResourceManager[28329]: 2011/09/28_15:26:57 info: Running /etc/ha.d/resource.d/xendomains agroval.cfg start
ResourceManager[28329]: 2011/09/28_15:26:57 info: Running /etc/ha.d/resource.d/xendomains tigui.cfg start
Sep 28 15:26:57 inga heartbeat: [2201]: info: Heartbeat shutdown in progress. (2201)
Sep 28 15:26:57 inga heartbeat: [28829]: info: Giving up all HA resources.
ResourceManager[28843]: 2011/09/28_15:26:57 info: Releasing resource group: inga xendomains::lobeira.cfg xendomains::jequitiba.cfg xendomains::munguba.cfg xendomains::buriti.cfg xendomains::jatoba.cfg xendomains::mangaba.cfg xendomains::cagaita.cfg xendomains::agroval.cfg xendomains::tigui.cfg
ResourceManager[28843]: 2011/09/28_15:26:57 info: Running /etc/ha.d/resource.d/xendomains tigui.cfg stop
ResourceManager[28843]: 2011/09/28_15:27:09 info: Running /etc/ha.d/resource.d/xendomains agroval.cfg stop
ResourceManager[28843]: 2011/09/28_15:27:28 info: Running /etc/ha.d/resource.d/xendomains cagaita.cfg stop
ResourceManager[28843]: 2011/09/28_15:27:48 info: Running /etc/ha.d/resource.d/xendomains mangaba.cfg stop
ResourceManager[28843]: 2011/09/28_15:28:34 info: Running /etc/ha.d/resource.d/xendomains jatoba.cfg stop
ResourceManager[28843]: 2011/09/28_15:28:57 info: Running /etc/ha.d/resource.d/xendomains buriti.cfg stop
ResourceManager[28843]: 2011/09/28_15:29:20 info: Running /etc/ha.d/resource.d/xendomains munguba.cfg stop
ResourceManager[28843]: 2011/09/28_15:29:22 info: Running /etc/ha.d/resource.d/xendomains jequitiba.cfg stop
ResourceManager[28843]: 2011/09/28_15:29:36 info: Running /etc/ha.d/resource.d/xendomains lobeira.cfg stop
Sep 28 15:29:49 inga heartbeat: [28829]: info: All HA resources relinquished.
Sep 28 15:29:49 inga heartbeat: [2201]: WARN: 1 lost packet(s) for [pitanga] [695814:695816]
Sep 28 15:29:49 inga heartbeat: [2201]: info: No pkts missing from pitanga!
Sep 28 15:29:51 inga heartbeat: [2201]: info: killing HBFIFO process 2210 with signal 15
Sep 28 15:29:51 inga heartbeat: [2201]: info: killing HBWRITE process 2211 with signal 15
Sep 28 15:29:51 inga heartbeat: [2201]: info: killing HBREAD process 2212 with signal 15
Sep 28 15:29:51 inga heartbeat: [2201]: info: Core process 2212 exited. 3 remaining
Sep 28 15:29:51 inga heartbeat: [2201]: info: Core process 2211 exited. 2 remaining
Sep 28 15:29:51 inga heartbeat: [2201]: info: Core process 2210 exited. 1 remaining
Sep 28 15:29:51 inga heartbeat: [2201]: info: inga Heartbeat shutdown complete.
Sep 28 15:29:51 inga heartbeat: [2201]: info: Heartbeat restart triggered.
Sep 28 15:29:51 inga heartbeat: [2201]: info: Restarting heartbeat.
Sep 28 15:29:51 inga heartbeat: [2201]: info: Performing heartbeat restart exec.
Sep 28 15:30:42 inga heartbeat: [2201]: WARN: Core dumps could be lost if multiple dumps occur.
Sep 28 15:30:42 inga heartbeat: [2201]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
Sep 28 15:30:42 inga heartbeat: [2201]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Sep 28 15:30:42 inga heartbeat: [2201]: info: Version 2 support: false
Sep 28 15:30:42 inga heartbeat: [2201]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Sep 28 15:30:42 inga heartbeat: [2201]: info: **************************
Sep 28 15:30:42 inga heartbeat: [2201]: info: Configuration validated. Starting heartbeat 3.0.2
Sep 28 15:30:42 inga heartbeat: [31875]: info: heartbeat: version 3.0.2
Sep 28 15:30:42 inga heartbeat: [31875]: info: Heartbeat generation: 1294961335
Sep 28 15:30:42 inga heartbeat: [31875]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth2
Sep 28 15:30:42 inga heartbeat: [31875]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth2 - Status: 1
Sep 28 15:30:42 inga heartbeat: [31875]: info: G_main_add_TriggerHandler: Added signal manual handler
Sep 28 15:30:42 inga heartbeat: [31875]: info: G_main_add_TriggerHandler: Added signal manual handler
Sep 28 15:30:42 inga heartbeat: [31875]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Sep 28 15:30:42 inga heartbeat: [31875]: info: Local status now set to: 'up'
Sep 28 15:30:42 inga heartbeat: [31875]: info: Link inga:eth2 up.
Sep 28 15:30:47 inga heartbeat: [31875]: info: Link pitanga:eth2 up.
Sep 28 15:30:47 inga heartbeat: [31875]: info: Status update for node pitanga: status active
harc[31897]: 2011/09/28_15:30:47 info: Running /etc/ha.d//rc.d/status status
Sep 28 15:30:48 inga heartbeat: [31875]: info: Comm_now_up(): updating status to active
Sep 28 15:30:48 inga heartbeat: [31875]: info: Local status now set to: 'active'
Sep 28 15:30:48 inga heartbeat: [31875]: info: remote resource transition completed.
Sep 28 15:30:48 inga heartbeat: [31875]: info: remote resource transition completed.
Sep 28 15:30:48 inga heartbeat: [31875]: info: Local Resource acquisition completed. (none)
Sep 28 14:53:54 inga heartbeat: [31875]: info: Clock jumped backwards. Compensating.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
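
To put the "was delayed" and "Late heartbeat" figures from the log into perspective, here is a minimal Python sketch that converts them from milliseconds to minutes. It is not part of the cluster setup described above; the function name `delays_in_minutes`, the regular expression, and the three-line sample are assumptions made only for illustration.

```python
import re

# Three lines copied from the Server1 log in this message (a sample, not the full log).
LOG_SAMPLE = """\
Sep 28 15:26:45 inga heartbeat: [2201]: WARN: Gmain_timeout_dispatch: Dispatch function for send local status was delayed 2989280 ms (> 7510 ms) before being called (GSource: 0xcef350)
Sep 28 15:26:45 inga heartbeat: [2201]: WARN: Late heartbeat: Node inga: interval 3004870 ms
Sep 28 15:26:47 inga heartbeat: [2201]: WARN: Late heartbeat: Node pitanga: interval 3014680 ms
"""

# Matches the reported delay or interval, but not the "(> 7510 ms)" threshold.
DELAY_RE = re.compile(r"(?:was delayed|interval) (\d+) ms")


def delays_in_minutes(log_text):
    """Return every reported delay/interval, converted from ms to minutes."""
    return [int(m.group(1)) / 60000 for m in DELAY_RE.finditer(log_text)]


if __name__ == "__main__":
    for minutes in delays_in_minutes(LOG_SAMPLE):
        print(f"{minutes:.1f} min")
```

All three sampled values work out to roughly 50 minutes. One possible reading, not a confirmed diagnosis, is that heartbeat on inga saw its notion of time jump by about that much, which would fit the clock difference mentioned above and the final "Clock jumped backwards" entry.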
