On Thu, Aug 23, 2007 at 10:53:49AM +0200, Michael Liebl wrote:
> Hello!
>
> My 2 node Linux-HA active/passive cluster does not work as I expect.
>
> Node1 is active and owns all resources (just one additional IP).
> I reboot Node2 and HA is started automatically by init on Node2,
> HA runs with active resources on Node1.
> Fine so far.
>
> 40 seconds after HA was started on Node2, HA is "automagically"
> restarted on Node1 and Node2.

This looks suspiciously close to the timeouts below: 40 seconds
matches initdead, which is twice the deadtime.

> Why? What is going wrong here?
> Misconfigured?
>
> I'm using Ubuntu/dapper with Heartbeat v1.2.4.
> There is no iptables or anything like that.
> If anyone needs more logs or something, just say.

Well, you just sent the debug level messages. What about ERROR,
WARNING, notice, and info?

> Thank you!
>
>
>
> ha.cf:
> debugfile /var/log/ha-debug
> logfile /var/log/ha-log
> logfacility daemon
> coredumps false
> debug 1
> keepalive 5

You should lower this, perhaps to 1 or 2.

> warntime 10
> deadtime 20
> initdead 40
> auto_failback off
> ucast eth0 192.168.8.151
> ucast eth0 192.168.8.152
> udpport 694
> node Node1
> node Node2

The configuration looks ok. You didn't show haresources, but I guess
that it's hard to make a mistake there :)

I've almost forgotten Heartbeat v1, but it looks like there was a
communication problem. Are you sure that you don't have a firewall
in between?

> Debug output from Node1:
> heartbeat: 2007/08/23_09:50:49 debug: Status seqno: 53 msgtime: 1187855449
> heartbeat: 2007/08/23_09:50:49 debug: StartNextRemoteRscReq() - calling hook
> heartbeat: 2007/08/23_09:50:49 debug: notify_world: invoking harc: OLD
> status: active
> heartbeat: 2007/08/23_09:50:49 debug: Process [status] started pid 4076
> heartbeat: 2007/08/23_09:50:49 debug: Starting notify process [status]
> heartbeat: 2007/08/23_09:50:49 debug: notify_world: setting SIGCHLD Handler
> to SIG_DFL
> heartbeat: 2007/08/23_09:50:49 debug: notify_world: Running harc status
> heartbeat: 2007/08/23_09:50:49 debug: RscMgmtProc 'status' exited code 0
> heartbeat: 2007/08/23_09:50:51 debug: hb_rsc_isstable:
> ResourceMgmt_child_count: 0, other_is_stable: 1, takeover_in_progress: 0,
> going_standby: 0, standby running(ms): 0, resourcestate: 4
> heartbeat: 2007/08/23_09:50:51 debug: Sending hold resources msg: none,
> stable=0 # shutdown
> heartbeat: 2007/08/23_09:50:51 debug: hb_rsc_isstable:
> ResourceMgmt_child_count: 0, other_is_stable: 1, takeover_in_progress: 0,
> going_standby: 0, standby running(ms): 0, resourcestate: 5
> heartbeat: 2007/08/23_09:50:51 debug: Process [hb_giveup_resources] started
> pid 4080
> heartbeat: 2007/08/23_09:50:52 debug: Starting /etc/ha.d/resource.d/IPaddr2
> 192.168.8.150/24/eth0:fs stop
> heartbeat: 2007/08/23_09:50:52 debug: /etc/ha.d/resource.d/IPaddr2
> 192.168.8.150/24/eth0:fs stop done. RC=0
> heartbeat: 2007/08/23_09:50:52 debug: Sending T_SHUTDONE.
> heartbeat: 2007/08/23_09:50:52 debug: hb_rsc_isstable:
> ResourceMgmt_child_count: 1, other_is_stable: 1, takeover_in_progress: 0,
> going_standby: 0, standby running(ms): 0, resourcestate: 5
> heartbeat: 2007/08/23_09:50:52 debug: Received T_SHUTDONE from us.
> heartbeat: 2007/08/23_09:50:52 debug: Calling hb_mcp_final_shutdown in a
> second.
> heartbeat: 2007/08/23_09:50:52 debug: RscMgmtProc 'hb_giveup_resources'
> exited code 0
> heartbeat: 2007/08/23_09:50:52 debug: hb_mcp_final_shutdown() phase 0
> heartbeat: 2007/08/23_09:50:53 debug: hb_mcp_final_shutdown() phase 1
> heartbeat: 2007/08/23_09:50:53 debug: Process 3789 processing SIGTERM
> heartbeat: 2007/08/23_09:50:53 debug: Exiting from pid 3789 [rc=15]
> heartbeat: 2007/08/23_09:50:53 debug: Process 3790 processing SIGTERM
> heartbeat: 2007/08/23_09:50:53 debug: Exiting from pid 3790 [rc=15]
> heartbeat: 2007/08/23_09:50:53 debug: Process 3791 processing SIGTERM
> heartbeat: 2007/08/23_09:50:53 debug: Exiting from pid 3791 [rc=15]
> heartbeat: 2007/08/23_09:50:53 debug: Process 3792 processing SIGTERM
> heartbeat: 2007/08/23_09:50:53 debug: Exiting from pid 3792 [rc=15]
> heartbeat: 2007/08/23_09:50:53 debug: Process 3793 processing SIGTERM
> heartbeat: 2007/08/23_09:50:53 debug: Exiting from pid 3793 [rc=15]
>
>
> Debug output from Node2:
> heartbeat: 2007/08/23_09:50:49 debug: hb_mcp_final_shutdown() phase 0
> heartbeat: 2007/08/23_09:50:50 debug: hb_mcp_final_shutdown() phase 1
> heartbeat: 2007/08/23_09:50:50 debug: Process 4136 processing SIGTERM
> heartbeat: 2007/08/23_09:50:50 debug: Process 4135 processing SIGTERM
> heartbeat: 2007/08/23_09:50:50 debug: Exiting from pid 4135 [rc=15]
> heartbeat: 2007/08/23_09:50:50 debug: Exiting from pid 4136 [rc=15]
> heartbeat: 2007/08/23_09:50:50 debug: Process 4138 processing SIGTERM
> heartbeat: 2007/08/23_09:50:50 debug: Process 4139 processing SIGTERM
> heartbeat: 2007/08/23_09:50:50 debug: Exiting from pid 4138 [rc=15]
> heartbeat: 2007/08/23_09:50:50 debug: Exiting from pid 4139 [rc=15]
> heartbeat: 2007/08/23_09:50:50 debug: Process 4137 processing SIGTERM
> heartbeat: 2007/08/23_09:50:50 debug: Exiting from pid 4137 [rc=15]
>
>
> --
> <) .--.
> )#=+ '
> /## | .+. Greetings,
> ,,/###,|,,,,,,|,,,, Michael
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
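
For the keepalive suggestion, here is a minimal, untested sketch of the
timing part of ha.cf. It assumes everything else stays exactly as you
posted it; the value 2 is just one of the "1 or 2" options, not a tuned
number:

    # heartbeat interval in seconds (was 5); 2 is an assumed example value
    keepalive 2
    # unchanged from the posted config
    warntime 10
    deadtime 20
    initdead 40

With keepalive 5 only four heartbeats fit into the 20 second deadtime;
with keepalive 2 it is ten, so a single lost packet matters less. The
same change would have to go into ha.cf on both nodes.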
