Hi,

On Mon, Nov 09, 2009 at 12:18:12PM -0600, Carlos Chacón Ch wrote:
> Hello guys
>
> I got this weird issue.
> 2 nodes they are working fine with auto_failback ON. But a few days ago I
> tried auto_failback OFF.
> But I realized it did not work. When doing the manual failback the
> IP(10.1.1.4) that has to move to node01 is lost. I mean there is no bond0:0
> in node01 when is should but the IP is not on node02 either. Every server
> has its bond IP but the HA IP is lost. I have to basically restart heartbeat
> several times to get the IP on any of the nodes.
>
> Weird thing the HA IP(10.1.1.4) works fine when failback is automatic but
> when configured for manual the HA IP is lost.

Strange indeed.

> OS: Red Hat Enterprise 5.0
> Kernel: 2.6.18-8.el5
> HeartBeat version: heartbeat-2.1.4-6.el5 - Install using RPM packages.
>
> ha.cf Conf.
>
> logfile /var/log/ha-log
> logfacility local0
> keepalive 3
> deadtime 10
> udp bond0
> udpport 695
> auto_failback off
> node node01
> node node02
>
> haresources Conf
> node01 x.x.x.x. HA_http
>
> these are the logs the day I tried the OFF configuration for auto_failback
>
> what could be causing this issue?
>
> Logs node01
> http://karlochacon.googlepages.com/node01.txt
>
> Logs node02
> http://karlochacon.googlepages.com/node02.txt

The logs are a mess: there are several shutdowns on both nodes, so it
is not possible to figure out what's going on. These problems do show
up many times, though:

Oct 29 20:37:17 node02 ResourceManager[20430]: ERROR: Return code 1 from /etc/init.d/HA_http
Oct 29 20:37:17 node02 ResourceManager[20430]: CRIT: Giving up resources due to failure of HA_http

Nodes regularly go dead some 10 seconds after the other node starts
or stops resources. Is your network sane? Does the IP resource
somehow make the nodes lose each other?

Oct 29 19:49:18 node01 IPaddr[15369]: INFO: Running OK
Oct 29 19:49:18 node01 ResourceManager[15342]: info: Running /etc/init.d/HA_http start
Oct 29 19:49:29 node01 heartbeat: [14132]: WARN: node node02: is dead
Oct 29 19:49:29 node01 heartbeat: [14132]: info: Dead node node02 gave up resources.

Thanks,

Dejan

> thanks a lot guys
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
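
P.S. To check how often a node is declared dead shortly after a
resource start or stop (the gap of roughly 10 seconds matches the
"deadtime 10" setting in the quoted ha.cf), something like the rough
sketch below could be run over the ha-log. It is only an illustration:
the function name, the 15-second window, and the assumed year are my
own choices, and the regular expressions are written against the log
lines quoted in this thread, so they may need adjusting for other
messages.

```python
import re
from datetime import datetime, timedelta

# Patterns modelled on the log lines quoted in this thread (assumption:
# the rest of the ha-log uses the same syslog-style layout).
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (\S+) (.*)$")
RESOURCE_RE = re.compile(r"ResourceManager\[\d+\]: info: Running \S+ (start|stop)")
DEAD_RE = re.compile(r"WARN: node (\S+): is dead")


def parse_time(stamp, year=2009):
    # The log carries no year; 2009 is assumed from the date of the thread.
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def dead_after_resource_action(lines, window_seconds=15):
    """Return (action_line, dead_line) pairs where a node was declared
    dead within window_seconds after the latest resource start/stop."""
    window = timedelta(seconds=window_seconds)
    last_action = None  # (timestamp, line) of the latest start/stop seen
    hits = []
    for raw in lines:
        line = raw.strip()
        match = LINE_RE.match(line)
        if not match:
            continue
        when = parse_time(match.group(1))
        rest = match.group(3)
        if RESOURCE_RE.search(rest):
            last_action = (when, line)
        elif DEAD_RE.search(rest) and last_action is not None:
            if timedelta(0) <= when - last_action[0] <= window:
                hits.append((last_action[1], line))
    return hits


if __name__ == "__main__":
    # Path taken from the "logfile" directive in the quoted ha.cf.
    with open("/var/log/ha-log") as log:
        for action, dead in dead_after_resource_action(log):
            print(action)
            print("  -> " + dead)
```

If most "is dead" warnings turn up right after a start or stop, that
would point at the resource scripts or the interface setup disturbing
the heartbeat traffic on bond0 rather than at a genuinely failed peer.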
