On Tuesday, 21 May 2013 00:00:03, DaveW wrote:
> We are running heartbeat 2.1.3 on CentOS 5.4.

Man, that is OLD! Any chance of updating to the latest version?

Nikita Michalko

> Last Monday AM, I received a call while getting ready for work. Our
> high-availability server was not responding. The previous Saturday,
> our I.T. admins had re-configured the network to expand IP address
> ranges on some subnets. For whatever reason, this action caused our
> main server (in a two-node HA configuration) to lose its virtual
> interface, rendering our high-availability server unavailable.
>
> The network worked fine; the nodes could ping each other on their
> normal IPs and they could ping the ping node, but the virtual IP (the
> one we REALLY care about) was ignored. Nothing in the logs, no errors,
> nothing. Just an unresponsive virtual server. A manual fail-over
> brought it back quickly as the backup took over. I.T. had done their
> work on Sat and, had I checked our server on Sunday, I would have
> found it "unreachable" with a normal ping.
>
> When my colleague called me, I asked him what "ifconfig" looked like.
> He described three interfaces: eth0, eth1 and lo; no eth0:0. I had him
> initiate the manual fail-over.
>
> After poring over the logs, unable to find anything that indicated a
> problem, I tried to simulate the problem with "ifconfig eth0:0 down".
> Sure enough, no fail-over, no errors, nothing; just (once again) an
> unresponsive server. "ifconfig eth0:0 <IP_ADDRESS> up" brought it
> right back (I tried this last Saturday, BTW, when no one was working).
> It seems that heartbeat (ipfail?) creates this virtual interface when
> it starts, then forgets about it. I presume the assumption is that if
> eth0 remains intact, eth0:0 will remain intact as well.
>
> Am I missing something in the configuration settings or docs? I find
> nothing about configuring the backup node to monitor the virtual
> address, just the other node (which has a different IP and kept
> working after the network changes).
> I am about to set up a service to monitor the virtual IP, but I wanted
> to check with the list first, to see if there is already something
> built in that I have not configured correctly. I have used
> main.company.com and backup.company.com as the two hostnames of the
> nodes. Both systems have these names in an /etc/hosts file, along with
> the hostname and IP of the virtual server and the ping node.
>
> My configuration:
>
> /etc/ha.d/ha.cf:
>
> debugfile /var/log/ha-debug
> logfile /var/log/ha-log
> logfacility local0
> keepalive 2
> deadtime 10
> warntime 3
> initdead 120
> udpport 694
> baud 9600
> serial /dev/ttyS0
> ucast eth1 10.0.0.1
> ucast eth1 10.0.0.2
> auto_failback off
> node main.company.com backup.company.com
> ping 129.196.140.130
> respawn hacluster /usr/lib/heartbeat/ipfail
> deadping 10
>
> /etc/ha.d/haresources:
>
> main.company.com drbddisk::drbd_resource_0
> Filesystem::/dev/drbd0::/usr0::ext3 mysql IPaddr::129.196.140.14 httpd
> smb MailTo::root
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
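
A minimal sketch of the kind of external check described above (a small service that watches the virtual IP). It is not part of heartbeat and not a built-in feature. It assumes Python 3.7+ and the `ip` tool from iproute2 are available on the node, and that the address from haresources lives on eth0. The names addresses_in, vip_present and main are hypothetical, made up for illustration.

```python
import subprocess

VIP = "129.196.140.14"  # IPaddr value from haresources in the original post
IFACE = "eth0"          # assumption: the alias eth0:0 lives on eth0


def addresses_in(ip_output):
    """Collect IPv4 addresses from `ip -o -4 addr show` style output."""
    found = set()
    for line in ip_output.splitlines():
        fields = line.split()
        # One-line format: "<index>: <dev> inet <addr>/<prefix> ..."
        if "inet" in fields:
            pos = fields.index("inet")
            if pos + 1 < len(fields):
                found.add(fields[pos + 1].split("/")[0])
    return found


def vip_present(vip=VIP, iface=IFACE):
    """Return True if `vip` is currently configured on `iface`."""
    result = subprocess.run(
        ["ip", "-o", "-4", "addr", "show", "dev", iface],
        capture_output=True, text=True,
    )
    # A non-zero exit (e.g. unknown interface) counts as "not present".
    return result.returncode == 0 and vip in addresses_in(result.stdout)


def main():
    """Exit-status helper: 0 if the virtual IP is up, 1 if it is missing."""
    return 0 if vip_present() else 1
```

Called as sys.exit(main()) from a cron job or a loop, this only reports whether the address is still there. What to do when it is gone (bring the alias back with ifconfig as in the test above, or trigger the manual fail-over) is left to the caller.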
