Hi,

On Sun, Oct 28, 2007 at 11:19:32PM -0300, [EMAIL PROTECTED] wrote:
> Hi all.
>
>
> I have 2 servers set up as firewalls, with the following
> configuration.
>
> Server Master
> P4 3.0HT
> 2GB Ram
> 4 HD (2 for the system and 2 for the squid cache, firewall, Shaper and BGP-4)
> Motherboard Intel
>
>
> Server Slave
> P4 2.0
> 1GB Ram
> 2 HD
> Motherboard Intel without squid but used for firewall, shaper and BGP-4
>
> What is happening is the following: I have heartbeat installed on the
> two servers, and for the past few days heartbeat has been failing
> and coming back, as shown in the log below, recorded on the main
> server:
>
>
> Oct 22 21:10:53 gateway heartbeat[19084]: WARN: Late heartbeat: Node
> gateway2.domain.com.br: interval 12530 ms
> Oct 22 22:20:37 gateway heartbeat[19084]: WARN: node
> gateway2.domain.com.br: is dead
> Oct 22 22:20:37 gateway heartbeat[19084]: WARN: No STONITH device
> configured.
> Oct 22 22:20:37 gateway heartbeat[19084]: WARN: Shared disks are not
> protected.
> Oct 22 22:20:37 gateway heartbeat[19084]: info: Resources being
> acquired from gateway2.domain.com.br.
> Oct 22 22:20:37 gateway heartbeat[19084]: info: Link
> gateway2.domain.com.br:/dev/ttyS0 dead.
> Oct 22 22:20:38 gateway heartbeat: info: Running /etc/ha.d/rc.d/status
> status
> Oct 22 22:20:38 gateway heartbeat: info: /usr/lib/heartbeat/mach_down:
> nice_failback: foreign resources acquired
> Oct 22 22:20:42 gateway heartbeat[19084]: WARN: Cluster node
> gateway2.domain.com.br returning after partition.
> Oct 22 22:20:42 gateway heartbeat[19084]: WARN: Deadtime value may be
> too small.
> Oct 22 22:20:42 gateway heartbeat[19084]: info: See documentation for
> information on tuning deadtime.
> Oct 22 22:20:42 gateway heartbeat[19084]: info: Link
> gateway2.domain.com.br:/dev/ttyS0 up.
> Oct 22 22:20:42 gateway heartbeat[19084]: WARN: Late heartbeat: Node
> gateway2.domain.com.br: interval 35790 ms

These late heartbeats indicate one of three possible problems: flaky
communications, high load, or a kernel scheduler problem.

Thanks,

Dejan

> Oct 22 22:20:42 gateway heartbeat[19084]: info: Status update for node
> gateway2.domain.com.br: status active
> Oct 22 22:20:42 gateway heartbeat[19084]: info: mach_down takeover complete.
> Oct 22 22:20:42 gateway heartbeat: info: mach_down takeover complete
> for node gateway2.domain.com.br.
> Oct 22 22:20:42 gateway heartbeat[14883]: info: Local Resource
> acquisition completed.
> Oct 22 22:20:42 gateway heartbeat: info: Running /etc/ha.d/rc.d/status
> status
> Oct 22 22:20:44 gateway heartbeat[19084]: info: Heartbeat shutdown in
> progress. (19084)
> Oct 22 22:20:44 gateway heartbeat[16667]: info: Giving up all HA resources.
> Oct 22 22:20:44 gateway heartbeat: info: Releasing resource group:
> gateway.domain.com.br 200.xxx.xxx.xxx/30/eth0 200.xxx.xxx.x6/30/eth1
> 200.xxx.xxx.x7/29/eth2 firewall shaper
> Oct 22 22:20:44 gateway heartbeat: info: Running /etc/init.d/shaper stop
> Oct 22 22:20:46 gateway heartbeat: info: Running /etc/init.d/firewall stop
> Oct 22 22:20:46 gateway heartbeat: info: Running
> /etc/ha.d/resource.d/IPaddr 200.xxx.xxx.x7/29/eth2 stop
> Oct 22 22:20:47 gateway heartbeat: info: Running
> /etc/ha.d/resource.d/IPaddr 200.xxx.xxx.x6/30/eth1 stop
> Oct 22 22:20:47 gateway heartbeat: info: /sbin/route -n del -host
> 200.xxx.xxx.x6
> Oct 22 22:20:47 gateway heartbeat: info: /sbin/ifconfig eth1:0 down
> Oct 22 22:20:47 gateway heartbeat: info: IP Address 200.xxx.xxx.x6 released
> Oct 22 22:20:47 gateway heartbeat: info: Running
> /etc/ha.d/resource.d/IPaddr 200.xxx.xxx.xxx/30/eth0 stop
> Oct 22 22:20:47 gateway heartbeat[16667]: info: All HA resources
> relinquished.
> Oct 22 22:20:47 gateway heartbeat[19084]: WARN: 1 lost packet(s) for
> [gateway2.domain.com.br] [239455:239457]
> Oct 22 22:20:47 gateway heartbeat[19084]: info: No pkts missing from
> gateway2.domain.com.br!
> Oct 22 22:20:48 gateway heartbeat[19084]: info: killing HBFIFO process
> 19086 with signal 15
> Oct 22 22:20:48 gateway heartbeat[19084]: info: killing HBWRITE process
> 19087 with signal 15
> Oct 22 22:20:48 gateway heartbeat[19084]: info: killing HBREAD process
> 19088 with signal 15
> Oct 22 22:20:48 gateway heartbeat[19084]: info: Core process 19088
> exited. 3 remaining
> Oct 22 22:20:48 gateway heartbeat[19084]: info: Core process 19086
> exited. 2 remaining
> Oct 22 22:20:48 gateway heartbeat[19084]: info: Core process 19087
> exited. 1 remaining
> Oct 22 22:20:48 gateway heartbeat[19084]: info: Heartbeat shutdown complete.
> Oct 22 22:20:48 gateway heartbeat[19084]: info: Heartbeat restart triggered.
> Oct 22 22:20:48 gateway heartbeat[19084]: info: Restarting heartbeat.
> Oct 22 22:20:48 gateway heartbeat[19084]: info: Performing heartbeat
> restart exec.
> Oct 22 22:21:19 gateway heartbeat[19084]: info: **************************
> Oct 22 22:21:19 gateway heartbeat[19084]: info: Configuration
> validated. Starting heartbeat 1.2.5
> Oct 22 22:21:19 gateway heartbeat[19947]: info: heartbeat: version 1.2.5
> Oct 22 22:21:19 gateway heartbeat[19947]: info: Heartbeat generation: 23
> Oct 22 22:21:20 gateway heartbeat[19947]: info: Starting serial
> heartbeat on tty /dev/ttyS0 (19200 baud)
> Oct 22 22:21:20 gateway heartbeat[19947]: info: pid 19947 locked in memory.
> Oct 22 22:21:20 gateway heartbeat[19947]: info: Local status now set to:
> 'up'
> Oct 22 22:21:21 gateway heartbeat[19949]: info: pid 19949 locked in memory.
> Oct 22 22:21:21 gateway heartbeat[19950]: info: pid 19950 locked in memory.
> Oct 22 22:21:21 gateway heartbeat[19951]: info: pid 19951 locked in memory.
> Oct 22 22:21:21 gateway heartbeat[19947]: WARN: string2msg_ll: node
> [gateway2.domain.com.br] failed authentication
> Oct 22 22:21:22 gateway heartbeat[19947]: info: Link
> gateway2.domain.com.br:/dev/ttyS0 up.
> Oct 22 22:21:22 gateway heartbeat[19947]: info: Status update for node
> gateway2.domain.com.br: status active
> Oct 22 22:21:22 gateway heartbeat[19947]: info: Local status now set
> to: 'active'
> Oct 22 22:21:22 gateway heartbeat: info: Running /etc/ha.d/rc.d/status
> status
> Oct 22 22:21:22 gateway heartbeat[19947]: info: remote resource
> transition completed.
> Oct 22 22:21:22 gateway heartbeat[19947]: info: remote resource
> transition completed.
> Oct 22 22:21:22 gateway heartbeat[19947]: info: Local Resource
> acquisition completed. (none)
> Oct 22 22:21:23 gateway heartbeat[19947]: info: gateway2.domain.com.br
> wants to go standby [foreign]
> Oct 22 22:21:35 gateway heartbeat[19947]: info: standby: acquire
> [foreign] resources from gateway2.domain.com.br
> Oct 22 22:21:35 gateway heartbeat[19956]: info: acquire local HA
> resources (standby).
> Oct 22 22:21:35 gateway heartbeat: info: Acquiring resource group:
> gateway.domain.com.br 200.xxx.xxx.xxx/30/eth0 200.xxx.xxx.x6/30/eth1
> 200.xxx.xxx.x7/29/eth2 firewall shaper
> Oct 22 22:21:35 gateway heartbeat: info: Running
> /etc/ha.d/resource.d/IPaddr 200.xxx.xxx.xxx/30/eth0 start
> Oct 22 22:21:35 gateway heartbeat: info: /sbin/ifconfig eth0:0
> 200.xxx.xxx.xxx netmask 255.255.255.252 broadcast 200.208.220.131
> Oct 22 22:21:35 gateway heartbeat: info: Sending Gratuitous Arp for
> 200.xxx.xxx.xxx on eth0:0 [eth0]
> Oct 22 22:21:35 gateway heartbeat: /usr/lib/heartbeat/send_arp -i 1010
> -r 5 -p /var/lib/heartbeat/rsctmp/send_arp/send_arp-200.xxx.xxx.xxx
> eth0 200.xxx.xxx.xxx auto 200.xxx.xxx.xxx ffffffffffff
> Oct 22 22:21:35 gateway heartbeat: info: Running
> /etc/ha.d/resource.d/IPaddr 200.xxx.xxx.x6/30/eth1 start
> Oct 22 22:21:35 gateway heartbeat: info: /sbin/ifconfig eth1:0
> 200.xxx.xxx.x6 netmask 255.255.255.252 broadcast 200.208.223.67
> Oct 22 22:21:35 gateway heartbeat: info: Sending Gratuitous Arp for
> 200.xxx.xxx.x6 on eth1:0 [eth1]
> Oct 22 22:21:35 gateway heartbeat: /usr/lib/heartbeat/send_arp -i 1010
> -r 5 -p /var/lib/heartbeat/rsctmp/send_arp/send_arp-200.xxx.xxx.x6 eth1
> 200.xxx.xxx.x6 auto 200.xxx.xxx.x6 ffffffffffff
> Oct 22 22:21:36 gateway heartbeat: info: Running
> /etc/ha.d/resource.d/IPaddr 200.xxx.xxx.x7/29/eth2 start
> Oct 22 22:21:36 gateway heartbeat: info: /sbin/ifconfig eth2:0
> 200.xxx.xxx.x7 netmask 255.255.255.248 broadcast 200.208.220.151
> Oct 22 22:21:36 gateway heartbeat: info: Sending Gratuitous Arp for
> 200.xxx.xxx.x7 on eth2:0 [eth2]
> Oct 22 22:21:36 gateway heartbeat: /usr/lib/heartbeat/send_arp -i 1010
> -r 5 -p /var/lib/heartbeat/rsctmp/send_arp/send_arp-200.xxx.xxx.x7 eth2
> 200.xxx.xxx.x7 auto 200.xxx.xxx.x7 ffffffffffff
> Oct 22 22:21:36 gateway heartbeat: info: Running /etc/init.d/firewall start
> Oct 22 22:21:36 gateway heartbeat: info: Running /etc/init.d/shaper start
> Oct 22 22:21:41 gateway heartbeat[19956]: info: local HA resource
> acquisition completed (standby).
> Oct 22 22:21:41 gateway heartbeat[19947]: info: Standby resource
> acquisition done [foreign].
> Oct 22 22:21:41 gateway heartbeat[19947]: info: Initial resource
> acquisition complete (auto_failback)
> Oct 22 22:21:41 gateway heartbeat[19947]: info: remote resource
> transition completed.
>
> ----------------------------------------------------------------
> Conectcor - velocidade com qualidade
> www.conectcor.com.br
>
>
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
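
To see how late the heartbeats in a log like the one quoted above
actually are, the "Late heartbeat" warnings can be pulled out with a
short script. What follows is only a minimal Python sketch and not part
of heartbeat: the function name late_heartbeats and the sample text are
made up for illustration, and the regular expression assumes the
"Late heartbeat: Node <name>: interval <N> ms" wording seen in the
quoted log.

```python
import re
from collections import defaultdict

# Assumed message format, taken from the quoted log:
#   ... WARN: Late heartbeat: Node gateway2.domain.com.br: interval 12530 ms
LATE_RE = re.compile(r"Late heartbeat: Node\s+(\S+?):\s+interval\s+(\d+)\s+ms")


def late_heartbeats(log_text):
    """Return {node: [interval_ms, ...]} for every 'Late heartbeat' warning."""
    # Mail clients wrap long log lines, so join the lines again and drop
    # any ">" quote marker at the start of a continuation line.
    flat = re.sub(r"\s*\n>?\s*", " ", log_text)
    found = defaultdict(list)
    for node, ms in LATE_RE.findall(flat):
        found[node].append(int(ms))
    return dict(found)


# Hypothetical sample: the two warnings from the quoted log, wrapped as mailed.
sample = """\
Oct 22 21:10:53 gateway heartbeat[19084]: WARN: Late heartbeat: Node
gateway2.domain.com.br: interval 12530 ms
Oct 22 22:20:42 gateway heartbeat[19084]: WARN: Late heartbeat: Node
gateway2.domain.com.br: interval 35790 ms
"""

for node, intervals in late_heartbeats(sample).items():
    print(node, "late:", len(intervals), "worst:", max(intervals), "ms")
```

The worst interval per node is the figure to hold against the
configured deadtime, which is presumably what the "Deadtime value may
be too small" warning in the log refers to.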
