> Is the ha.cf identical on both nodes? Firewall rules?
> Your logs are typical of a setup where the nodes do not communicate.

File should be the same. On vaha01 running the apps:

[EMAIL PROTECTED] crm]# sum cib.xml
41504 7

Then on vaha02:

[EMAIL PROTECTED] crm]# rm -f /var/log/ha-*
[EMAIL PROTECTED] crm]# scp vaha01:/var/lib/heartbeat/crm/cib.xml .
[EMAIL PROTECTED]'s password:
cib.xml 100% 6897 6.7KB/s 00:00
[EMAIL PROTECTED] crm]# sum cib.xml
41504 7

Then I start heartbeat on vaha02. After starting I look with tcpdump on vaha02:

10:17:19.739352 IP vaha02.epilotcolo.eliberation.com.32784 > 172.16.5.255.ha-cluster: UDP, length 193
10:17:20.532226 IP vaha01.epilotcolo.eliberation.com.32769 > 172.16.5.255.ha-cluster: UDP, length 173
10:17:20.734297 IP vaha02.epilotcolo.eliberation.com.32784 > 172.16.5.255.ha-cluster: UDP, length 170
10:17:20.734919 IP vaha02.epilotcolo.eliberation.com.32784 > 172.16.5.255.ha-cluster: UDP, length 193

Looks like they are communicating but I get errors in the log:

[EMAIL PROTECTED] crm]# egrep 'ERROR|WARN' /var/log/ha-log|head -3
heartbeat[12877]: 2006/08/29_10:17:19 ERROR: Message hist queue is filling up (151 messages in queue)
heartbeat[12877]: 2006/08/29_10:17:19 ERROR: Message hist queue is filling up (152 messages in queue)
heartbeat[12877]: 2006/08/29_10:17:20 ERROR: Message hist queue is filling up (153 messages in queue)

"iptables -L -n" returns no rules on both machines and pinging and the routing tables look good (NFS and other programs work fine).

> "just fine until I upgraded some RPMs" doesn't sound too good;
> was it just heartbeat you updated, or what did you change?
> Your phrasing indicates that the breakage prompted you to upgrade heartbeat
> as the second step, but then I'd venture that heartbeat isn't responsible, just a victim ...
>
> Which heartbeat packages did you install? Did you build them yourself on the RHEL boxes?

I upgraded Redhat RPM packages (not heartbeat) and heartbeat 2.0.4 broke.
So then I installed heartbeat 2.0.7-1 which I built from source with rpmbuild:

[EMAIL PROTECTED] crm]# rpm -q heartbeat
heartbeat-2.0.7-1

Output of "ccm_tool -p" on both machines:

[EMAIL PROTECTED] crm]# ccm_tool -p
vaha01
[EMAIL PROTECTED] crm]#

[EMAIL PROTECTED] crm]# ccm_tool -p
[EMAIL PROTECTED] crm]#

That last one doesn't look too good. :-)

_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
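
For anyone chasing the same "Message hist queue is filling up" symptom, here is a rough sketch for checking whether the queue length keeps climbing instead of eyeballing the egrep output. It assumes the log lines keep exactly the format quoted in this message. The names hist_queue_counts and is_growing are made up for illustration and are not part of heartbeat.

```python
import re

# Matches ha-log lines in the format quoted in this message, e.g.
# heartbeat[12877]: 2006/08/29_10:17:19 ERROR: Message hist queue is
# filling up (151 messages in queue)
HIST_QUEUE_RE = re.compile(
    r"^heartbeat\[(?P<pid>\d+)\]: (?P<stamp>\S+) ERROR: "
    r"Message hist queue is filling up \((?P<count>\d+) messages in queue\)"
)


def hist_queue_counts(lines):
    """Return (timestamp, queue_length) pairs for every matching log line."""
    pairs = []
    for line in lines:
        match = HIST_QUEUE_RE.match(line.strip())
        if match:
            pairs.append((match.group("stamp"), int(match.group("count"))))
    return pairs


def is_growing(pairs):
    """True if the queue length never drops and ends higher than it started."""
    counts = [count for _, count in pairs]
    if len(counts) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(counts, counts[1:]))
    return never_drops and counts[-1] > counts[0]


# The three lines from the log excerpt in this message.
SAMPLE = [
    "heartbeat[12877]: 2006/08/29_10:17:19 ERROR: Message hist queue is filling up (151 messages in queue)",
    "heartbeat[12877]: 2006/08/29_10:17:19 ERROR: Message hist queue is filling up (152 messages in queue)",
    "heartbeat[12877]: 2006/08/29_10:17:20 ERROR: Message hist queue is filling up (153 messages in queue)",
]

if __name__ == "__main__":
    samples = hist_queue_counts(SAMPLE)
    for stamp, count in samples:
        print(stamp, count)
    print("growing:", is_growing(samples))
```

To run it against a real log, pass the lines of an open file handle on /var/log/ha-log to hist_queue_counts in place of SAMPLE. A steadily growing count only shows that the queue is not draining; it does not say why.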
