> Is the ha.cf identical on both nodes? Firewall rules? 
> Your logs are typical of a setup where the nodes do not communicate.

The file should be the same. On vaha01, which is running the apps:

[EMAIL PROTECTED] crm]# sum cib.xml
41504     7

Then on vaha02:

[EMAIL PROTECTED] crm]# rm -f /var/log/ha-*
[EMAIL PROTECTED] crm]# scp vaha01:/var/lib/heartbeat/crm/cib.xml .
[EMAIL PROTECTED]'s password:
cib.xml
100% 6897     6.7KB/s   00:00
[EMAIL PROTECTED] crm]# sum cib.xml
41504     7
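
The manual comparison above could also be wrapped in a small helper. This
is only a sketch: the function name same_sum is made up here, and it uses
cksum rather than the sum command shown above, since cksum reading from
stdin prints just the CRC and byte count.

```shell
# Hypothetical helper (not part of heartbeat): succeed only if two
# files have the same checksum and size.
same_sum() {
    # "cksum < file" prints "<crc> <bytes>" with no file name attached.
    sum_a=$(cksum < "$1") || return 2
    sum_b=$(cksum < "$2") || return 2
    [ "$sum_a" = "$sum_b" ]
}

# Example (the /tmp path is an assumption): compare the local cib.xml
# with a copy fetched from the other node.
# scp vaha01:/var/lib/heartbeat/crm/cib.xml /tmp/cib.xml.vaha01
# same_sum /var/lib/heartbeat/crm/cib.xml /tmp/cib.xml.vaha01 && echo identical
```

The same helper would work for ha.cf or any other file that has to match
on both nodes.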

Then I start heartbeat on vaha02. Once it is running, I watch the traffic
with tcpdump on vaha02:

10:17:19.739352 IP vaha02.epilotcolo.eliberation.com.32784 > 172.16.5.255.ha-cluster: UDP, length 193
10:17:20.532226 IP vaha01.epilotcolo.eliberation.com.32769 > 172.16.5.255.ha-cluster: UDP, length 173
10:17:20.734297 IP vaha02.epilotcolo.eliberation.com.32784 > 172.16.5.255.ha-cluster: UDP, length 170
10:17:20.734919 IP vaha02.epilotcolo.eliberation.com.32784 > 172.16.5.255.ha-cluster: UDP, length 193
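
To see at a glance whether both nodes are broadcasting, tcpdump text
output like the above could be summarized per sender. A rough sketch,
assuming one packet per line and a numeric source port; the function name
senders_seen is made up here.

```shell
# Hypothetical helper: count packets per source host in tcpdump text
# output read from stdin.
senders_seen() {
    awk '$2 == "IP" {
             src = $3
             sub(/\.[0-9]+$/, "", src)   # strip the trailing ".port"
             n[src]++
         }
         END { for (s in n) print s, n[s] }' | sort
}
```

Saved capture text can be piped through it (for example
"senders_seen < capture.txt", where capture.txt is a hypothetical file).
A node that never shows up in the result is not being heard on that
interface.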

It looks like they are communicating, but I get errors in the log:

[EMAIL PROTECTED] crm]# egrep 'ERROR|WARN' /var/log/ha-log|head -3
heartbeat[12877]: 2006/08/29_10:17:19 ERROR: Message hist queue is filling up (151 messages in queue)
heartbeat[12877]: 2006/08/29_10:17:19 ERROR: Message hist queue is filling up (152 messages in queue)
heartbeat[12877]: 2006/08/29_10:17:20 ERROR: Message hist queue is filling up (153 messages in queue)
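
To follow how fast the queue grows, the depth could be pulled out of those
lines. A minimal sketch, assuming each log entry is a single line in the
format shown above; the function name hist_queue_depths is made up here.

```shell
# Hypothetical helper: print "<timestamp> <queue depth>" for every
# "Message hist queue is filling up" line read from stdin.
hist_queue_depths() {
    sed -n 's/^.*: \([0-9/]*_[0-9:]*\) ERROR: Message hist queue is filling up (\([0-9]*\) messages in queue).*$/\1 \2/p'
}
```

For example, "hist_queue_depths < /var/log/ha-log | tail -1" would show the
most recent depth; lines that do not match the pattern are skipped.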

"iptables -L -n" shows no rules on either machine, and both ping and the
routing tables look good (NFS and other programs work fine).

> "just fine until I upgraded some RPMs" doesn't sound too good;
> was it just heartbeat you updated, or what did you change?
> Your phrasing indicates that the breakage prompted you to upgrade
> heartbeat as the second step, but then I'd venture that heartbeat
> isn't responsible, just a victim ...
>
> Which heartbeat packages did you install? Did you build them yourself
> on the RHEL boxes?

I upgraded Red Hat RPM packages (not heartbeat), and heartbeat 2.0.4
broke. I then installed heartbeat 2.0.7-1, which I built from source
with rpmbuild:

[EMAIL PROTECTED] crm]# rpm -q heartbeat
heartbeat-2.0.7-1

Output of "ccm_tool -p" on both machines:

[EMAIL PROTECTED] crm]# ccm_tool -p
vaha01
[EMAIL PROTECTED] crm]#

[EMAIL PROTECTED] crm]# ccm_tool -p
[EMAIL PROTECTED] crm]#

That last one doesn't look too good. :-)
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
