Hi,

I'm using the stable release of Heartbeat 3.0.3 on RHEL 6.1 x64 and have found the following issue: if I restart the network service, heartbeat no longer sends broadcast packets from port 694. As a result, the node never gets a chance to rejoin the HA cluster, short of restarting it.

Details for setting cluster:
============================
1. Compile heartbeat 3.0.3 from source and install it on two RHEL 6.1 x64 nodes: installer001 and rhel61.
2. Compile pacemaker 1.0.9 from source and install it on both nodes.
3. Configure /etc/ha.d/ha.cf and make sure both nodes are Online according to "crm status".
4. Run "tcpdump -i eth0 port 694"; both nodes can be seen sending heartbeat broadcast packets.

Details of configuration file:
=============================
[root@rhel61 ~]# cat /etc/ha.d/ha.cf
autojoin none
bcast eth0
warntime 5
deadtime 15
initdead 60
keepalive 2
node installer001
node rhel61
crm respawn

Then I restarted the network service on the backup node "installer001" (or just ran "ifdown eth0; ifup eth0"). Node "rhel61" immediately detected "installer001" as "offline", and "installer001" detected "rhel61" as "offline".

I then ran "tcpdump -i eth0 port 694" on "installer001" again. It shows "rhel61" still sending broadcast packets, but no broadcast packets coming from "installer001", even though the "eth0" network has fully recovered by now.

I tried exactly the same case on RHEL 5.6 (heartbeat 3.0.3) and it works well: after the network restart, the node still sends out broadcast packets...

Thanks for your comments.

--Lei
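
The tcpdump check described above shows traffic from both nodes at once. To see only what the local node sends, the capture can be filtered by source address. The following is a minimal sketch, not part of the original report. The helper names hb_filter, hb_local_ip and hb_check are hypothetical, and it assumes heartbeat sends its broadcasts from the first IPv4 address configured on the bcast interface.

```shell
#!/bin/sh
# Sketch: check whether this node still emits heartbeat broadcasts.
# hb_filter, hb_local_ip and hb_check are hypothetical helper names.

# Build a capture filter matching only port 694 traffic whose source
# address is the given IP (assumed to be this node's own address).
hb_filter() {
    printf 'port 694 and src host %s\n' "$1"
}

# Print the first IPv4 address configured on the given interface.
hb_local_ip() {
    ip -4 -o addr show dev "$1" | awk '{ split($4, a, "/"); print a[1]; exit }'
}

# Capture up to 3 matching packets, giving up after 10 seconds
# (the ha.cf above uses keepalive 2). Needs root.
# Returns 0 if 3 packets were captured, non-zero otherwise.
hb_check() {
    iface=$1
    ip_addr=$(hb_local_ip "$iface")
    [ -n "$ip_addr" ] || { echo "no IPv4 address on $iface" >&2; return 2; }
    timeout 10 tcpdump -n -c 3 -i "$iface" "$(hb_filter "$ip_addr")"
}
```

After sourcing these functions as root on "installer001", "hb_check eth0" run before and after "ifdown eth0; ifup eth0" should make the difference visible: if the node has stopped broadcasting, the capture would be expected to time out without printing any packets.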