Some more info: I compared the open ports with those of another HA installation that works and found a difference. The working HA cluster has one more port open for heartbeat:
Proto Recv-Q Send-Q Local Address  Foreign Address  State  User  Inode   PID/Program name
raw   0      0      0.0.0.0:1      0.0.0.0:*        7      0     758777  22787/heartbeat: wr

which the machine discussed here does not have.

Best regards
Peter

Peter P GMX wrote:
> Hello Michael,
>
> I updated to 2.99-3 (pacemaker-heartbeat package for Ubuntu Hardy) and
> still have the same behaviour. FS3 still ignores that FS2 is up and vice
> versa.
> (Btw: for auth method CRC it complained that a shared secret is not
> valid, so I shouldn't use one.)
>
> On each server I can
> ping fs2
> ping fs3
> and see the right eth2 IP.
> Also I can correctly ping the shared IP on the other server. And I can
> ping the shared IPs from a 3rd server as soon as they are active.
>
> So fs3 is still taking over as it considers fs2 dead, and vice versa.
> And normally, when I start up heartbeat on the second server, I should see
> in the 1st server's ha-debug.log that the 2nd server has started up.
> This is also not the case here. It seems that both servers do not see
> each other through HA, although there is a lot of traffic between them on
> port 694 (and DRBD also works via eth2).
> What also irritates me is that both servers can acquire the common IP
> resource. HA should check before it assigns a new IP, right?
>
> Any idea where to search? Are there other ports or messages involved?
>
> Best regards
> Peter
>
>
> Michael Schwartzkopff wrote:
>
>> On Thursday, 24 September 2009 20:14:22, Peter P GMX wrote:
>>
>>> Hello,
>>> This is the first cluster I have set up with heartbeat version 2.1.3.
>>> Some setups with older versions worked fine.
>>>
>> Do yourself a favor and do not use that old version any more. It's buggy.
>>
>>> However I have a problem with a 2 node cluster under Ubuntu Server 8.04.3:
>>> Machine fs2 comes up fine, starts 2 shared IPs and mysql.
>>> Then heartbeat on Machine fs3 is started. It immediately considers fs2
>>> as dead and starts 2 shared IPs and mysql.
>>>
>>> I can ngrep the traffic on UDP port 694 on both machines and see that
>>> heartbeat messages are sent from both machines and are received on the
>>> other machine on both network interfaces I configured with HA. Auth is
>>> kept simple in order to avoid additional points of failure.
>>>
>>> Here are the configs and logs:
>>> (I anonymized the domain name and public IPs)
>>>
>>> [Ubuntu] r...@fs2:/etc/ha.d# uname -n
>>> fs2.my.domain.de
>>>
>>> [Ubuntu] r...@fs3:/etc/ha.d# uname -n
>>> fs3.my.domain.de
>>>
>>> [Ubuntu] r...@fs2:/etc/ha.d# cat ha.cf
>>> auto_failback off
>>> ucast eth2 10.255.0.193
>>> ucast eth2 10.255.0.194
>>> ucast eth0 85.xxx.xx.163
>>> ucast eth0 85.xxx.xx.164
>>> ping 85.xxx.xx.165
>>> bcast eth2
>>> bcast eth0
>>> warntime 2
>>> deadtime 5
>>> initdead 10
>>> keepalive 1
>>> udpport 694
>>> node fs2.my.domain.de
>>> node fs3.my.domain.de
>>> debugfile /var/log/ha-debug.log
>>> logfile /var/log/ha-log.log
>>> respawn hacluster /usr/lib/heartbeat/ipfail
>>>
>>> [Ubuntu] r...@fs2:/etc/ha.d# cat authkeys
>>> auth 1
>>> 1 crc
>>>
>> What is the shared secret? You need to put a shared secret in.
>>
>>> [Ubuntu] r...@fs2:/etc/ha.d# cat haresources
>>> fs2.my.domain.de 85.xxx.xx.165 10.255.0.195 mysql
>>>
>> Please consider using the CRM variant for configuration and drop haresources.
>>
>>> Ngrepping the traffic when fs3 takes over:
>>> ======================== until here FS3 status is UP after Start =========================
>>> U 2009/09/24 19:59:25.918590 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=up
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=e
>>> hg=4aa92fbc
>>> ts=4abbb37d
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 48236807
>>> <<<
>>> .
>>> #
>>> ======================== FS2 still making Heartbeats =========================
>>> U 2009/09/24 19:59:26.040197 10.255.0.193:46920 -> 10.255.0.194:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs2.my.domain.de
>>> (1)srcuuid=MkXu7dETSGG21s6IhhOETA==
>>> seq=1fe
>>> hg=4aa922ac
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/287 26822
>>> ttl=4
>>> auth=1 75f50f20
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.040220 85.xxx.xx.163:53661 -> 85.xxx.xx.164:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs2.my.domain.de
>>> (1)srcuuid=MkXu7dETSGG21s6IhhOETA==
>>> seq=1fe
>>> hg=4aa922ac
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/287 26822
>>> ttl=4
>>> auth=1 75f50f20
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.040224 10.255.0.193:46574 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs2.my.domain.de
>>> (1)srcuuid=MkXu7dETSGG21s6IhhOETA==
>>> seq=1fe
>>> hg=4aa922ac
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/287 26822
>>> ttl=4
>>> auth=1 75f50f20
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.040242 85.xxx.xx.163:41366 -> 85.xxx.xx.191:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs2.my.domain.de
>>> (1)srcuuid=MkXu7dETSGG21s6IhhOETA==
>>> seq=1fe
>>> hg=4aa922ac
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/287 26822
>>> ttl=4
>>> auth=1 75f50f20
>>> <<<
>>> .
>>> #
>>> ======================== here FS3 starts to become ACTIVE =========================
>>> U 2009/09/24 19:59:26.416790 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416805 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416812 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416817 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416834 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416854 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> ======================== here FS3 status is STARTING =========================
>>> U 2009/09/24 19:59:26.416896 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416918 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416930 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416945 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416965 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416984 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417038 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417052 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417057 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417066 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417081 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417100 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417769 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417793 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417794 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417815 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417818 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> ======================== here FS3 sends STONITH =========================
>>> U 2009/09/24 19:59:26.417841 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417845 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417861 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417872 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417885 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417969 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417991 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>>
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
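
To compare what each node actually puts on the wire, the ngrep dump quoted above can be summarised per sender. The following is only a minimal sketch in Python, assuming the ngrep text layout seen in that dump (a "U <date> <time> <src>:<port> -> <dst>:<port>" header followed by key=value lines, optionally prefixed with mail quote markers). The function name summarize_heartbeats is made up for this illustration and is not part of heartbeat or ngrep.

```python
import re
import sys
from collections import defaultdict

# ngrep header for one UDP packet, e.g.
# "U 2009/09/24 19:59:26.040197 10.255.0.193:46920 -> 10.255.0.194:694"
HEADER = re.compile(r"^U \S+ \S+ (\S+):\d+ -> (\S+):(\d+)")
# heartbeat payload field such as "t=status" or "(1)srcuuid=..."
FIELD = re.compile(r"^([\w()]+)=(.*)$")


def summarize_heartbeats(text, port=694):
    """Return {sending node: set of (message type, state, destination IP)}."""
    seen = defaultdict(set)
    dst, fields = None, {}

    def flush():
        # Record the packet collected so far, if it named its sender.
        if dst is not None and "src" in fields:
            seen[fields["src"]].add((fields.get("t"), fields.get("st"), dst))

    for raw in text.splitlines():
        # Drop mail quote markers (">>> ") so a quoted dump can be pasted as is.
        line = raw.lstrip("> ").strip()
        header = HEADER.match(line)
        if header:
            flush()
            # Only keep packets sent to the heartbeat port.
            dst = header.group(2) if int(header.group(3)) == port else None
            fields = {}
            continue
        field = FIELD.match(line)
        if field and dst is not None:
            fields[field.group(1)] = field.group(2)
    flush()
    return dict(seen)


if __name__ == "__main__":
    # Usage (assumed): ngrep ... > dump.txt; python summarize.py < dump.txt
    for node, entries in sorted(summarize_heartbeats(sys.stdin.read()).items()):
        for msg_type, state, dest in sorted(entries, key=str):
            print(f"{node}: t={msg_type} st={state} -> {dest}")
```

Run against a capture taken on each node, this shows which message types each sender emitted and to which addresses. It only reflects what ngrep saw on the wire, not whether heartbeat accepted the packets.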

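
On the shared-secret question discussed in the thread: in authkeys, the crc method is a plain checksum and takes no key, while the keyed methods (sha1, md5) take a secret as a third field on the key line. Below is a small sketch in Python that renders such a file, assuming sha1 is the method wanted. The helper names render_authkeys and write_authkeys are invented for this example, and the secret is randomly generated rather than taken from this thread.

```python
import os
import secrets


def render_authkeys(secret, index=1, method="sha1"):
    """Return authkeys text: 'auth N' selects the key line 'N method [secret]'."""
    if method == "crc":
        # crc is only a checksum, so the key line carries no secret.
        return f"auth {index}\n{index} crc\n"
    return f"auth {index}\n{index} {method} {secret}\n"


def write_authkeys(path, text):
    """Write the file readable by its owner only (mode 0600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    # The mode passed to os.open() only applies to new files, so tighten
    # the permissions of a pre-existing file as well.
    os.chmod(path, 0o600)


if __name__ == "__main__":
    # Print a candidate file instead of touching /etc/ha.d directly.
    print(render_authkeys(secrets.token_hex(16)), end="")
```

The same file, with the same secret, has to be present on both nodes; heartbeat expects it to be mode 600.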