Some more info:

I compared the open ports with those of another HA installation that works
and found a difference.
The working HA cluster has one more socket open for heartbeat:

  Proto Recv-Q Send-Q Local Address  Foreign Address  State  User  Inode   PID/Program name
  raw        0      0 0.0.0.0:1      0.0.0.0:*        7      0     758777  22787/heartbeat: wr

which the machine discussed here does not have.
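
For anyone who wants to repeat this comparison on both nodes, here is a
minimal Python sketch that picks the heartbeat-owned raw sockets out of
netstat-style text. It assumes the column layout shown above (proto, queues,
local and foreign address, state, user, inode, PID/program); the function
name heartbeat_raw_sockets and the SAMPLE text are only illustrative. As far
as I know, netstat prints the IP protocol number in place of a port for raw
sockets, so ":1" would typically mean ICMP rather than a UDP/TCP port.

```python
def heartbeat_raw_sockets(netstat_text):
    """Return (local_address, program) pairs for raw sockets owned by heartbeat.

    Assumes each raw-socket line has the columns:
    proto recv-q send-q local foreign state user inode pid/program...
    """
    hits = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, short lines and anything that is not a raw socket.
        if len(fields) < 9 or not fields[0].startswith("raw"):
            continue
        # The program name may contain spaces ("heartbeat: wr"), so rejoin it.
        program = " ".join(fields[8:])
        if "heartbeat" in program:
            hits.append((fields[3], program))
    return hits


# Sample line taken from the working cluster's output quoted above.
SAMPLE = (
    "raw        0      0 0.0.0.0:1      0.0.0.0:*        "
    "7      0     758777  22787/heartbeat: wr"
)

if __name__ == "__main__":
    for local, program in heartbeat_raw_sockets(SAMPLE):
        print(local, program)
```

Running the same function over the output of both machines and comparing
the two lists should show whether the raw socket is really missing on the
failing node.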

Best regards
Peter

Peter P GMX wrote:
> Hello Michael,
>
> I updated to 2.99-3 (pacemaker-heartbeat package for Ubuntu Hardy) and
> still have the same behaviour. FS3 still ignores that FS2 is up and vice
> versa.
> (By the way: for auth method CRC it complained that a shared secret is
> not valid, so I shouldn't use one.)
>
> On each server I can
>     ping fs2
>     ping fs3
>  and see the right eth2 IP.
> Also I can correctly ping the shared IP on the other server. And I can
> ping the shared IPs from a 3rd server as soon as they are active.
>
> So fs3 is still taking over because it considers fs2 dead, and vice versa.
> Normally, when I start heartbeat on the second server, I should see in
> the first server's ha-debug.log that the second server has started up.
> This is not the case here either. It seems that the two servers do not
> see each other through HA, although there is a lot of traffic between
> them on port 694 (and DRBD also works via eth2).
> What also puzzles me is that both servers can acquire the common IP
> resource. HA should check before it assigns a new IP, right?
>
> Any idea where to search? Are there other ports or messages involved?
>
> Best regards
> Peter
>
>
> Michael Schwartzkopff wrote:
>   
>> On Thursday, 24 September 2009 20:14:22, Peter P GMX wrote:
>>   
>>     
>>> Hello,
>>> This is the first cluster I have set up with heartbeat version 2.1.3.
>>> Some setups with older versions worked fine.
>>>     
>>>       
>> Do yourself a favor and do not use that old version any more. It's buggy.
>>
>>   
>>     
>>> However, I have a problem with a 2-node cluster under Ubuntu Server 8.04.3:
>>> Machine fs2 comes up fine, starts 2 shared IPs and mysql.
>>> Then heartbeat on Machine fs3 is started. It immediately considers fs2
>>> as dead and starts 2 shared IPs and mysql.
>>>
>>> I can ngrep the traffic on UDP port 694 on both machines and see that
>>> heartbeat messages are sent from both machines and are received on the
>>> other machine on both network interfaces I configured with HA. Auth is
>>> kept simple in order to avoid additional points of failure.
>>>
>>> Here are the configs and logs:
>>> (I anonymized the domain name and public IPs)
>>>
>>> [Ubuntu] r...@fs2:/etc/ha.d# uname -n
>>> fs2.my.domain.de
>>>
>>> [Ubuntu] r...@fs3:/etc/ha.d# uname -n
>>> fs3.my.domain.de
>>>
>>> [Ubuntu] r...@fs2:/etc/ha.d# cat ha.cf
>>> auto_failback off
>>> ucast eth2 10.255.0.193
>>> ucast eth2 10.255.0.194
>>> ucast eth0 85.xxx.xx.163
>>> ucast eth0 85.xxx.xx.164
>>> ping 85.xxx.xx.165
>>> bcast eth2
>>> bcast eth0
>>> warntime 2
>>> deadtime 5
>>> initdead 10
>>> keepalive 1
>>> udpport 694
>>> node fs2.my.domain.de
>>> node fs3.my.domain.de
>>> debugfile /var/log/ha-debug.log
>>> logfile /var/log/ha-log.log
>>> respawn hacluster /usr/lib/heartbeat/ipfail
>>>
>>> [Ubuntu] r...@fs2:/etc/ha.d# cat authkeys
>>> auth 1
>>> 1 crc
>>>     
>>>       
>> What is the shared secret? You need to put a shared secret in.
>>
>>   
>>     
>>> [Ubuntu] r...@fs2:/etc/ha.d# cat haresources
>>> fs2.my.domain.de 85.xxx.xx.165 10.255.0.195 mysql
>>>     
>>>       
>> Please consider using the CRM variant for configuration and drop haresources.
>>
>>   
>>     
>>> Ngrepping the traffic when fs3 takes over:
>>> ======================== until here FS3 status is UP after Start =========================
>>> U 2009/09/24 19:59:25.918590 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=up
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=e
>>> hg=4aa92fbc
>>> ts=4abbb37d
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 48236807
>>> <<<
>>> .
>>> #
>>> ======================== FS2 still making Heartbeats =========================
>>> U 2009/09/24 19:59:26.040197 10.255.0.193:46920 -> 10.255.0.194:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs2.my.domain.de
>>> (1)srcuuid=MkXu7dETSGG21s6IhhOETA==
>>> seq=1fe
>>> hg=4aa922ac
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/287 26822
>>> ttl=4
>>> auth=1 75f50f20
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.040220 85.xxx.xx.163:53661 -> 85.xxx.xx.164:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs2.my.domain.de
>>> (1)srcuuid=MkXu7dETSGG21s6IhhOETA==
>>> seq=1fe
>>> hg=4aa922ac
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/287 26822
>>> ttl=4
>>> auth=1 75f50f20
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.040224 10.255.0.193:46574 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs2.my.domain.de
>>> (1)srcuuid=MkXu7dETSGG21s6IhhOETA==
>>> seq=1fe
>>> hg=4aa922ac
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/287 26822
>>> ttl=4
>>> auth=1 75f50f20
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.040242 85.xxx.xx.163:41366 -> 85.xxx.xx.191:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs2.my.domain.de
>>> (1)srcuuid=MkXu7dETSGG21s6IhhOETA==
>>> seq=1fe
>>> hg=4aa922ac
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/287 26822
>>> ttl=4
>>> auth=1 75f50f20
>>> <<<
>>> .
>>> #
>>> ======================== here FS3 starts to become ACTIVE =========================
>>> U 2009/09/24 19:59:26.416790 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416805 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416812 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416817 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416834 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416854 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=10
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 c2ed5f24
>>> <<<
>>> .
>>> #
>>> ======================== here FS3 status is STARTING =========================
>>> U 2009/09/24 19:59:26.416896 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416918 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416930 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416945 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416965 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.416984 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=starting
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=11
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 6/159 486
>>> ttl=4
>>> auth=1 9d0f2a5e
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417038 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417052 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417057 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417066 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417081 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417100 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=active
>>> dt=1b58
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=12
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/159 486
>>> ttl=4
>>> auth=1 631136a6
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417769 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417793 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417794 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417815 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417818 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> ======================== here FS3 sends STONITH =========================
>>> U 2009/09/24 19:59:26.417841 85.xxx.xx.164:51189 -> 85.xxx.xx.163:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417845 85.xxx.xx.164:41680 -> 85.xxx.xx.164:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417861 85.xxx.xx.164:50732 -> 85.xxx.xx.191:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417872 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=status
>>> st=active
>>> dt=1388
>>> protocol=1
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=13
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 5/160 487
>>> ttl=4
>>> auth=1 e7dea25
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417885 10.255.0.194:46457 -> 10.255.0.255:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417969 10.255.0.194:51610 -> 10.255.0.193:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>> #
>>> U 2009/09/24 19:59:26.417991 10.255.0.194:40530 -> 10.255.0.194:694
>>>
>>> t=stonith
>>> node=fs2.my.domain.de
>>> result=n_stnth
>>> src=fs3.my.domain.de
>>> (1)srcuuid=HN8oXpaORL2b+NJyrocEwA==
>>> seq=f
>>> hg=4aa92fbc
>>> ts=4abbb37e
>>> ld=0.00 0.00 0.00 1/159 486
>>> ttl=4
>>> auth=1 a6fc320a
>>> <<<
>>> .
>>>
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
