Hi everybody,

I just ran into some strange behavior: after manually rebooting our
server, heartbeat did not come back into service. The message log
shows "Retrying: Address already in use", but netstat shows nothing on
port 694. The nodes were able to see each other, and on both nodes the
services were connecting over the same link (br0).

A heartbeat stop/start did not help and resulted in the same log messages.
After a second reboot the problem was gone.

heartbeat V2.99.2
openSUSE 11.1

Has anybody seen this before, or does anyone know the cause?

best regards

jeroen

 ====== log =========
ClusterNode1:/ # tail /var/log/messages
Jun 10 12:00:08 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:09 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:10 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:11 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:12 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:13 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: unable to bind socket. Giving up: Address already in use
Jun 10 12:00:13 ClusterNode1 heartbeat: [5315]: ERROR: make_io_childpair: cannot open ucast br0
Jun 10 12:00:14 ClusterNode1 heartbeat: [5317]: CRIT: Emergency Shutdown: Master Control process died.
Jun 10 12:00:14 ClusterNode1 heartbeat: [5317]: CRIT: Killing pid 5315 with SIGTERM
Jun 10 12:00:14 ClusterNode1 heartbeat: [5317]: CRIT: Emergency Shutdown(MCP dead): Killing ourselves.


========= netstat -ntlp ============

ClusterNode1:/ # netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:5801            0.0.0.0:*               LISTEN      4039/xinetd
tcp        0      0 0.0.0.0:5901            0.0.0.0:*               LISTEN      4039/xinetd
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      3063/rpcbind
tcp        0      0 0.0.0.0:6004            0.0.0.0:*               LISTEN      4823/Xvnc
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      3907/sshd
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      3841/cupsd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      3868/master
tcp        0      0 :::111                  :::*                    LISTEN      3063/rpcbind
tcp        0      0 :::6004                 :::*                    LISTEN      4823/Xvnc
tcp        0      0 :::22                   :::*                    LISTEN      3907/sshd
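
A caveat about the listing above: the -t flag limits netstat to TCP
sockets, while heartbeat's ucast medium talks over UDP (port 694 unless
udpport is set in ha.cf). A process holding UDP port 694 would therefore
not appear in it. Below is a minimal sketch of a UDP check. The function
name udp_listeners_on_port is made up for illustration, and the sketch
assumes the usual net-tools column layout, with the local address in the
fourth field.

```shell
# Hypothetical helper: filter netstat-style output read from stdin and
# print only UDP rows whose local address ends in ":<port>".
udp_listeners_on_port() {
    port="$1"
    awk -v port="$port" '$1 ~ /^udp/ && $4 ~ (":" port "$") { print }'
}

# Usage (run as root so the PID/Program name column is filled in):
#   netstat -nulp | udp_listeners_on_port 694
```

If this prints a row while heartbeat is stopped, the last column should
name the process that is still bound to the port.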


======= ha.cf ==========

use_logd yes
ucast br0 192.168.1.1
ucast br0 192.168.1.2
ucast br1 172.27.74.136
ucast br1 172.27.74.137
#serial /dev/ttyS0
node ClusterNode1
node ClusterNode2
respawn root /usr/lib64/heartbeat/hbagent
apiauth mgmtd uid=root
respawn root /usr/lib64/heartbeat/mgmtd -v
crm on

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
