Hi everybody,

I just ran into some strange behavior: after manually rebooting our server, heartbeat did not come back into service. The message log shows "error binding socket. Retrying: Address already in use", but netstat shows nothing on port 694. The nodes were able to see each other. On both nodes, services were connecting over the same link (br0).
A heartbeat stop/start did not help and resulted in the same log messages. After a second reboot the problem was gone.

heartbeat V2.99.2
openSUSE 11.1

Has anybody seen this before, or does anyone know the cause?

Best regards,
Jeroen

====== log =========
ClusterNode1:/ # tail /var/log/messages
Jun 10 12:00:08 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:09 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:10 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:11 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:12 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
Jun 10 12:00:13 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: unable to bind socket. Giving up: Address already in use
Jun 10 12:00:13 ClusterNode1 heartbeat: [5315]: ERROR: make_io_childpair: cannot open ucast br0
Jun 10 12:00:14 ClusterNode1 heartbeat: [5317]: CRIT: Emergency Shutdown: Master Control process died.
Jun 10 12:00:14 ClusterNode1 heartbeat: [5317]: CRIT: Killing pid 5315 with SIGTERM
Jun 10 12:00:14 ClusterNode1 heartbeat: [5317]: CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
========= netstat -ntlp ============
ClusterNode1:/ # netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:5801            0.0.0.0:*               LISTEN      4039/xinetd
tcp        0      0 0.0.0.0:5901            0.0.0.0:*               LISTEN      4039/xinetd
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      3063/rpcbind
tcp        0      0 0.0.0.0:6004            0.0.0.0:*               LISTEN      4823/Xvnc
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      3907/sshd
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      3841/cupsd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      3868/master
tcp        0      0 :::111                  :::*                    LISTEN      3063/rpcbind
tcp        0      0 :::6004                 :::*                    LISTEN      4823/Xvnc
tcp        0      0 :::22                   :::*                    LISTEN      3907/sshd

======= ha.cf ==========
use_logd yes
ucast br0 192.168.1.1
ucast br0 192.168.1.2
ucast br1 172.27.74.136
ucast br1 172.27.74.137
#serial /dev/ttyS0
node ClusterNode1
node ClusterNode2
respawn root /usr/lib64/heartbeat/hbagent
apiauth mgmtd uid=root
respawn root /usr/lib64/heartbeat/mgmtd -v
crm on
