No, I did not try lsof or fuser (next time I will). But shouldn't 
netstat show the process as well? Furthermore, it would be strange for 
another process to hold this port only "randomly": after a reboot it 
should occupy the same port again, shouldn't it?
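
One thing to keep in mind for the next occurrence: heartbeat's ucast traffic normally goes over UDP port 694, while "netstat -ntlp" lists only TCP listeners, so a UDP holder would not appear in that output ("netstat -nulp", "lsof -i UDP:694" or "fuser -n udp 694" cover UDP). Below is a minimal sketch of a direct check that simply tries to bind the port. The function name udp_port_in_use is made up for illustration and is not part of heartbeat; it assumes Python 3 and, for ports below 1024, root privileges.

```python
import errno
import socket


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if binding a UDP socket to (host, port) fails
    with EADDRINUSE, i.e. some other socket already holds it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        # Any other failure (e.g. EACCES when not root) is not
        # evidence that the port is taken, so pass it on.
        raise
    finally:
        sock.close()
    return False
```

Run as root on the affected node, udp_port_in_use(694) returning True would confirm that something still holds the port; it does not say which process, so lsof or fuser is still needed for that.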


regards

jeroen

Dejan Muhamedagic wrote:
> Hi,
>
> On Wed, Jun 10, 2009 at 12:20:14PM +0200, jeroen groenewegen van der weyden 
> wrote:
>   
>> Hi everybody,
>>
>> I just experienced some strange behavior: after manually rebooting our 
>> server, heartbeat did not come into service. The message log 
>> shows "Retrying: Address already in use", but in netstat nothing shows up on port 
>>     
>
> Did you try lsof or fuser?
>
>   
>> 694? The nodes were able to see each other. On both nodes services were 
>> connecting using the same link (br0).
>>
>> A heartbeat stop/start did not help and resulted in the same log messages.
>> After a second reboot the phenomenon was gone.
>>
>> heartbeat V2.99.2
>> openSUSE 11.1
>>
>> Has anybody seen this before, or does anyone know the cause?
>>     
>
> No. The only explanation I can imagine is that another process is
> using this port.
>
> Thanks,
>
> Dejan
>
>   
>> best regards
>>
>> jeroen
>>
>>  ====== log =========
>> ClusterNode1:/ # tail /var/log/messages
>> Jun 10 12:00:08 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
>> Jun 10 12:00:09 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
>> Jun 10 12:00:10 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
>> Jun 10 12:00:11 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
>> Jun 10 12:00:12 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: error binding socket. Retrying: Address already in use
>> Jun 10 12:00:13 ClusterNode1 heartbeat: [5315]: ERROR: glib: ucast: unable to bind socket. Giving up: Address already in use
>> Jun 10 12:00:13 ClusterNode1 heartbeat: [5315]: ERROR: make_io_childpair: cannot open ucast br0
>> Jun 10 12:00:14 ClusterNode1 heartbeat: [5317]: CRIT: Emergency Shutdown: Master Control process died.
>> Jun 10 12:00:14 ClusterNode1 heartbeat: [5317]: CRIT: Killing pid 5315 with SIGTERM
>> Jun 10 12:00:14 ClusterNode1 heartbeat: [5317]: CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
>>
>>
>> ========= netstat -ntlp ============
>>
>> ClusterNode1:/ # netstat -ntlp
>> Active Internet connections (only servers)
>> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
>> tcp        0      0 0.0.0.0:5801            0.0.0.0:*               LISTEN      4039/xinetd
>> tcp        0      0 0.0.0.0:5901            0.0.0.0:*               LISTEN      4039/xinetd
>> tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      3063/rpcbind
>> tcp        0      0 0.0.0.0:6004            0.0.0.0:*               LISTEN      4823/Xvnc
>> tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      3907/sshd
>> tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      3841/cupsd
>> tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      3868/master
>> tcp        0      0 :::111                  :::*                    LISTEN      3063/rpcbind
>> tcp        0      0 :::6004                 :::*                    LISTEN      4823/Xvnc
>> tcp        0      0 :::22                   :::*                    LISTEN      3907/sshd
>>
>>
>> ======= ha.cf ==========
>>
>> use_logd yes
>> ucast br0 192.168.1.1
>> ucast br0 192.168.1.2
>> ucast br1 172.27.74.136
>> ucast br1 172.27.74.137
>> #serial /dev/ttyS0
>> node ClusterNode1
>> node ClusterNode2
>> respawn root /usr/lib64/heartbeat/hbagent
>> apiauth mgmtd uid=root
>> respawn root /usr/lib64/heartbeat/mgmtd -v
>> crm on
>>
>> _______________________________________________
>> Linux-HA mailing list
>> [email protected]
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>>     
