Hi,

On Wed, Mar 17, 2010 at 02:03:49PM -0700, Cameron Smith wrote:
> A-HA!
> I see on node2:
> 
> # netstat -lnup
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
> udp        0      0 0.0.0.0:694                 0.0.0.0:*                               2214/rpc.statd
> 
> What is statd and why is it running using the port I need for heartbeat?

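statd is an RPC program (the NFS status monitor). RPC services like it
don't listen on fixed ports; they bind to whatever free port they get
and register it with the portmapper. Obviously, sometimes that port may
clash with another process, as it does here with heartbeat's port 694.
You can complain to your distributor.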
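
To confirm that kind of clash before starting heartbeat, here is a minimal
sketch, assuming Python 3 is available on the node. The helper name
udp_port_in_use is made up for illustration and is not part of heartbeat or
any RPC tooling. It simply tries to bind a UDP socket to the port and reports
whether the kernel answers "Address already in use", which is the same error
heartbeat logs.

```python
import errno
import socket


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if binding a UDP socket to host:port fails with EADDRINUSE.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        # Any other failure (for example EACCES when a non-root user
        # tries a port below 1024) is not a port clash, so re-raise it.
        raise
    finally:
        sock.close()
    return False
```

Calling udp_port_in_use(694) as root on node2 should return True for as long
as another process holds the port. It does not say which process that is;
netstat -lnup, as used above, is still needed for that.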

Thanks,

Dejan

> statd on node1 is running on 690 and 693.
> Heartbeat is on 694:
> 
> udp        0      0 0.0.0.0:694                 0.0.0.0:*                               4581/heartbeat: wri
> 
> Cameron
> 
> On Wed, Mar 17, 2010 at 1:57 PM, Cameron Smith <[email protected]> wrote:
> 
> > Thanks Andrew!
> >
> > Yes I see that but the two servers are identically configured with the only
> > difference being IP, hostname so why am I not getting that error on node1?
> > Which address is the error referring to? The main shared IP? I don't
> > understand how to troubleshoot from that error.
> >
> > Cameron
> >
> >
> >
> > On Wed, Mar 17, 2010 at 1:47 PM, Andrew Beekhof <[email protected]> wrote:
> >
> >> I wonder if this might be related:
> >>
> >> Mar 17 21:46:50 node2 heartbeat: [5289]: ERROR: glib: Error binding socket
> >> (Address already in use). Retrying.
> >>
> >>
> >> On Wed, Mar 17, 2010 at 9:44 PM, Cameron Smith <[email protected]>
> >> wrote:
> >> > Here is more info:
> >> >
> >> > In checking /var/log/messages:
> >> >
> >> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: Version 2 support: false
> >> > Mar 17 21:46:50 node2 heartbeat: [5288]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
> >> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: **************************
> >> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: Configuration validated. Starting heartbeat 2.1.3
> >> > Mar 17 21:46:50 node2 heartbeat: [5289]: info: heartbeat: version 2.1.3
> >> > Mar 17 21:46:50 node2 heartbeat: [5289]: info: Heartbeat generation: 1268877071
> >> > Mar 17 21:46:50 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:51 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:51 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:52 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:53 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:53 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:54 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:55 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:55 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:56 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:57 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:57 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:57 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:58 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:59 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:59 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:47:00 node2 heartbeat: [5289]: ERROR: glib: Unable to bind socket (Address already in use). Giving up.
> >> > Mar 17 21:47:00 node2 heartbeat: [5289]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
> >> > Mar 17 21:47:00 node2 heartbeat: [5289]: ERROR: make_io_childpair: cannot open bcast eth0
> >> > Mar 17 21:47:01 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Emergency Shutdown: Master Control process died.
> >> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Killing pid 5289 with SIGTERM
> >> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
> >> > Mar 17 21:47:03 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> >
> >> > Why is it doing that???
> >> >
> >> > Thanks!
> >> > Cameron
> >> >
> >> > On Wed, Mar 17, 2010 at 1:03 PM, Cameron Smith <[email protected]> wrote:
> >> >
> >> >> Brand new heartbeat user running my first test and have run into a
> >> >> problem.
> >> >>
> >> >> I installed heartbeat on node1 and node2
> >> >>
> >> >> it works fine on node1 and shows a webpage to the eth0:0 IP but heartbeat
> >> >> won't stay running on node2.
> >> >>
> >> >> When I start heartbeat I get:
> >> >> # service heartbeat start
> >> >> logd is already running
> >> >> Starting High-Availability services:
> >> >> 2010/03/17_21:04:44 INFO:  Resource is stopped
> >> >>                                                            [  OK  ]
> >> >>
> >> >>
> >> >> and then on a status I get:
> >> >> heartbeat OK [pid 5093 et al] is running on node2.example.net [
> >> >> node2.example.net]
> >> >>
> >> >> but then after a short amount of time the status check will return:
> >> >> # service heartbeat status
> >> >> heartbeat is stopped. No process
> >> >>
> >> >> Why is this happening?
> >> >>
> >> >> I followed these instructions:
> >> >> http://www.howtoforge.com/high_availability_heartbeat_centos
> >> >>
> >> >> I am running heartbeat 2.1.3
> >> >>
> >> >> Thanks!
> >> >> Cameron
> >> >>
> >> >>
> >> > _______________________________________________
> >> > Linux-HA mailing list
> >> > [email protected]
> >> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >> > See also: http://linux-ha.org/ReportingProblems
> >> >
> >>
> >
> >