Thanks Andrew!

Yes, I see that, but the two servers are configured identically; the only
differences are the IP address and hostname. So why am I not getting that
error on node1? Which address is the error referring to? The main shared IP?
I don't understand how to troubleshoot from that error.
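
For reference, "Address already in use" is the error the OS returns when a
process tries to bind a socket to an address/port pair that another socket
already holds; the log reports it for the UDP broadcast socket on port 694 of
eth0, not for the shared eth0:0 IP. Below is a minimal sketch of a probe for
that condition. The helper name udp_port_in_use is hypothetical, and it
assumes it is run as root on node2 while heartbeat is stopped, since ports
below 1024 are privileged.

```python
import errno
import socket


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if binding a UDP socket to (host, port) fails with EADDRINUSE.

    Hypothetical helper for illustration only; not part of heartbeat.
    """
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        # Any other failure (e.g. EACCES when not root) is passed through.
        raise
    finally:
        probe.close()
    return False


if __name__ == "__main__":
    # 694 is the port named in the heartbeat log lines.
    print("UDP port 694 in use:", udp_port_in_use(694))
```

The probe only says whether the port is taken, not which process holds it; a
tool such as netstat or lsof would be needed to find the owner.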

Cameron


On Wed, Mar 17, 2010 at 1:47 PM, Andrew Beekhof <[email protected]> wrote:

> I wonder if this might be related:
>
> Mar 17 21:46:50 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
>
>
> On Wed, Mar 17, 2010 at 9:44 PM, Cameron Smith <[email protected]> wrote:
> > Here is more info:
> >
> > In checking /var/log/messages:
> >
> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: Version 2 support: false
> > Mar 17 21:46:50 node2 heartbeat: [5288]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: **************************
> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: Configuration validated. Starting heartbeat 2.1.3
> > Mar 17 21:46:50 node2 heartbeat: [5289]: info: heartbeat: version 2.1.3
> > Mar 17 21:46:50 node2 heartbeat: [5289]: info: Heartbeat generation: 1268877071
> > Mar 17 21:46:50 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:51 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:51 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:52 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:53 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:53 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:54 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:55 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:55 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:56 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:57 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:57 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:57 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:58 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:59 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:59 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:47:00 node2 heartbeat: [5289]: ERROR: glib: Unable to bind socket (Address already in use). Giving up.
> > Mar 17 21:47:00 node2 heartbeat: [5289]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
> > Mar 17 21:47:00 node2 heartbeat: [5289]: ERROR: make_io_childpair: cannot open bcast eth0
> > Mar 17 21:47:01 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Emergency Shutdown: Master Control process died.
> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Killing pid 5289 with SIGTERM
> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
> > Mar 17 21:47:03 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >
> > Why is it doing that???
> >
> > Thanks!
> > Cameron
> >
> > On Wed, Mar 17, 2010 at 1:03 PM, Cameron Smith <[email protected]> wrote:
> >
> >> I'm a brand new heartbeat user running my first test, and I have run
> >> into a problem.
> >>
> >> I installed heartbeat on node1 and node2.
> >>
> >> It works fine on node1, which shows a webpage on the eth0:0 IP, but
> >> heartbeat won't stay running on node2.
> >>
> >> When I start heartbeat I get:
> >> # service heartbeat start
> >> logd is already running
> >> Starting High-Availability services:
> >> 2010/03/17_21:04:44 INFO:  Resource is stopped
> >>                                                            [  OK  ]
> >>
> >>
> >> and then on a status I get:
> >> heartbeat OK [pid 5093 et al] is running on node2.example.net [node2.example.net]
> >>
> >> but then after a short amount of time the status check will return:
> >> # service heartbeat status
> >> heartbeat is stopped. No process
> >>
> >> Why is this happening?
> >>
> >> I followed these instructions:
> >> http://www.howtoforge.com/high_availability_heartbeat_centos
> >>
> >> I am running heartbeat 2.1.3
> >>
> >> Thanks!
> >> Cameron
> >>
> >>
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> >
