Thanks Andrew! Yes, I see that, but the two servers are configured identically; the only differences are the IP address and hostname. So why am I not getting that error on node1? Which address is the error referring to? The main shared IP? I don't understand how to troubleshoot from that error.
Cameron

On Wed, Mar 17, 2010 at 1:47 PM, Andrew Beekhof <[email protected]> wrote:
> I wonder if this might be related:
>
> Mar 17 21:46:50 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
>
> On Wed, Mar 17, 2010 at 9:44 PM, Cameron Smith <[email protected]> wrote:
> > Here is more info:
> >
> > In checking /var/log/messages:
> >
> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: Version 2 support: false
> > Mar 17 21:46:50 node2 heartbeat: [5288]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: **************************
> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: Configuration validated. Starting heartbeat 2.1.3
> > Mar 17 21:46:50 node2 heartbeat: [5289]: info: heartbeat: version 2.1.3
> > Mar 17 21:46:50 node2 heartbeat: [5289]: info: Heartbeat generation: 1268877071
> > Mar 17 21:46:50 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:51 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:51 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:52 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:53 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:53 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:54 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:55 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:55 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:56 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:57 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:57 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:57 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:58 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:46:59 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:46:59 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> > Mar 17 21:47:00 node2 heartbeat: [5289]: ERROR: glib: Unable to bind socket (Address already in use). Giving up.
> > Mar 17 21:47:00 node2 heartbeat: [5289]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
> > Mar 17 21:47:00 node2 heartbeat: [5289]: ERROR: make_io_childpair: cannot open bcast eth0
> > Mar 17 21:47:01 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Emergency Shutdown: Master Control process died.
> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Killing pid 5289 with SIGTERM
> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
> > Mar 17 21:47:03 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >
> > Why is it doing that???
> >
> > Thanks!
> > Cameron
> >
> > On Wed, Mar 17, 2010 at 1:03 PM, Cameron Smith <[email protected]> wrote:
> >
> >> Brand new heartbeat user running my first test and have run into a problem.
> >>
> >> I installed heartbeat on node1 and node2
> >>
> >> it works fine on node1 and shows a webpage to the eth0:0 IP but heartbeat won't stay running on node2.
> >>
> >> When I start heartbeat I get:
> >> # service heartbeat start
> >> logd is already running
> >> Starting High-Availability services:
> >> 2010/03/17_21:04:44 INFO: Resource is stopped
> >> [ OK ]
> >>
> >> and then on a status I get:
> >> heartbeat OK [pid 5093 et al] is running on node2.example.net [node2.example.net]
> >>
> >> but then after a short amount of time the status check will return:
> >> # service heartbeat status
> >> heartbeat is stopped. No process
> >>
> >> Why is this happening?
> >>
> >> I followed these instructions:
> >> http://www.howtoforge.com/high_availability_heartbeat_centos
> >>
> >> I am running heartbeat 2.1.3
> >>
> >> Thanks!
> >> Cameron

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
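
A possible way to narrow down the "Address already in use" error in the quoted log, offered only as a minimal sketch. "Address" in that message (EADDRINUSE) means a socket address, i.e. IP plus port, so it is not necessarily about the shared IP; the log itself names UDP port 694 on eth0. The helper name udp_port_in_use below is hypothetical and not part of heartbeat. It simply tries to bind a UDP socket to the port and reports whether the kernel refuses. Assumptions: it is run as root (ports below 1024 normally need that) and while heartbeat is stopped, since otherwise heartbeat itself would hold the port. Heartbeat's own bind may use different socket options, so the result is a hint rather than proof.

```python
import errno
import socket


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if binding a UDP socket to (host, port) fails with EADDRINUSE.

    Hypothetical helper for illustration; not part of heartbeat.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            # Some other socket already owns this address/port pair.
            return True
        # Anything else (e.g. EACCES when not root) is a different problem.
        raise
    finally:
        sock.close()
    return False
```

If udp_port_in_use(694) returns True on node2 but False on node1, that would suggest another process on node2 already holds the port, and something like `netstat -anup` run as root should show which one.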
