Hi,

On Wed, Mar 17, 2010 at 02:03:49PM -0700, Cameron Smith wrote:
> A-HA!
> I see on node2:
>
> # netstat -lnup
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
> udp 0 0 0.0.0.0:694 0.0.0.0:* 2214/rpc.statd
>
> What is statd and why is it running using the port I need for heartbeat?

statd is an RPC program (rpc.statd, the NFS status monitor). RPC
programs like it do not use a fixed port: they bind to a dynamically
assigned port and register it with the portmapper. Sometimes the port
they get clashes with one that another process needs, which is what
happened here with 694. You can complain to your distributor.

Thanks,

Dejan

> statd on node1 is running on 690 and 693.
> Heartbeat is on 694:
>
> udp 0 0 0.0.0.0:694 0.0.0.0:* 4581/heartbeat: wri
>
> Cameron
>
> On Wed, Mar 17, 2010 at 1:57 PM, Cameron Smith <[email protected]> wrote:
>
> > Thanks Andrew!
> >
> > Yes I see that but the two servers are identically configured with the only
> > difference being IP, hostname so why am I not getting that error on node1?
> > Which address is the error referring to? The main shared IP? I don't
> > understand how to troubleshoot from that error.
> >
> > Cameron
> >
> > On Wed, Mar 17, 2010 at 1:47 PM, Andrew Beekhof <[email protected]> wrote:
> >
> >> I wonder if this might be related:
> >>
> >> Mar 17 21:46:50 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >>
> >> On Wed, Mar 17, 2010 at 9:44 PM, Cameron Smith <[email protected]> wrote:
> >> > Here is more info:
> >> >
> >> > In checking /var/log/messages:
> >> >
> >> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: Version 2 support: false
> >> > Mar 17 21:46:50 node2 heartbeat: [5288]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
> >> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: **************************
> >> > Mar 17 21:46:50 node2 heartbeat: [5288]: info: Configuration validated. Starting heartbeat 2.1.3
> >> > Mar 17 21:46:50 node2 heartbeat: [5289]: info: heartbeat: version 2.1.3
> >> > Mar 17 21:46:50 node2 heartbeat: [5289]: info: Heartbeat generation: 1268877071
> >> > Mar 17 21:46:50 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:51 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:51 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:52 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:53 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:53 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:54 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:55 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:55 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:56 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:57 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:57 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:57 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:58 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:46:59 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:46:59 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying.
> >> > Mar 17 21:47:00 node2 heartbeat: [5289]: ERROR: glib: Unable to bind socket (Address already in use). Giving up.
> >> > Mar 17 21:47:00 node2 heartbeat: [5289]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
> >> > Mar 17 21:47:00 node2 heartbeat: [5289]: ERROR: make_io_childpair: cannot open bcast eth0
> >> > Mar 17 21:47:01 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Emergency Shutdown: Master Control process died.
> >> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Killing pid 5289 with SIGTERM
> >> > Mar 17 21:47:01 node2 heartbeat: [5292]: CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
> >> > Mar 17 21:47:03 node2 rpc.statd[2214]: recv_rply: can't decode RPC message!
> >> >
> >> > Why is it doing that???
> >> >
> >> > Thanks!
> >> > Cameron
> >> >
> >> > On Wed, Mar 17, 2010 at 1:03 PM, Cameron Smith <[email protected]> wrote:
> >> >
> >> >> Brand new heartbeat user running my first test and have run into a problem.
> >> >>
> >> >> I installed heartbeat on node1 and node2
> >> >>
> >> >> it works fine on node1 and shows a webpage to the eth0:0 IP but heartbeat
> >> >> won't stay running on node2.
> >> >>
> >> >> When I start heartbeat I get:
> >> >> # service heartbeat start
> >> >> logd is already running
> >> >> Starting High-Availability services:
> >> >> 2010/03/17_21:04:44 INFO: Resource is stopped
> >> >> [ OK ]
> >> >>
> >> >> and then on a status I get:
> >> >> heartbeat OK [pid 5093 et al] is running on node2.example.net [node2.example.net]
> >> >>
> >> >> but then after a short amount of time the status check will return:
> >> >> # service heartbeat status
> >> >> heartbeat is stopped. No process
> >> >>
> >> >> Why is this happening?
> >> >>
> >> >> I followed these instructions:
> >> >> http://www.howtoforge.com/high_availability_heartbeat_centos
> >> >>
> >> >> I am running heartbeat 2.1.3
> >> >>
> >> >> Thanks!
> >> >> Cameron
> >> >>
> >> > _______________________________________________
> >> > Linux-HA mailing list
> >> > [email protected]
> >> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >> > See also: http://linux-ha.org/ReportingProblems
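
The clash in this thread (rpc.statd already bound to UDP 694 on node2, so
heartbeat's bind fails with "Address already in use") can be checked for
before starting heartbeat. What follows is only a minimal Python sketch, not
part of heartbeat or of any distribution tooling. The function names
udp_port_in_use and find_udp_listener are made up for illustration, and the
parser assumes the column layout of the `netstat -lnup` output quoted above.

```python
import errno
import socket

HEARTBEAT_PORT = 694  # UDP port heartbeat tries to bind in this thread


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if binding a UDP socket to (host, port) fails with EADDRINUSE."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # e.g. EACCES when binding a port below 1024 without root
    finally:
        sock.close()
    return False


def find_udp_listener(netstat_output, port):
    """Return the PID/program column of the first UDP line bound to port, else None.

    Assumes `netstat -lnup` style lines:
    proto recv-q send-q local-address foreign-address pid/program
    """
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] != "udp":
            continue
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port == str(port):
            # The program name may contain spaces ("4581/heartbeat: wri").
            return " ".join(fields[5:])
    return None


if __name__ == "__main__":
    sample = "udp 0 0 0.0.0.0:694 0.0.0.0:* 2214/rpc.statd"
    print(find_udp_listener(sample, HEARTBEAT_PORT))
```

Checking port 694 itself with udp_port_in_use needs root, since it is a
privileged port; without root the bind fails with a permission error, which
the sketch re-raises instead of reporting the port as taken.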
