[Linux-HA] various errors on ec2

Ryan Ernst Thu, 15 May 2008 12:11:21 -0700

Hi,

I'm still trying to get heartbeat working on EC2.  I am using ucast, as
previously advised on this list.


I've set a logfile on each box so I can see what they are doing. I'm
currently testing with 3 nodes.  My ha.cf looks like this:

logfacility local0
# ucast members - everything but this server
ucast eth0 ip-10-251-43-97
ucast eth0 ip-10-251-27-191

# nodes, including this server
node ip-10-251-43-210
node ip-10-251-43-97
node ip-10-251-27-191

auto_failback off
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
crm on
logfile /var/log/ha.log


This is on the machine node ip-10-251-43-210.
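
Since each machine needs its own copy of this file (ucast lines for every
peer except itself, node lines for everyone), here is a minimal Python sketch
of generating those directives from one shared node list. The helper name
peer_lines and the default interface eth0 are assumptions for illustration,
not part of heartbeat; the hostnames are the three from the config above.

```python
# Hypothetical helper, for illustration only; not part of heartbeat.
NODES = ["ip-10-251-43-210", "ip-10-251-43-97", "ip-10-251-27-191"]


def peer_lines(this_node, nodes, iface="eth0"):
    """Return the ucast and node directives for one machine's ha.cf."""
    if this_node not in nodes:
        raise ValueError(f"{this_node} is not in the node list")
    # ucast: every cluster member except the local machine
    ucast = [f"ucast {iface} {n}" for n in nodes if n != this_node]
    # node: every cluster member, including the local machine
    node = [f"node {n}" for n in nodes]
    return ucast + node


if __name__ == "__main__":
    print("\n".join(peer_lines("ip-10-251-43-210", NODES)))
```

Run for ip-10-251-43-210, this prints the same two ucast lines and three node
lines shown in the ha.cf above; the other directives would still be added by
hand.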
The log file shows a couple of things that I am baffled by.

First, it seems there are a number of warnings for the uuid of a node
changing. Example:

heartbeat[20366]: 2008/05/15_10:41:55 WARN: nodename ip-10-251-43-210 uuid changed to ip-10-251-27-191

After a few of these, I get errors that look like this:

heartbeat[20366]: 2008/05/15_10:41:55 ERROR: send_rexmit_request: entry not found in rexmit_hash_tablefor seq/node(40536 ip-10-251-43-210)

And after those errors have repeated many times, I get the following,
intermixed with more of the warnings above:

heartbeat[20366]: 2008/05/15_10:41:56 ERROR: should_drop_message: attempted replay attack [ip-10-251-43-210]? [gen = 1210812007, curgen = 1210812019]
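
To see how the three kinds of message are distributed in /var/log/ha.log, a
rough Python sketch like the one below can tally them and pull out the gen and
curgen values. The category names and the regular expressions are assumptions
based only on the three sample lines quoted above (joined onto one line each),
not on any documented heartbeat log format.

```python
import re
from collections import Counter

# Sample lines copied from the excerpts above, each joined onto one line.
SAMPLE = [
    "heartbeat[20366]: 2008/05/15_10:41:55 WARN: nodename ip-10-251-43-210 "
    "uuid changed to ip-10-251-27-191",
    "heartbeat[20366]: 2008/05/15_10:41:55 ERROR: send_rexmit_request: entry not "
    "found in rexmit_hash_tablefor seq/node(40536 ip-10-251-43-210)",
    "heartbeat[20366]: 2008/05/15_10:41:56 ERROR: should_drop_message: attempted "
    "replay attack [ip-10-251-43-210]? [gen = 1210812007, curgen = 1210812019]",
]

# Assumed line shape: "heartbeat[pid]: timestamp LEVEL: message"
LINE_RE = re.compile(r"^heartbeat\[\d+\]: (\S+) (WARN|ERROR): (.*)$")
GEN_RE = re.compile(r"\[gen = (\d+), curgen = (\d+)\]")


def classify(message):
    """Map a message to one of the (made-up) category labels."""
    if "uuid changed to" in message:
        return "uuid_changed"
    if message.startswith("send_rexmit_request:"):
        return "rexmit_miss"
    if message.startswith("should_drop_message:"):
        return "replay_suspected"
    return "other"


def summarize(lines):
    """Count messages per category and collect curgen - gen differences."""
    counts = Counter()
    gen_gaps = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not look like heartbeat entries
        message = m.group(3)
        counts[classify(message)] += 1
        g = GEN_RE.search(message)
        if g:
            gen_gaps.append(int(g.group(2)) - int(g.group(1)))
    return counts, gen_gaps


if __name__ == "__main__":
    counts, gaps = summarize(SAMPLE)
    print(dict(counts), gaps)
```

To run it against the real log, pass the lines of the file instead of SAMPLE,
for example summarize(open("/var/log/ha.log")).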


If it helps, here is my haresources file:

ip-10-251-43-210 \
    ldirectord \
    LVSSyncDaemonSwap::master


Can anyone help me?

Thanks
Ryan
