Hi,
I'm still trying to get heartbeat working on EC2. I am using ucast, as
previously suggested on this list.
I've set a logfile on each box so I can see what they are doing. I'm
currently testing with 3 nodes. My ha.cf looks like this:
logfacility local0
# ucast members - everything but this server
ucast eth0 ip-10-251-43-97
ucast eth0 ip-10-251-27-191
# nodes, including this server
node ip-10-251-43-210
node ip-10-251-43-97
node ip-10-251-27-191
auto_failback off
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
crm on
logfile /var/log/ha.log
This is the ha.cf on node ip-10-251-43-210.
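Since the ucast lines differ on every box (each node lists everything but itself), here is a rough sketch of how the membership part of ha.cf could be generated per node. This is only an illustration: render_membership and NODES are hypothetical names of my own, not part of heartbeat, and the output format simply mirrors the lines shown above.

```python
# Hypothetical helper (not part of heartbeat): renders the ucast/node
# section of ha.cf for one node, mirroring the layout shown above.
NODES = ["ip-10-251-43-210", "ip-10-251-43-97", "ip-10-251-27-191"]


def render_membership(local_node, nodes, iface="eth0"):
    """Return the ucast and node lines of ha.cf for local_node."""
    if local_node not in nodes:
        raise ValueError(f"{local_node} is not in the node list")
    lines = ["# ucast members - everything but this server"]
    # One ucast line per peer; the local node is deliberately skipped.
    lines += [f"ucast {iface} {n}" for n in nodes if n != local_node]
    lines.append("# nodes, including this server")
    # Every node, including the local one, gets a node line.
    lines += [f"node {n}" for n in nodes]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_membership("ip-10-251-43-210", NODES))
```

Running it with each of the three node names in turn should give the membership section for that box; the remaining directives (logfacility, auto_failback, and so on) are identical everywhere and are left out here.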
The log file shows a couple of things that baffle me.
First, there are a number of warnings about the uuid of a node
changing. Example:
heartbeat[20366]: 2008/05/15_10:41:55 WARN: nodename ip-10-251-43-210 uuid changed to ip-10-251-27-191
After a few of these, I get errors that look like this:
heartbeat[20366]: 2008/05/15_10:41:55 ERROR: send_rexmit_request: entry not found in rexmit_hash_tablefor seq/node(40536 ip-10-251-43-210)
After those errors have repeated many times, I get the following, intermixed
with more of the warnings above:
heartbeat[20366]: 2008/05/15_10:41:56 ERROR: should_drop_message: attempted replay attack [ip-10-251-43-210]? [gen = 1210812007, curgen = 1210812019]
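To give a sense of how often each of these three messages shows up, a small script along the following lines could tally them from /var/log/ha.log. It is only a sketch: it assumes each entry sits on a single line in the real log (the excerpts above are wrapped by the mail client) and that lines follow the "heartbeat[PID]: TIMESTAMP LEVEL: message" shape seen in the excerpts. The names classify and tally are my own.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpts above:
#   heartbeat[PID]: TIMESTAMP LEVEL: message
LINE_RE = re.compile(r"^heartbeat\[(\d+)\]: (\S+) (\w+): (.*)$")


def classify(message):
    """Map a message to one of the three kinds quoted above, else 'other'."""
    if "uuid changed" in message:
        return "uuid_changed"
    if message.startswith("send_rexmit_request"):
        return "rexmit_not_found"
    if "replay attack" in message:
        return "replay_attack"
    return "other"


def tally(lines):
    """Count log entries per (level, kind); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group(3), classify(match.group(4)))] += 1
    return counts


# The three excerpts from this message, unwrapped to one line each.
SAMPLE = [
    "heartbeat[20366]: 2008/05/15_10:41:55 WARN: nodename ip-10-251-43-210 "
    "uuid changed to ip-10-251-27-191",
    "heartbeat[20366]: 2008/05/15_10:41:55 ERROR: send_rexmit_request: entry not "
    "found in rexmit_hash_tablefor seq/node(40536 ip-10-251-43-210)",
    "heartbeat[20366]: 2008/05/15_10:41:56 ERROR: should_drop_message: attempted "
    "replay attack [ip-10-251-43-210]? [gen = 1210812007, curgen = 1210812019]",
]

if __name__ == "__main__":
    for (level, kind), n in sorted(tally(SAMPLE).items()):
        print(level, kind, n)
```

To run it against the real log, pass an open file instead of SAMPLE, for example tally(open("/var/log/ha.log")).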
If it helps, here is my haresources file:
ip-10-251-43-210 \
ldirectord \
LVSSyncDaemonSwap::master
Can anyone help me?
Thanks
Ryan
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems