Hi all,

I have a strange problem: after the network recovers, the offline node
cannot rejoin the cluster.
Here are the details.
There are 3 nodes in the cluster: node-a, node-b, node-c.
The ha.cf file is configured like this:
keepalive 125ms
deadtime 500ms
warntime 250ms
bcast enet0
auto_failback off
crm on
autojoin any
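
As a quick sanity check on the timings above, here is a minimal Python sketch (not part of heartbeat) that works out how many keepalive intervals fit into warntime and deadtime. The helper names `to_ms` and `parse_timings` are made up for illustration, and it assumes a bare number in ha.cf means seconds while an `ms` suffix means milliseconds.

```python
import re

# Timing directives copied from the ha.cf above.
HA_CF = """\
keepalive 125ms
deadtime 500ms
warntime 250ms
"""

def to_ms(value):
    """Convert a time value to milliseconds.

    Assumption: a bare number or an "s" suffix means seconds,
    and an "ms" suffix means milliseconds.
    """
    m = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s)?", value)
    if m is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    number, unit = float(m.group(1)), m.group(2)
    return number if unit == "ms" else number * 1000.0

def parse_timings(text):
    """Pick out the three timing directives and return them in milliseconds."""
    timings = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("keepalive", "deadtime", "warntime"):
            timings[parts[0]] = to_ms(parts[1])
    return timings

timings = parse_timings(HA_CF)
missed_before_warn = timings["warntime"] / timings["keepalive"]
missed_before_dead = timings["deadtime"] / timings["keepalive"]

print("keepalives per warntime:", missed_before_warn)  # 2.0
print("keepalives per deadtime:", missed_before_dead)  # 4.0
```

With these values a node is declared dead after 4 missed keepalives, i.e. half a second. I do not know whether such short timings are related to the problem.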


When I used ifconfig to bring enet0 down on node-b, node-a and node-c
reported node-b as dead, and node-b reported node-a and node-c as dead. I
think this is correct, because the network on node-b was down.

The problem is that when I used ifconfig to bring enet0 back up on node-b,
it did not rejoin the cluster, and node-a and node-c still show node-b as dead.
When I checked the log on node-b, I found this:
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (151 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (152 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (153 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (154 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (155 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (156 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (157 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (158 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:02 ERROR: Message hist queue is filling up (159 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:02 ERROR: Message hist queue is filling up (160 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:02 ERROR: Message hist queue is filling up (161 messages in queue)
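
To make the queue growth in this log easier to see, here is a rough Python sketch that extracts the queue depth from each "Message hist queue is filling up" entry and reports the highest depth seen per second. It is only a log-reading helper, not a fix; the function names `queue_samples` and `max_per_second` are hypothetical, and the sample is a subset of the lines above.

```python
import re
from datetime import datetime

# A few entries copied from the node-b log above (wrapped halves rejoined).
LOG = """\
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (151 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:01 ERROR: Message hist queue is filling up (158 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:02 ERROR: Message hist queue is filling up (159 messages in queue)
heartbeat[23483]: 2007/11/21_08:32:02 ERROR: Message hist queue is filling up (161 messages in queue)
"""

# \s+ before "up" also matches entries that the mail client wrapped onto two lines.
PATTERN = re.compile(
    r"(\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2}) ERROR: "
    r"Message hist queue is filling\s+up \((\d+) messages in queue\)"
)

def queue_samples(text):
    """Return (timestamp, queue depth) pairs for every matching log entry."""
    return [
        (datetime.strptime(ts, "%Y/%m/%d_%H:%M:%S"), int(depth))
        for ts, depth in PATTERN.findall(text)
    ]

def max_per_second(samples):
    """Return the largest queue depth logged in each one-second timestamp."""
    latest = {}
    for ts, depth in samples:
        latest[ts] = max(depth, latest.get(ts, 0))
    return sorted(latest.items())

samples = queue_samples(LOG)
for ts, depth in max_per_second(samples):
    print(ts.time(), depth)
# 08:32:01 158
# 08:32:02 161
```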


Every log entry is "Message hist queue is filling up". When I try to stop
heartbeat, it always hangs and does not shut down.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
