Hi,

In a two-node cluster, one of the nodes (not always the same one) reboots
a few times per day because it is fenced by the other node.  The log on
the fencing node starts with:

Nov 10 22:30:14 node2 openais[3275]: [TOTEM] The token was lost in the OPERATIONAL state.
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Nov 10 22:30:14 node2 openais[3275]: [TOTEM] entering GATHER state from 2.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering GATHER state from 0.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Creating commit token because I am the rep.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Saving state aru 32fc3 high seq received 32fc3
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering COMMIT state.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering RECOVERY state.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] position [0] member <ip-addr-of-node-2>:
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] previous ring seq 56 rep <ip-addr-of-node-1>
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] aru 32fc3 high delivered 32fc3 received flag 0
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Did not need to originate any messages in recovery.
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Storing new sequence id for ring 3c
Nov 10 22:30:19 node2 openais[3275]: [TOTEM] Sending initial ORF token
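
To compare these events across both nodes, the [TOTEM] entries could be
pulled out of the syslog with a small script.  This is only a minimal
sketch: it assumes each entry sits on a single line in the actual log
file and has the format shown in this excerpt.  The function name
totem_events and the sample list are made up for illustration.

```python
import re

# Matches syslog lines of the form seen in the excerpt, e.g.
# "Nov 10 22:30:14 node2 openais[3275]: [TOTEM] entering GATHER state from 2."
TOTEM_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"openais\[\d+\]:\s+\[TOTEM\]\s+(?P<msg>.*)$"
)


def totem_events(lines):
    """Return a list of (timestamp, host, message) for each [TOTEM] line."""
    events = []
    for line in lines:
        m = TOTEM_RE.match(line.strip())
        if m:
            events.append((m.group("stamp"), m.group("host"), m.group("msg")))
    return events


# Sample input: three lines copied from the excerpt, plus one placeholder
# line (not a real log entry) to show that other lines are skipped.
sample = [
    "Nov 10 22:30:14 node2 openais[3275]: [TOTEM] The token was lost in the OPERATIONAL state.",
    "Nov 10 22:30:14 node2 openais[3275]: [TOTEM] entering GATHER state from 2.",
    "Nov 10 22:30:19 node2 openais[3275]: [TOTEM] entering COMMIT state.",
    "this line is not a TOTEM entry",
]

events = totem_events(sample)
for stamp, host, msg in events:
    print(stamp, host, msg)
```

In practice the lines would come from the syslog file on each node
instead of the sample list.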

On the fenced node, in most cases nothing is logged before the reboot.
A few times, a "fatal: filesystem consistency error" was reported on
the fenced node just before the reboot.

Should I assume that when nothing is logged, the fencing is also caused
by a filesystem error, and that the message simply was not written to
disk before the node was fenced?

Thanks,

--
--    Jos Vos <[EMAIL PROTECTED]>
--    X/OS Experts in Open Systems BV   |   Phone: +31 20 6938364
--    Amsterdam, The Netherlands        |     Fax: +31 20 6948204

--
Linux-cluster mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-cluster