On Tue, Jan 12, 2010 at 8:41 AM, hj lee <[email protected]> wrote:
>
>
> On Mon, Jan 11, 2010 at 11:37 AM, Steven Dake <[email protected]> wrote:
>
>> On Fri, 2010-01-08 at 16:54 -0700, hj lee wrote:
>> > Hi,
>> >
>> > My simple two-node cluster is configured with two ring interfaces
>> > (ring 0 and ring 1). If I disconnect one of the cables at one of the
>> > nodes, openais enters GATHER mode and then returns to OPERATIONAL mode.
>> > This takes 200 - 300 msec. While in GATHER mode, openais does not seem
>> > to send or receive any messages from Pacemaker, such as the messages
>> > for updating cib files.
>> >
>> > What is the reason behind this behavior while openais is still
>> > connected through the other ring interface? Is it possible to disable
>> > this behavior?
>> >
>> > In my configuration, eth0 is used both for openais and for pingd
>> > monitoring. So when eth0 is disconnected, pingd fails and openais
>> > enters GATHER mode at the same time. The pingd failure triggers the
>> > resource migration, but this migration is delayed by 200 - 300 msec. I
>> > want to get rid of this delay when pingd fails.
>> >
>>
>> This seems quite odd. Gather should not be entered unless the network
>> interface is actually changed. It may be that you are running a tool
>> such as network manager which is downing the interface.
>>
>> Running network manager with corosync is likely to provide bad results
>> because network manager destroys interfaces within the kernel.
>>
>> Please report back if this is your issue.
>>
>>
> I am not sure what you mean by "network manager"; I do not run anything
> like that. I just unplug one of the two ring NIC cables, and sometimes it
> enters GATHER mode. If I ifdown one of the two ring interfaces, it also
> sometimes enters GATHER mode. What I want is for openais to enter GATHER
> mode only when both ring interfaces are disconnected.
>
>
> 2010-01-11 16:18:09.954914 silverthorne2-openais[3789]: [totemrrp.c:0803] Marking seqid 1710799 ringid 1 interface 111.16.127.30 FAULTY - adminisrtative intervention required.
> 2010-01-11 16:18:10.478828 silverthorne2-pingd: [4000]: info: stand_alone_ping: Node 111.16.127.254 is unreachable (read)
> 2010-01-11 16:18:10.900847 silverthorne2-vmre[18099]: vmre_send_keepalive:
> 2010-01-11 16:18:10.900875 silverthorne2-vmre[18099]: vmre_send:
> 2010-01-11 16:18:10.900882 silverthorne2-vmre[18099]: vmre_send_keepalive: vm_state 4 sent 8 bytes num_ckpts, ckpt_seq since last ka=73 current ckpt_seq=527662
> 2010-01-11 16:18:11.138074 silverthorne2-openais[3789]: [totemsrp.c:3339] FAILED TO RECEIVE
> 2010-01-11 16:18:11.138217 silverthorne2-openais[3789]: [totemsrp.c:1732] entering GATHER state from 6.
> 2010-01-11 16:18:11.138226 silverthorne2-openais[3789]: [totemsrp.c:2788] Creating commit token because I am the rep.
> 2010-01-11 16:18:11.138235 silverthorne2-openais[3789]: [totemsrp.c:1303] Saving state aru 4d high seq received 4d
> 2010-01-11 16:18:11.138267 silverthorne2-openais[3789]: [totemsrp.c:2949] Storing new sequence id for ring 134e0
> 2010-01-11 16:18:11.138276 silverthorne2-openais[3789]: [totemsrp.c:1771] entering COMMIT state.
> 2010-01-11 16:18:11.138744 silverthorne2-openais[3789]: [totemsrp.c:1803] entering RECOVERY state.
> 2010-01-11 16:18:11.138874 silverthorne2-openais[3789]: [totemsrp.c:1832] position [0] member 192.168.10.21:
> 2010-01-11 16:18:11.139056 silverthorne2-openais[3789]: [totemsrp.c:1836] previous ring seq 79068 rep 192.168.10.21
>
> Another log instance, this one from running ifdown eth1. eth1 is one of the ring NICs.
2009-12-21 13:45:15.034066 silverthorne2-kernel: [ 2549.061149] bnx2: eth1 NIC Copper Link is Down
2009-12-21 13:45:15.988851 silverthorne2-openais[3600]: [totemrrp.c:0803] Marking seqid 52259 ringid 1 interface 111.2.184.25 FAULTY - adminisrtative intervention required.
2009-12-21 13:45:16.368615 silverthorne2-openais[3600]: [totemsrp.c:1425] The token was lost in the OPERATIONAL state.
2009-12-21 13:45:16.368642 silverthorne2-openais[3600]: [totemnet.c:0995] Receive multicast socket recv buffer size (262142 bytes).
2009-12-21 13:45:16.368653 silverthorne2-openais[3600]: [totemnet.c:1001] Transmit multicast socket send buffer size (262142 bytes).
2009-12-21 13:45:16.368662 silverthorne2-openais[3600]: [totemsrp.c:1732] entering GATHER state from 2.
2009-12-21 13:45:16.613470 silverthorne2-openais[3600]: [totemsrp.c:1732] entering GATHER state from 0.
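
To put numbers on the delay, the timestamps in the ifdown eth1 log can be diffed with a small script. This is only a rough sketch: the regular expression assumes the "timestamp host-openais[pid]: [file.c:line] message" layout seen in the entries above, and the helper names (parse_entries, gaps_ms) are made up for illustration; they are not part of openais.

```python
import re
from datetime import datetime

# Sample entries copied from the ifdown eth1 log in this thread.
LOG = """\
2009-12-21 13:45:15.988851 silverthorne2-openais[3600]: [totemrrp.c:0803] Marking seqid 52259 ringid 1 interface 111.2.184.25 FAULTY - adminisrtative intervention required.
2009-12-21 13:45:16.368615 silverthorne2-openais[3600]: [totemsrp.c:1425] The token was lost in the OPERATIONAL state.
2009-12-21 13:45:16.368662 silverthorne2-openais[3600]: [totemsrp.c:1732] entering GATHER state from 2.
2009-12-21 13:45:16.613470 silverthorne2-openais[3600]: [totemsrp.c:1732] entering GATHER state from 0.
"""

# Assumed layout: "<date> <time.usec> <host>-openais[<pid>]: [<file>:<line>] <message>"
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"\S+openais\[\d+\]: \[(?P<src>[\w.]+):\d+\] (?P<msg>.*)$"
)


def parse_entries(text):
    """Return (timestamp, message) pairs for the openais lines in text."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            entries.append((ts, match.group("msg")))
    return entries


def gaps_ms(entries):
    """Return (elapsed_ms_since_previous_entry, message) for each entry after the first."""
    return [
        ((cur[0] - prev[0]).total_seconds() * 1000.0, cur[1])
        for prev, cur in zip(entries, entries[1:])
    ]


if __name__ == "__main__":
    for ms, msg in gaps_ms(parse_entries(LOG)):
        print(f"+{ms:9.3f} ms  {msg}")
```

For the four entries above, the gap between the two "entering GATHER state" lines works out to roughly 245 msec, which is in line with the 200 - 300 msec delay described earlier in the thread.
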
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
