Hi Hideo-san,

On Thu, Feb 10, 2011 at 11:06:33AM +0900, [email protected] wrote:
> Hi Russell,
> 
> > https://lists.linux-foundation.org/pipermail/openais/2011-February/015696.html
> 
> Thanks.

Did the workaround help you?

In the cluster I described earlier in this thread, the
workaround led to a different set of problems. Communication
was supposed to keep working (the interconnect over the second
interface was up the whole time), but it didn't, or at least not
completely. For instance, all actions which originated on the
first node but were to be executed on the second node got lost.

May 25 15:37:33 node2 corosync[842]:   [TOTEM ] Marking seqid 95965 ringid 0 interface 10.5.155.31 FAULTY - adminisrtative intervention required.
May 25 15:37:33 node2 corosync[12847]:   [TOTEM ] Marking seqid 95964 ringid 0 interface 10.5.155.32 FAULTY - adminisrtative intervention required.
...
May 25 15:50:38 node1 crmd: [853]: info: te_rsc_command: Initiating action 27: start rsc_md_start_0 on node2
... [node2 never heard that it should start this resource]
May 25 15:51:58 node1 crmd: [853]: ERROR: print_elem: Aborting transition, action lost: [Action 27]: In-flight (id: rsc_md_start_0, loc: node2, priority: 0)

node1 eventually realized that it couldn't send packets, but
only about 12 minutes later:

May 25 16:02:13 node1 corosync[842]:   [pcmk  ] ERROR: send_cluster_msg_raw: Child 25108 spawned to record non-fatal assertion failure line 1536: rc == 0
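
The delay can be checked directly from the timestamps quoted above. This is only a minimal sketch in Python: the helper name elapsed_seconds is made up for illustration, the year is an arbitrary placeholder (syslog lines don't carry one), and it assumes both stamps come from the same node and the same day. I measure from the moment action 27 was initiated, which is one possible reading of "later".

```python
from datetime import datetime

def elapsed_seconds(start: str, end: str, year: int = 2000) -> float:
    """Seconds between two syslog-style stamps such as 'May 25 15:50:38'.

    Syslog omits the year, so an arbitrary placeholder year is prepended
    (an assumption; it only matters across a year or leap-day boundary).
    """
    fmt = "%Y %b %d %H:%M:%S"
    t0 = datetime.strptime(f"{year} {start}", fmt)
    t1 = datetime.strptime(f"{year} {end}", fmt)
    return (t1 - t0).total_seconds()

# Timestamps copied from the log excerpts in this mail.
initiated = "May 25 15:50:38"   # crmd initiates action 27 on node1
send_error = "May 25 16:02:13"  # send_cluster_msg_raw assertion on node1

delay = elapsed_seconds(initiated, send_error)
print(f"{delay:.0f} s = {delay / 60:.1f} min")
```

That comes to 695 seconds, i.e. a bit under 12 minutes between initiating the action and the first sign on node1 that sending had failed.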

Corosync aborted on the healthy node (node2) a bit after the
network link was reestablished on the first node:

May 25 16:10:45 node1 kernel: [617584.397364] qlge 0000:15:00.0: ql_link_up: Link Up.
May 25 16:11:19 node2 corosync[12847]:   [TOTEM ] FAILED TO RECEIVE
May 25 16:11:19 node2 corosync[12847]:   [TOTEM ] FAILED TO RECEIVE
...
May 25 16:11:24 node2 corosync[12847]:   [TOTEM ] FAILED TO RECEIVE

    0x694e60 "corosync: totemsrp.c:1192: memb_consensus_agreed: Assertion `token_memb_entries >= 1' failed.\n"
        errstr = "Unexpected error.\n"

Cheers,

Dejan

> Hideo Yamauchi.
> 
> 
> 
> --- Russell Bryant <[email protected]> wrote:
> 
> > On Wed, Feb 9, 2011 at 7:58 PM,  <[email protected]> wrote:
> > > Hi Steven,
> > >
> > >> Please use the suggested workaround posted earlier today.
> > >
> > > Where are these contents?
> > > Will you please teach a link?
> > 
> > https://lists.linux-foundation.org/pipermail/openais/2011-February/015696.html
> > 
> > --
> > Russell Bryant
> > 
> 
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
