On Fri, Apr 09, 2010 at 09:33:30AM +0200, Jan Friesse wrote:
> Dave,
> 
> >
> >Oh, and I may have just invented a time machine by merging partitioned
> >clusters!
> >
> >1270661597 cluster node 1 added seq 2128
> >1270661597 fenced:daemon conf 3 1 0 memb 1 2 4 join 1 left
> >1270661597 cpg_mcast_joined retried 4 protocol
> >1270661597 fenced:daemon ring 1:2128 3 memb 1 2 4
> >1270661597 fenced:default conf 3 1 0 memb 1 2 4 join 1 left  (*)
> >1270661597 add_change cg 5 joined nodeid 1
> >1270661597 add_change cg 5 counts member 3 joined 1 remove 0 failed 0
> >1270661597 check_ringid cluster 2128 cpg 2:2124
> >1270661597 fenced:default ring 1:2128 3 memb 1 2 4  (**)
> >1270661597 check_ringid done cluster 2128 cpg 1:2128
> >1270661597 check_quorum done
> >
> >* confchg callback adding node 1
> >** totem callback adding node 1
> 
> this is something little different and it is one of your requirements.

Yes, this ordering makes sense and works.  I was just pointing out that
it's not *always* true that a totem callback precedes a confchg callback
when adding a node.  Obviously Chrissie was thinking about a node starting
up and not the case of partition merging.

> ^^^ This is what you are talking about. Confchg precede totem
> callback (as your requirements)

I never had a hard requirement about callback ordering, because I didn't
know exactly what effect it would have.  But my suggestion was that when
an event caused both confchg and totem callbacks to be queued for a cpg,
the confchg_cb be queued first and the totem_cb be queued second.

Now that I've stepped through my test case a couple times with this issue
in mind, I don't think I actually require any specific ordering of
callbacks.  It looks like things will work the same regardless.

> Anyway, can you please send me (exactly) what problem (original
> problem) are you trying to solve?

My test case, which hasn't worked (until now), is the following:

a. members 1,2,3
b. partition 1 / 2,3
c. merge 1,2,3
d. cluster is killed on node 1
e. cluster is started on node 1

In this case nodes 2 and 3 see:

a. cluster = 1,2,3
b. cluster -1 2228
c. cluster +1 2232
d. cluster -1 2236
e. cluster +1 2240

("cluster +/-N M" is cman callback adding/removing nodeid N with ringid M)

Node 2 begins fencing node 1 in step b, but I've configured fencing to
fail indefinitely, so the fencing doesn't complete on 2 until step e when
it sees node 1 restart cleanly (without its state).

So *after* step e, node 2 dispatches the following callbacks back to back:

u. conf +1
v. ring +1 2232
w. conf -1
x. ring -1 2236
y. ring +1 2240
z. conf +1

("conf +/-N" is confchg callback adding/removing nodeid N)
("ring +/-N M" is totem callback adding/removing nodeid N with ringid M)

Two problems I had which the new ring id resolves:

- When I saw w, I didn't know whether it was a new failure that hadn't yet
been reported via a cluster (cman) callback, or an old failure.  In this
case it corresponds to d, which I now know because the ringid in x is 2236
and the current cluster ringid is 2240.  This is important because I need
to know whether the current quorum value from cman is consistent with the
state I've seen from cpg.

- I only want to process the latest confchg, because two matching
confchg's, e.g. u and z, are otherwise impossible to reference uniquely
between nodes.  I refer to both u and z as "confchg adding nodeid 1
resulting in members 1,2,3".  If I process u, other nodes sometimes cannot
tell whether I'm referring to confchg u or z.

Both of these problems resulted in my app (fenced) getting one of those
two things "wrong" and becoming stuck.

Dave

_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais
