On 08/04/10 15:57, Jan Friesse wrote:
> Included is patch solving 2nd problem.
>
> In first problem, I agree with Chrissie, and really don't have any
> single idea how to make regular confchg precede totem_confchg.

We can't. That is the order in which things happen. Short of
implementing some form of time-machine in corosync it's not going to
change :S

> Christine Caulfield wrote:
>> On 07/04/10 20:32, David Teigland wrote:
>>> On Tue, Apr 06, 2010 at 02:05:00PM +0200, Jan Friesse wrote:
>>>> Same patch but rebased on top of Steve's change (today trunk).
>>>
>>> Thanks, this is mostly working well, but I've found one problem, and one
>>> additional thing I need (mentioned on irc already):
>>>
>>> 1. When a node joins, I get the totem callback before the corresponding
>>> confchg callback. When a node leaves I get them in the expected order:
>>> confchg followed by totem callback.
>>
>>
>> That *is* the expected order, as far as CPG is concerned anyway. The
>> process is not deemed to be a member of the group until all nodes have
>> seen its join message. It also makes more logical sense because the node
>> has to join the cluster before the process joins the group.
>>
>>
>>> 2. When my app starts up it needs to be able to get the current ring id,
>>> so we need to be able to get/force an initial totem callback after a
>>> cpg_join that indicates the current ring id.
>>>
>>>
>>> I've also had a problem getting the current sequence number through
>>> libcman/cman_get_cluster()/ci_generation ---
>>>
>>> On node 2 I see:
>>>
>>> in cman_dispatch statechange callback:
>>> call cman_get_cluster(), get generation 2124
>>> call cman_get_nodes(), see node 1 removed
>>>
>>> in cman_dispatch statechange callback:
>>> call cman_get_cluster(), get generation 2128
>>> call cman_get_nodes(), see node 1 added
>>>
>>> in cman_dispatch statechange callback:
>>> call cman_get_cluster(), get generation 2128 (expect 2132)
>>> call cman_get_nodes(), see node 1 removed
>>>
>>> in cman_dispatch statechange callback:
>>> call cman_get_cluster(), get generation 2136
>>> call cman_get_nodes(), see node 1 added
>>>
>>> The second time node 1 is removed I get the previous generation when
>>> node 1 was added instead of generation 2132 which the callback is for.
>>>
>>> On node 4 I do get generation 2132 in that callback as expected. So it
>>> seems like it could be a race, I've only gone through this test once.
>>>
>>
>> There is almost certainly a race there. The ring IDs need to be
>> delivered at the same time as the change notifications.
>>
>
> Chrissie,
> is that problem in cman or in my patch?

It's because the getting of the ring ID and the conf change messages are
decoupled. For what David needs, all config change messages should
include the ring ID.

Chrissie
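
To make the decoupling concrete, here is a minimal sketch in C. It is
only an illustration: the struct and function names are hypothetical and
are NOT the corosync, CPG or libcman API. It assumes one plausible
cause of the stale value, namely that the separately queried generation
has not yet caught up with the change being dispatched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; NOT the corosync, CPG or cman API. */
struct ring_id {
    uint32_t rep_nodeid;  /* node representing the ring */
    uint64_t seq;         /* ring sequence number ("generation") */
};

struct confchg_event {
    struct ring_id ring;  /* ring this change was delivered in */
    uint32_t nodeid;      /* node that joined or left */
    int joined;           /* 1 = joined, 0 = left */
};

/* "Current ring" as seen by a separate query; it is updated
 * independently of when change events are dispatched (assumption). */
struct ring_id current_ring;

/* Decoupled: the callback asks for the generation separately, so it
 * gets whatever value is published at dispatch time, which may be stale. */
uint64_t generation_decoupled(const struct confchg_event *ev)
{
    assert(ev != NULL);
    (void)ev;  /* the event itself carries no usable ring ID in this model */
    return current_ring.seq;
}

/* Coupled: the ring ID travels inside the change event, so the
 * callback always sees the generation the change belongs to. */
uint64_t generation_coupled(const struct confchg_event *ev)
{
    assert(ev != NULL);
    return ev->ring.seq;
}
```

With current_ring.seq still at 2128 and an event for the removal of
node 1 carrying generation 2132, the decoupled lookup reports 2128 (the
value seen on node 2 above) while the coupled lookup reports 2132.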
