Re: [corosync] Message mis-delievered to a configuration ?

Steven Dake Thu, 24 Oct 2013 20:57:23 -0700

On 10/24/2013 06:57 PM, sathya bettadapura wrote:

Any state change in the application happens only upon receipt of a message (or a config. change), not merely upon queueing it via cpg_mcast(). Upon receipt of a config. change, internal messages are broadcast via cpg_mcast() as the last step in handling the callback.
Thanks,

    Sathya

Well that blows my theory.  Do you have a test case you can share?

Regards
-steve

On Thursday, October 24, 2013 5:51 PM, Steven Dake <[email protected]> wrote:
On 10/24/2013 03:31 PM, sathya bettadapura wrote:
Hi All,
I think I am noticing what appears to be an anomaly, so I am just posting here for a sanity check.
We have a stress test that does frequent network partitioning/reunification to exercise code related to node fail-back. We're based on version 1.4.6. We were based on 2.x.x until libqb made its way into the core of corosync. As our company policy precludes us from using anything but BSD-style licensed third-party source code, we had to either rewrite libqb or go back.
Let's say we have four nodes A, B, C and D. A and B are on one side of a network segment and C and D on the other. The network can be partitioned by pulling a cable connecting the two segments.
When there is a configuration change, we need to re-compute application state by sending messages to the new members. Such a message identifies the originating node and the size of the cluster at the time. This message is logged in the application log.
When the cluster goes from A,B,C,D to (A,B) and (C,D), on the A-B side we see a message from A that says "From A, cluster size is 2". Immediately thereafter there's another config. change that takes the cluster back to (A,B,C,D). Now we see messages from A, C and D saying that the cluster size is 4. But we see two messages from B: the first one says the cluster size is 2 and the second one says it's 4. It appears that the message from B, sent when the cluster size was 2, could not be delivered because there was a config. change right on its heels, and it is being delivered to a configuration different from the one where it originated. Is this expected behaviour?
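The log described above can be modeled with a small sketch. This is a toy illustration in Python, not code against the corosync API; the names `StateMsg` and `is_stale` are hypothetical, and the size comparison is only one way an application might notice that a message was composed under a different membership than the one it is delivered in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateMsg:
    origin: str        # node that composed the message
    cluster_size: int  # membership size the sender saw when composing it


def is_stale(msg, current_members):
    """True if the message was composed under a membership of a different size."""
    return msg.cluster_size != len(current_members)


# Messages seen on the A-B side after the cluster re-forms as (A, B, C, D).
members = {"A", "B", "C", "D"}
log = [
    StateMsg("A", 4),
    StateMsg("C", 4),
    StateMsg("D", 4),
    StateMsg("B", 2),  # composed while the cluster was (A, B)
    StateMsg("B", 4),
]

stale = [m for m in log if is_stale(m, members)]
print(stale)
```

Comparing sizes alone is a weak check, since two different memberships can have the same size; it is used here only because cluster size is the field the message is said to carry.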
Messages are originated by the totem protocol and ordered according to EVS when they are taken off the new message queue and transmitted into the network. This is different from queueing a message (via cpg), which is not origination. Are you sure you're not confusing origination with cpg_mcast?
Generally the correct way for an application to behave according to EVS is to originate all state change messages via the protocol, and act on them when received. Some devs tend to change state when they call cpg_mcast rather than when a message is delivered. This would result in your example behavior.
Just to clarify, your application only changes state on delivery of a message to the cpg application (not on queueing via cpg_mcast)?
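The distinction between changing state when a message is queued and changing state when it is delivered can be sketched with a toy model. This is plain Python, not the corosync API: `Node`, `mcast`, `on_deliver` and `flush` are hypothetical names, and `flush` merely stands in for the protocol delivering messages to every member in one agreed order.

```python
from collections import deque


class Node:
    """Toy member whose state changes only in the delivery callback."""

    def __init__(self, name):
        self.name = name
        self.outbox = deque()  # messages queued but not yet originated
        self.state = {}        # application state, mutated only on delivery

    def mcast(self, key, value):
        # Queueing only: self.state is deliberately left untouched here.
        self.outbox.append((self.name, key, value))

    def on_deliver(self, sender, key, value):
        # Every member applies the same messages in the same order.
        self.state[key] = (sender, value)


def flush(nodes):
    """Deliver all queued messages to all nodes in a single agreed order."""
    for src in nodes:
        while src.outbox:
            msg = src.outbox.popleft()
            for dst in nodes:
                dst.on_deliver(*msg)


a, b = Node("A"), Node("B")
a.mcast("owner", "A")
b.mcast("owner", "B")
state_before = (dict(a.state), dict(b.state))  # nothing applied yet
flush([a, b])
print(a.state == b.state)
```

Because neither node touches its state in `mcast`, both end up with identical state after delivery, even though each queued a conflicting update. A node that applied its own update at queueing time could diverge from its peers if the message were delayed or reordered relative to a configuration change.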
Regards
-steve
     Sathya


_______________________________________________
discuss mailing list
[email protected]
http://lists.corosync.org/mailman/listinfo/discuss
