On 10/24/2013 06:57 PM, sathya bettadapura wrote:
Any state change in the application happens only upon receipt of a
message (or a config. change), not merely upon queueing it via
cpg_mcast(). Upon receipt of a config. change, internal messages are
broadcast via cpg_mcast() as the last step in handling the callback.
Thanks,
Sathya
Well that blows my theory. Do you have a test case you can share?
Regards
-steve
On Thursday, October 24, 2013 5:51 PM, Steven Dake <[email protected]>
wrote:
On 10/24/2013 03:31 PM, sathya bettadapura wrote:
Hi All,
I think I am noticing what appears to be an anomaly, so just posting
here for sanity check.
We have a stress test that does frequent network
partitioning/reunification to exercise code related to node fail-back.
We're based on version 1.4.6. We were based on 2.x.x until libqb
made its way into the core of corosync. As our company policy
precludes us from using anything but BSD-style licensed third-party
source code, we had to either rewrite libqb or go back.
Let's say we have four nodes A, B, C and D. A and B are on one side of a
network segment and C and D on the other. The network can be
partitioned by pulling a cable connecting the two segments.
When there is a configuration change, we need to re-compute
application state by sending messages to the new members. Such a
message identifies the originating node and the size of the cluster at
the time. This message is logged in the application log.
When the cluster goes from (A, B, C, D) to (A, B) and (C, D), on the A-B
side we see a message from A that says "From A, cluster size is 2".
Immediately thereafter there's another config. change to take the
cluster back to (A, B, C, D). Now we see messages from A, C and D that
the cluster size is 4. But we see two messages from B: the first one
says the cluster size is 2 and the second one says it's 4. It appears
that the message from B when the cluster size was 2 could not be
delivered as there was a config. change right on its heels, but it's
being delivered to a configuration different from the one where it
originated. Is this expected behaviour?
Messages are originated by the totem protocol and ordered according to
EVS when they are taken off the new message queue and transmitted into
the network. This is different than queueing a message (via cpg), which
is not origination. Are you sure you're not confusing origination with
cpg_mcast?
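To make the queueing/origination distinction concrete, here is a toy
Python model. It is not corosync code: ToyNode, queue_mcast and
originate_pending are hypothetical names, and the model simply assumes
that a payload is fixed when it is queued while the configuration it is
delivered in is only decided when it leaves the queue. Under that
assumption, it shows one way the sequence from the report could arise.

```python
from collections import deque


class ToyNode:
    """Toy stand-in for one node; this is NOT the corosync API."""

    def __init__(self, name):
        self.name = name
        self.members = ()        # current configuration as seen by this node
        self.pending = deque()   # queued, but not yet originated

    def on_config_change(self, members):
        self.members = tuple(members)
        # The payload records the cluster size at queue time.
        self.queue_mcast({"from": self.name, "size": len(self.members)})

    def queue_mcast(self, payload):
        # Queueing only; nothing is sent on the (imaginary) network yet.
        self.pending.append(payload)

    def originate_pending(self):
        """Originate everything queued.

        Returns a list of (payload, configuration at delivery time).
        """
        delivered = []
        while self.pending:
            delivered.append((self.pending.popleft(), self.members))
        return delivered


b = ToyNode("B")
b.on_config_change(["A", "B"])            # partition: queues "size is 2"
b.on_config_change(["A", "B", "C", "D"])  # merge arrives before origination
log = b.originate_pending()
for payload, config in log:
    print(f"From {payload['from']}, cluster size is {payload['size']}, "
          f"delivered to {len(config)} members")
```

In this toy run both of B's messages are delivered in the four-member
configuration, even though the first payload still says the size is 2.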
Generally the correct way for an application to behave according to
EVS is to originate all state change messages via the protocol, and
act on them when received. Some devs tend to change state when they
call cpg_mcast rather than when a message is delivered.
This would result in your example behavior.
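A minimal sketch of that pattern, again as a toy Python model rather
than real cpg code (ToyBus, App, request_increment and on_deliver are
hypothetical names; on_deliver plays the role that a delivery callback
would play in a real application). The point it illustrates is that the
sender does not touch its own state when it queues a message; every
member, including the sender, changes state only on delivery.

```python
class ToyBus:
    """Toy ordered bus: queued messages are delivered later, in order."""

    def __init__(self):
        self.queue = []
        self.apps = []

    def mcast(self, msg):
        # Queue only; delivery happens in deliver_all().
        self.queue.append(msg)

    def deliver_all(self):
        while self.queue:
            msg = self.queue.pop(0)
            for app in self.apps:
                app.on_deliver(msg)


class App:
    def __init__(self, bus):
        self.bus = bus
        self.counter = 0
        bus.apps.append(self)

    def request_increment(self):
        # Do NOT modify self.counter here; just queue the request.
        self.bus.mcast("increment")

    def on_deliver(self, msg):
        # State changes happen only when the message is delivered.
        if msg == "increment":
            self.counter += 1


bus = ToyBus()
a, b = App(bus), App(bus)
a.request_increment()
before = (a.counter, b.counter)   # nothing has been delivered yet
bus.deliver_all()
after = (a.counter, b.counter)    # both members applied the same change
print(before, after)
```

Because the sender also waits for delivery, all members apply the change
at the same point in the message order.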
Just to clarify, your application only changes state on delivery of a
message to the cpg application (not on queueing via cpg_mcast)?
Regards
-steve
Sathya
_______________________________________________
discuss mailing list
[email protected]
http://lists.corosync.org/mailman/listinfo/discuss