[ 
https://issues.apache.org/jira/browse/GEODE-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011230#comment-16011230
 ] 

ASF subversion and git services commented on GEODE-2865:
--------------------------------------------------------

Commit 614031725360a66fdb726dd13136002f35ac6b24 in geode's branch 
refs/heads/develop from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=6140317 ]

GEODE-2865 data loss in initial-image replication with multicast

The state-flush algorithm relies on MembershipManager.waitForMessageState()
to ensure that all operations have been received and applied to the cache
prior to state replication starting.  For multicast there was a flaw in
the algorithm caused by two things: 1) cache operations were being sent
out-of-band, allowing them to be processed out of order with the state-
flush message, and 2) JGroupsMessenger was only waiting for the messages
to be dispatched by NAKACK2, which isn't necessarily the same as being
dispatched to the DistributionManager Executor that processes the message.

Cache operation messages are now sent in-band.

JGroupsMessenger now tracks NAKACK2 (multicast) sequence numbers of
messages dispatched to the DistributionManager and this is used in
waitForMessageState() to make sure the messages have been queued.
If multicast is enabled we now also flush the serial executor in
waitForMessageState() to make sure that all messages queued in it have
been applied to the region.
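
The sequence-number tracking described above can be sketched roughly as
follows. This is an illustration only, not the code from the commit: the
class DispatchedSeqnoTracker, its method names, and the use of a String to
identify the sender are all hypothetical. It assumes the messenger calls
recordDispatched() each time it hands a multicast message to the
DistributionManager, and that the state-flush path calls waitForSeqno()
with the sender's reported outgoing message counter.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not the actual JGroupsMessenger implementation. */
final class DispatchedSeqnoTracker {
  // Highest multicast sequence number handed to the dispatcher, per sender.
  private final Map<String, Long> highestDispatched = new HashMap<>();

  /** Called after a multicast message from {@code sender} has been queued for processing. */
  synchronized void recordDispatched(String sender, long seqno) {
    highestDispatched.merge(sender, seqno, Long::max);
    notifyAll(); // wake any thread blocked in waitForSeqno()
  }

  /**
   * Waits until a message with sequence number {@code target} or higher from
   * {@code sender} has been dispatched. Returns false on timeout or interrupt.
   */
  synchronized boolean waitForSeqno(String sender, long target, long timeoutMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (highestDispatched.getOrDefault(sender, -1L) < target) {
      long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
      if (remainingMillis <= 0) {
        return false; // timed out before the target was reached
      }
      try {
        wait(remainingMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

Reaching the target sequence number only shows that the messages have been
queued. Whether they have been applied to the region is a separate question,
which is why the commit also flushes the serial executor.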


> data loss in initial-image replication with multicast
> -----------------------------------------------------
>
>                 Key: GEODE-2865
>                 URL: https://issues.apache.org/jira/browse/GEODE-2865
>             Project: Geode
>          Issue Type: Bug
>          Components: messaging
>            Reporter: Bruce Schuchardt
>
> During initial image replication ("get initial image") a state-flush 
> operation is performed to ensure that all in-flight operations are applied to 
> the region being replicated prior to replication starting.  If multicast is 
> enabled for a region it is currently possible for the state-flush to miss one 
> or more in-flight operations, so that the new replicate is missing changes 
> that are reflected in the region being replicated.
> For example, process A sends a multicast put() replication message to process 
> B.  Simultaneously process C is replicating the affected region and performs 
> a state-flush.  Process A sends a state-stabilization message to process B 
> noting its multicast channel state (NAKACK2 outgoing message counter).  
> Process B receives this and waits for the multicast channel state to show 
> that it has received all of the messages.  Process B then sends a 
> state-stabilized message to process C (the new replicate).
> The state-stabilization algorithm in this case is faulty because it is 
> performed in the waiting-thread pool.  The algorithm assumes that it is 
> executing in the serial-executor thread pool so that any messages that 
> happened before it have been applied to the region.  As a result, messages 
> can be received and scheduled for the serial-executor but not yet be 
> applied to the region when replication begins.
> The membership manager should be modified to ensure that the serial-executor 
> queue has been flushed before giving the state-flush operation the go-ahead 
> to begin replication.
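
One common way to flush a serial executor, sketched here under the
assumption that it behaves like a single-threaded FIFO
java.util.concurrent.ExecutorService, is to submit a no-op marker task and
wait for it to complete. The class SerialExecutorFlush and its flush()
method are hypothetical names used for illustration, and Geode's own
executor may be flushed differently.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; assumes a single-threaded, FIFO executor. */
final class SerialExecutorFlush {
  private SerialExecutorFlush() {}

  /**
   * Submits an empty marker task and waits for it. Because a single-threaded
   * executor runs tasks in submission order, the marker completes only after
   * every task queued before it has finished running.
   * Returns false on timeout, task failure, or interrupt.
   */
  static boolean flush(ExecutorService serialExecutor, long timeoutMillis) {
    try {
      serialExecutor.submit(() -> { }).get(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException | ExecutionException e) {
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return false;
    }
  }
}
```

The guarantee covers only tasks queued before the marker, so the flush has
to happen after the wait for the multicast channel state, not before it.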



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
