[ https://issues.apache.org/jira/browse/GEODE-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011230#comment-16011230 ]
ASF subversion and git services commented on GEODE-2865:
--------------------------------------------------------

Commit 614031725360a66fdb726dd13136002f35ac6b24 in geode's branch refs/heads/develop from [~bschuchardt]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=6140317 ]

GEODE-2865 data loss in initial-image replication with multicast

The state-flush algorithm relies on MembershipManager.waitForMessageState()
to ensure that all operations have been received and applied to the cache
prior to state replication starting. For multicast there was a flaw in the
algorithm caused by two things: 1) cache operations were being sent
out-of-band, allowing them to be processed out of order with the state-flush
message, and 2) JGroupsMessenger was only waiting for the messages to be
dispatched by NAKACK2, which isn't necessarily the same as being dispatched
to the DistributionManager Executor that processes the message.

Cache operation messages are now sent in-band. JGroupsMessenger now tracks
NAKACK2 (multicast) sequence numbers of messages dispatched to the
DistributionManager and this is used in waitForMessageState() to make sure
the messages have been queued. If multicast is enabled we now flush the
serial executor in waitForMessageState() to make sure that all messages
queued in it have been applied to the region.

> data loss in initial-image replication with multicast
> -----------------------------------------------------
>
>                 Key: GEODE-2865
>                 URL: https://issues.apache.org/jira/browse/GEODE-2865
>             Project: Geode
>          Issue Type: Bug
>          Components: messaging
>            Reporter: Bruce Schuchardt
>
> During initial image replication ("get initial image") a state-flush
> operation is performed to ensure that all in-flight operations are applied to
> the region being replicated prior to replication starting. If multicast is
> enabled for a region it is currently possible for the state-flush to miss one
> or more in-flight operations, so that the new replicate is missing changes
> that are reflected in the region being replicated.
> For example, process A sends a multicast put() replication message to process
> B. Simultaneously process C is replicating the affected region and performs
> a state-flush. Process A sends a state-stabilization message to process B
> noting its multicast channel state (NAKACK2 outgoing message counter).
> Process B receives this and waits for the multicast channel state to show
> that it has received all of the messages. Process B then sends a
> state-stabilized message to process C (the new replicate).
> The state-stabilization algorithm in this case is faulty because it is
> performed in the waiting-thread pool. The algorithm assumes that it is
> executing in the serial-executor thread pool so that any messages that
> happened before it have been applied to the region. This can allow messages
> to have been received and scheduled for the serial-executor but not be
> applied to the region before replication begins.
> The membership manager should be modified to ensure that the serial-executor
> queue has been flushed before giving the state-flush operation the go-ahead
> to begin replication.
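
The serial-executor flush described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: it assumes the serial executor behaves like a single-threaded FIFO java.util.concurrent.ExecutorService, and the names SerialExecutorFlushSketch, flush, and applyThenFlush are hypothetical. A no-op marker task is queued behind the pending operations; once the marker has run, every task queued before it must have run as well.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; none of these names come from the Geode code base.
class SerialExecutorFlushSketch {

  /**
   * Waits until every task queued before this call has finished.
   * Only meaningful for a single-threaded executor that runs tasks in FIFO order.
   * Returns false if the wait timed out or was interrupted.
   */
  static boolean flush(ExecutorService serialExecutor, long timeoutMillis) {
    try {
      // The no-op marker cannot run until all earlier tasks have completed.
      serialExecutor.submit(() -> { }).get(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException e) {
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } catch (ExecutionException e) {
      throw new IllegalStateException("marker task failed", e);
    }
  }

  /** Queues `operations` slow "cache operations", flushes, and returns how many were applied. */
  static int applyThenFlush(int operations) {
    ExecutorService serialExecutor = Executors.newSingleThreadExecutor();
    // Stand-in for the region being replicated.
    List<String> region = Collections.synchronizedList(new ArrayList<>());
    try {
      for (int i = 0; i < operations; i++) {
        final int n = i;
        // Received and scheduled, but not yet applied to the region.
        serialExecutor.execute(() -> {
          try {
            Thread.sleep(20);
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
          region.add("put-" + n);
        });
      }
      if (!flush(serialExecutor, 5_000)) {
        throw new IllegalStateException("serial executor was not flushed in time");
      }
      // Only now would it be safe to give state replication the go-ahead.
      return region.size();
    } finally {
      serialExecutor.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("applied=" + applyThenFlush(3));
  }
}
```

Without the flush call, applyThenFlush could return before the queued operations were applied. That is the same window in which replication could begin while messages are scheduled but not yet applied to the region.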