Kenneth Howe created GEODE-5513:
-----------------------------------

             Summary: Clients may miss PR region events due to race during 
registerInterest
                 Key: GEODE-5513
                 URL: https://issues.apache.org/jira/browse/GEODE-5513
             Project: Geode
          Issue Type: Bug
          Components: client queues
            Reporter: Kenneth Howe


Here is the scenario:
 Consider two servers and two clients:
 - Server1 hosts the primary bucket
 - Server2 hosts the secondary bucket and also the primary queue for Client2
 - Client1 performs a remove-all operation
 - Client2 registers interest

 - Client1 starts a remove-all operation.
 - At the same time, Client2 is registering interest.
 - Server1 receives the remove-all operation, processes it, and sends the 
adjunct message to Server2. (At this point Server1 has not yet received the 
interest info for Client2.)
 - While the remove-all message to Server2 is still in flight, Server2 sends 
the interest profile info for Client2 to Server1 and then, because it hosts 
the primary queue, starts building the initial image snapshot for the 
interest. When building the initial image for a PR, preference is given to 
collecting data from the local node. The removal message is still in flight 
at this time and has not been applied on Server2, so the initial image for 
the interest registration is calculated from stale local data and sent to 
the client, missing the remove-all op.
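
The ordering above can be replayed with a small single-threaded model. This 
is only a sketch: the class, the maps and the "destroy" event strings are 
hypothetical stand-ins, not Geode APIs, and the model assumes (as the report 
describes) that the in-flight removal never becomes an event in Client2's 
queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical, single-threaded model of the race; none of these names are Geode APIs.
class RegisterInterestRace {

  // Replays the scenario in a fixed order and returns the keys Client2 ends up holding.
  static Set<String> replay() {
    // Server1 holds the primary bucket copy, Server2 the secondary copy.
    Map<String, String> primary = new TreeMap<>(Map.of("k1", "v1", "k2", "v2", "k3", "v3"));
    Map<String, String> secondary = new TreeMap<>(primary);

    // Removals Server1 has applied but Server2 has not yet received.
    Queue<String> inFlight = new ArrayDeque<>();
    // Events delivered to Client2 through its queue on Server2.
    List<String> client2Events = new ArrayList<>();
    boolean server1KnowsInterest = false;

    // 1. Client1's remove-all is applied on the primary. Server1 does not know about
    //    Client2's interest yet, so no destroy event is routed to Client2's queue.
    for (String key : List.of("k1", "k2")) {
      primary.remove(key);
      inFlight.add(key);
      if (server1KnowsInterest) {
        client2Events.add("destroy " + key);
      }
    }

    // 2. Server2 publishes Client2's interest, then builds the initial image
    //    from its local (secondary) copy while the removals are still in flight.
    server1KnowsInterest = true;
    Set<String> client2View = new TreeSet<>(secondary.keySet());

    // 3. The replicated removals finally reach Server2.
    while (!inFlight.isEmpty()) {
      secondary.remove(inFlight.poll());
    }

    // 4. Client2 applies whatever events it received (none in this ordering).
    for (String event : client2Events) {
      client2View.remove(event.substring("destroy ".length()));
    }
    return client2View;
  }

  public static void main(String[] args) {
    // Prints: Client2 view after the race: [k1, k2, k3]
    System.out.println("Client2 view after the race: " + replay());
  }
}
```

In this model both servers end up holding only k3, while Client2 still holds 
k1 and k2 and has no pending event that would remove them.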

This can also happen with non-bulk ops, but it is worse with bulk ops because 
they take longer to replicate, which widens the window for the race.

The solution is to build the initial register interest response by getting the 
data from the primary bucket. This adds a little overhead in building the 
interest response, but since most, if not all, register interest responses 
already involve a remote node, the overhead should be negligible.
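
A minimal sketch of the proposed change, assuming the same state as in the 
scenario (the primary has applied the remove-all, the secondary has not). 
BucketCopy, initialImage and the readFromPrimary flag are hypothetical names 
used for illustration and do not correspond to Geode classes or settings.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model of the proposed fix; none of these names are Geode APIs.
class InitialImageSource {

  // One copy of a bucket as held by one server.
  static final class BucketCopy {
    final boolean isPrimary;
    final Map<String, String> entries;

    BucketCopy(boolean isPrimary, Map<String, String> entries) {
      this.isPrimary = isPrimary;
      this.entries = new TreeMap<>(entries);
    }
  }

  // Returns the keys that go into the initial image for one bucket.
  // With readFromPrimary set, a non-primary local copy is bypassed in favour of
  // the primary; if the local copy is itself the primary, no remote read is needed.
  static Set<String> initialImage(BucketCopy local, BucketCopy primary, boolean readFromPrimary) {
    BucketCopy source = (readFromPrimary && !local.isPrimary) ? primary : local;
    return new TreeSet<>(source.entries.keySet());
  }

  // State during the race: the primary has removed k1 and k2, the secondary has not.
  static Set<String> demo(boolean readFromPrimary) {
    BucketCopy primary = new BucketCopy(true, Map.of("k3", "v3"));
    BucketCopy secondary = new BucketCopy(false, Map.of("k1", "v1", "k2", "v2", "k3", "v3"));
    return initialImage(secondary, primary, readFromPrimary);
  }

  public static void main(String[] args) {
    System.out.println("local copy:   " + demo(false)); // [k1, k2, k3]
    System.out.println("primary copy: " + demo(true));  // [k3]
  }
}
```

Reading from the primary costs one extra hop only when the local copy is a 
secondary, which is the case the report is concerned with.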

Clients registering interest in a region



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
