nabarun created GEODE-5631:
------------------------------

             Summary: failedBatchRemovalMessageKeys never cleared
                 Key: GEODE-5631
                 URL: https://issues.apache.org/jira/browse/GEODE-5631
             Project: Geode
          Issue Type: Bug
          Components: wan
            Reporter: nabarun


+*Experiment setup:*+
* Region A is created with an async event listener attached to it.
* For every event processed by the async listener, a new entry is put into 
another region, Region B.
* A client performs 1 million operations on 1500 keys in Region A 
(to trigger conflation).
* The cluster has 3 servers, 1 locator and 1 client.

+*Issue:*+
After upgrading to 1.6.0, we confirmed an increase in the memory footprint 
after all operations had completed.

+*Cause:*+
* We have a data structure, failedBatchRemovalMessageKeys, that stores all the 
queue removal messages that arrive while the secondary is in the process of GII.
* Two removal messages were sent to the secondary for a single event, one from 
the processor which was processing the event and another from the conflation 
thread which conflated the event and hence wants the secondary to remove it.
* Whichever of the two messages arrives first removes the event from the 
queue.
* When the second message arrives and we try to remove the event from the 
queue, the removal hits an EntryNotFoundException. The handler then assumes 
that the secondary is in GII, stores the message in 
failedBatchRemovalMessageKeys, and expects it to be processed when GII completes.
* But GII had already completed long before, so failedBatchRemovalMessageKeys 
keeps accumulating messages that are never removed, which results in a large 
memory footprint.
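
The sequence above can be illustrated with a minimal, single-threaded sketch. It is not Geode's actual code: the class SecondaryQueueLeakSketch, its methods, and the use of long keys are hypothetical, and a failed map removal stands in for the EntryNotFoundException. Only the name failedBatchRemovalMessageKeys comes from the description above.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the secondary's queue; not Geode's real class.
class SecondaryQueueLeakSketch {
  private final Map<Long, String> queue = new ConcurrentHashMap<>();
  // Keys parked for replay once GII completes. In this sketch nothing drains
  // the set, which mirrors the state after GII has already finished.
  private final Set<Long> failedBatchRemovalMessageKeys = ConcurrentHashMap.newKeySet();

  void enqueue(long key, String event) {
    queue.put(key, event);
  }

  // Handles one removal message sent by the primary.
  void onRemovalMessage(long key) {
    if (queue.remove(key) == null) {
      // Stands in for EntryNotFoundException: the handler assumes GII is
      // still in progress and parks the key.
      failedBatchRemovalMessageKeys.add(key);
    }
  }

  int parkedKeyCount() {
    return failedBatchRemovalMessageKeys.size();
  }

  public static void main(String[] args) {
    SecondaryQueueLeakSketch secondary = new SecondaryQueueLeakSketch();
    for (long key = 0; key < 1000; key++) {
      secondary.enqueue(key, "event-" + key);
      secondary.onRemovalMessage(key); // from the processor
      secondary.onRemovalMessage(key); // duplicate from the conflation thread
    }
    // Every duplicate message leaves one parked key behind.
    System.out.println("parked keys: " + secondary.parkedKeyCount());
  }
}
```

In this sketch each event leaves one parked key behind, so the set grows with the number of events processed.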

+*Fix:*+
Once failedBatchRemovalMessageKeys has been processed, it is no longer used, 
because GII happens only once in a server’s lifecycle.
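
One way to picture this kind of guard is the sketch below. It is an illustration under assumptions, not the actual patch: the class SecondaryQueueFixSketch, the flag parkedKeysProcessed, and the method onGiiComplete are hypothetical names, and coarse synchronized methods replace whatever concurrency control the real code uses.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a "park only until GII has been handled" guard.
class SecondaryQueueFixSketch {
  private final Map<Long, String> queue = new HashMap<>();
  private final Set<Long> failedBatchRemovalMessageKeys = new HashSet<>();
  // Hypothetical flag: true once the parked keys have been replayed.
  private boolean parkedKeysProcessed = false;

  synchronized void enqueue(long key, String event) {
    queue.put(key, event);
  }

  // Handles one removal message sent by the primary.
  synchronized void onRemovalMessage(long key) {
    boolean removed = queue.remove(key) != null;
    if (!removed && !parkedKeysProcessed) {
      // Park the key only while GII may still deliver the matching event.
      failedBatchRemovalMessageKeys.add(key);
    }
    // After GII, a miss means the event is already gone, so drop the message.
  }

  // Called once when GII completes: replay the parked removals, then stop parking.
  synchronized void onGiiComplete() {
    for (Long key : failedBatchRemovalMessageKeys) {
      queue.remove(key);
    }
    failedBatchRemovalMessageKeys.clear();
    parkedKeysProcessed = true;
  }

  synchronized int parkedKeyCount() {
    return failedBatchRemovalMessageKeys.size();
  }

  synchronized int queueSize() {
    return queue.size();
  }
}
```

With such a guard, a duplicate removal message that arrives after GII is simply dropped, so the set stays empty for the rest of the server's lifetime.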




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
