Donal Evans created GEODE-7643:
----------------------------------

             Summary: Gateway unprocessedTokensMap appears to grow without 
bounds with replicated regions and peer accessors
                 Key: GEODE-7643
                 URL: https://issues.apache.org/jira/browse/GEODE-7643
             Project: Geode
          Issue Type: Bug
          Components: wan
            Reporter: Donal Evans


When peer accessors do puts to a replicated region with a serial gateway sender 
via multiple threads and on the same key, 
{{ConcurrentCacheModificationException}} in {{LocalRegion.virtualPut}} causes 
{{notifyGatewaySender}} to be called, which puts the event into the queue, 
leading to the event being put in the queue twice but only removed once and 
causing the unprocessedTokensMap to accumulate events.

Here are the two stacks:
{noformat}

[warn 2019/12/02 12:47:59.102 PST <P2P message reader for 
10.255.202.119(gateway-ln-2:85182)<v97>:41004 unshared ordered uid=11 dom #2 
port=59034> tid=0x61] XXX LocalRegion.virtualPut caught 
ConcurrentCacheModificationException about to notifyGatewaySender 
eventId=EventID[10.255.202.119(accessor-ln-1)<v98>:41005;threadID=5;sequenceID=273];
 ifNew=false; ifOld=true; overwriteDestroyed=false; eventIdentity=329453507; 
eventValue=Trade[id=-1501795011; cusip=PVTL; shares=29; price=163; 
payloadLength=0 bytes]
java.lang.Exception
        at 
org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5591)
        at 
org.apache.geode.internal.cache.DistributedRegion.virtualPut(DistributedRegion.java:385)
        at 
org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:162)
        at 
org.apache.geode.internal.cache.LocalRegion.basicUpdate(LocalRegion.java:5561)
        at 
org.apache.geode.internal.cache.AbstractUpdateOperation.doPutOrCreate(AbstractUpdateOperation.java:182)
        at 
org.apache.geode.internal.cache.AbstractUpdateOperation$AbstractUpdateMessage.basicOperateOnRegion(AbstractUpdateOperation.java:287)
        at 
org.apache.geode.internal.cache.AbstractUpdateOperation$AbstractUpdateMessage.operateOnRegion(AbstractUpdateOperation.java:258)
        at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1206)
        at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1108)
        at 
org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:372)
        at 
org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:427)
{noformat}

{noformat}
[warn 2019/12/02 12:47:59.108 PST <P2P message reader for 
10.255.202.119(gateway-ln-2:85182)<v97>:41004 unshared ordered uid=11 dom #2 
port=59034> tid=0x61] XXX LocalRegion.virtualPut caught 
ConcurrentCacheModificationException about to notifyGatewaySender 
eventId=EventID[10.255.202.119(accessor-ln-1)<v98>:41005;threadID=5;sequenceID=273];
 ifNew=false; ifOld=false; overwriteDestroyed=true; eventIdentity=329453507; 
eventValue=Trade[id=-1501795011; cusip=PVTL; shares=29; price=163; 
payloadLength=0 bytes]
java.lang.Exception
        at 
org.apache.geode.internal.cache.LocalRegion.virtualPut(LocalRegion.java:5591)
        at 
org.apache.geode.internal.cache.DistributedRegion.virtualPut(DistributedRegion.java:385)
        at 
org.apache.geode.internal.cache.LocalRegionDataView.putEntry(LocalRegionDataView.java:162)
        at 
org.apache.geode.internal.cache.LocalRegion.basicUpdate(LocalRegion.java:5561)
        at 
org.apache.geode.internal.cache.AbstractUpdateOperation.doPutOrCreate(AbstractUpdateOperation.java:194)
        at 
org.apache.geode.internal.cache.AbstractUpdateOperation$AbstractUpdateMessage.basicOperateOnRegion(AbstractUpdateOperation.java:287)
        at 
org.apache.geode.internal.cache.AbstractUpdateOperation$AbstractUpdateMessage.operateOnRegion(AbstractUpdateOperation.java:258)
        at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1206)
        at 
org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1108)
{noformat}
Here are the corresponding puts into the queue:
{noformat}
[warn 2019/12/02 12:47:59.104 PST <P2P message reader for 
10.255.202.119(gateway-ln-2:85182)<v97>:41004 unshared ordered uid=11 dom #2 
port=59034> tid=0x61] XXX SerialGatewaySenderQueue.putAndGetKey key=3625; 
eventId=EventID[10.255.202.119(accessor-ln-1)<v98>:41005;threadID=0x30002|5;sequenceID=273];
 eventValue=Trade[id=-1501795011; cusip=PVTL; shares=29; 
price=163.08897399902344; payloadLength=0 bytes]
{noformat}
{noformat}
[warn 2019/12/02 12:47:59.110 PST <P2P message reader for 
10.255.202.119(gateway-ln-2:85182)<v97>:41004 unshared ordered uid=11 dom #2 
port=59034> tid=0x61] XXX SerialGatewaySenderQueue.putAndGetKey key=3635; 
eventId=EventID[10.255.202.119(accessor-ln-1)<v98>:41005;threadID=0x30002|5;sequenceID=273];
 eventValue=Trade[id=-1501795011; cusip=PVTL; shares=29; 
price=163.08897399902344; payloadLength=0 bytes]
{noformat}
On the secondary, when the event is received via normal replication, its added 
to the unprocessedEvents map:
{noformat}
[warn 2019/12/02 12:47:59.100 PST <P2P message reader for 
10.255.202.119(accessor-ln-1:85194)<v98>:41005 unshared ordered uid=13 dom #1 
port=59022> tid=0x58] 
SerialGatewaySenderEventProcessor.basicHandleSecondaryEvent put 
unprocessedEvents 
eventId=EventID[10.255.202.119(accessor-ln-1)<v98>:41005;threadID=0x30002|5;sequenceID=273]
{noformat}
The first replication from the primary queue is received which removes the 
event from the unprocessedEvents map:
{noformat}
[warn 2019/12/02 12:47:59.104 PST <P2P message reader for 
10.255.202.119(gateway-ln-1:85170)<v96>:41003 unshared ordered uid=18 dom #3 
port=59052> tid=0x68] XXX SerialSecondaryGatewayListener.afterCreate 
senderEvent=EventID[10.255.202.119(accessor-ln-1)<v98>:41005;threadID=0x30002|5;sequenceID=273]
[warn 2019/12/02 12:47:59.104 PST <Queued Gateway Listener Thread1> tid=0x5e] 
SerialGatewaySenderEventProcessor.basicHandlePrimaryEvent removed 
unprocessedEvents 
eventId=EventID[10.255.202.119(accessor-ln-1)<v98>:41005;threadID=0x30002|5;sequenceID=273];
 
value=org.apache.geode.internal.cache.wan.AbstractGatewaySender$EventWrapper@3f6df03b
{noformat}
Then the second replication from the primary queue is received which 
incorrectly adds the event to the unprocessedTokens map where is stays forever:
{noformat}
[warn 2019/12/02 12:47:59.110 PST <P2P message reader for 
10.255.202.119(gateway-ln-1:85170)<v96>:41003 unshared ordered uid=18 dom #3 
port=59052> tid=0x68] XXX SerialSecondaryGatewayListener.afterCreate 
senderEvent=EventID[10.255.202.119(accessor-ln-1)<v98>:41005;threadID=0x30002|5;sequenceID=273]
[warn 2019/12/02 12:47:59.110 PST <Queued Gateway Listener Thread1> tid=0x5e] 
SerialGatewaySenderEventProcessor.basicHandlePrimaryEvent put unprocessedTokens 
eventId=EventID[10.255.202.119(accessor-ln-1)<v98>:41005;threadID=0x30002|5;sequenceID=273];
 value=1575319799110; size=914
 {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to