[ https://issues.apache.org/jira/browse/GEODE-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868344#comment-15868344 ]
Eric Shu commented on GEODE-2485:
---------------------------------

Stack trace for the above scenario:
{noformat}
	at org.apache.geode.internal.cache.TXManagerImpl.suspend(TXManagerImpl.java:1225)
	at org.apache.geode.internal.cache.DistributedRegion.fetchRemoteVersionTag(DistributedRegion.java:4004)
	at org.apache.geode.internal.cache.TXEntryState.fetchRemoteVersionTag(TXEntryState.java:1037)
	at org.apache.geode.internal.cache.TXEntryState.basicPut(TXEntryState.java:1019)
	at org.apache.geode.internal.cache.TXState.txPutEntry(TXState.java:1288)
	at org.apache.geode.internal.cache.TXState.putEntry(TXState.java:1615)
	at org.apache.geode.internal.cache.TXStateProxyImpl.putEntry(TXStateProxyImpl.java:810)
	at org.apache.geode.internal.cache.LocalRegion.basicPut(LocalRegion.java:5194)
	at org.apache.geode.internal.cache.LocalRegion.validatedPut(LocalRegion.java:1605)
	at org.apache.geode.internal.cache.LocalRegion.put(LocalRegion.java:1592)
	at org.apache.geode.internal.cache.AbstractRegion.put(AbstractRegion.java:277)
{noformat}

> CacheTransactionManager suspend/resume can leak memory for 30 minutes
> ---------------------------------------------------------------------
>
>                 Key: GEODE-2485
>                 URL: https://issues.apache.org/jira/browse/GEODE-2485
>             Project: Geode
>          Issue Type: Bug
>          Components: transactions
>            Reporter: Darrel Schneider
>
> Each time you suspend/resume a transaction it leaves about 80 bytes of heap
> allocated for 30 minutes. If you are doing a high rate of suspend/resume
> calls then this could cause you to run out of memory in that 30 minute window.
> As a workaround you can set -Dgemfire.suspendedTxTimeout to a value as small
> as 1 (which would cause the memory to be freed up after 1 minute instead of
> 30 minutes).
> One fix for this is to periodically call cache.getCCPTimer().timerPurge()
> after a certain number of resume calls have been done (for example 1000).
> Currently resume is calling cancel on the TimerTask but that leaves the task
> in the SystemTimer queue until it expires. Calling timerPurge in addition to
> cancel will fix this bug. Calling timerPurge for every cancel may cause the
> resume method to take too long. Keep in mind that getCCPTimer is used by
> other things, so the size of the SystemTimer queue that is being purged will
> not only be the number of suspended txs.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
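
The purge-every-N-cancels idea from the quoted description can be sketched with plain java.util.Timer. This is an illustration under assumptions, not the Geode fix: it does not use Geode's SystemTimer or cache.getCCPTimer(), and the class name PurgingCanceller, the noopTask helper, and the purgeInterval parameter are hypothetical. It relies only on TimerTask.cancel(), which leaves the cancelled task in the timer queue, and Timer.purge(), which removes cancelled tasks from the queue and returns how many it removed.

```java
import java.util.Timer;
import java.util.TimerTask;

/** Hypothetical helper: cancels tasks and purges the timer queue every N cancels. */
final class PurgingCanceller {
    private final Timer timer;
    private final int purgeInterval;
    private int cancelsSincePurge; // guarded by "this"

    PurgingCanceller(Timer timer, int purgeInterval) {
        if (purgeInterval < 1) {
            throw new IllegalArgumentException("purgeInterval must be >= 1");
        }
        this.timer = timer;
        this.purgeInterval = purgeInterval;
    }

    /**
     * Cancels the task and, on every purgeInterval-th call, purges the timer.
     * Returns the number of cancelled tasks removed from the queue
     * (0 if no purge ran on this call).
     */
    synchronized int cancel(TimerTask task) {
        // Marks the task cancelled; it stays queued until it expires or is purged.
        task.cancel();
        if (++cancelsSincePurge < purgeInterval) {
            return 0;
        }
        cancelsSincePurge = 0;
        return timer.purge();
    }

    /** A do-nothing task standing in for a suspended-transaction expiry task. */
    static TimerTask noopTask() {
        return new TimerTask() {
            @Override
            public void run() {}
        };
    }

    public static void main(String[] args) {
        Timer timer = new Timer(true); // daemon thread so the JVM can exit
        PurgingCanceller canceller = new PurgingCanceller(timer, 1000);
        long thirtyMinutes = 30L * 60L * 1000L;
        int purged = 0;
        for (int i = 0; i < 1000; i++) {
            TimerTask task = noopTask();
            timer.schedule(task, thirtyMinutes);
            purged += canceller.cancel(task);
        }
        System.out.println("tasks purged: " + purged);
        timer.cancel();
    }
}
```

Batching the purge keeps the common cancel path cheap, at the cost of up to purgeInterval cancelled tasks lingering in the queue between purges. If the timer is shared with other users, as the description notes for getCCPTimer, each purge scans their tasks too, which is a reason not to make the interval very small.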