[ 
https://issues.apache.org/jira/browse/GEODE-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963579#comment-15963579
 ] 

ASF subversion and git services commented on GEODE-2485:
--------------------------------------------------------

Commit 344f93dfd07e6ace79cedfb474bf524b97232281 in geode's branch 
refs/heads/develop from [~dschneider]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=344f93d ]

GEODE-2485: fix leak in tx suspend/resume

The CCPTimer is now purged once for every 1000 cancels,
so no more than 1000 cancelled tasks
can accumulate in memory.
internalSuspend is now used in two places that previously used suspend.
Since internalSuspend does not schedule a timer task,
these places no longer leak memory
and these code paths will perform better.

Renamed resume(TxStateProxy) to internalResume for clarity.

internalResume no longer checks for a TimerTask to cancel
since internalSuspend does not add one.
Instead the only code that checks for a TimerTask is "resume".
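
The purge-every-1000-cancels idea described above can be sketched roughly as follows. This is not the Geode code: it uses a plain java.util.Timer as a stand-in for the CCPTimer, and the class name CancelCountingPurger and its cancel method are hypothetical names chosen for illustration. Only the threshold of 1000 comes from the commit message.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not Geode code): purges a shared timer once per PURGE_INTERVAL cancels. */
class CancelCountingPurger {
  // Threshold taken from the commit message; everything else here is an assumption.
  static final int PURGE_INTERVAL = 1000;

  private final Timer timer;
  private final AtomicInteger cancels = new AtomicInteger();

  CancelCountingPurger(Timer timer) {
    this.timer = timer;
  }

  /**
   * Cancels the task and, on every PURGE_INTERVAL-th cancel, purges the timer queue.
   * Returns the number of queue entries removed by this call (0 when no purge ran).
   */
  int cancel(TimerTask task) {
    task.cancel();
    if (cancels.incrementAndGet() % PURGE_INTERVAL == 0) {
      // Timer.purge() removes all cancelled tasks from the queue and reports how many.
      return timer.purge();
    }
    return 0;
  }
}
```

Batching the purge this way keeps the cost of a single cancel low while bounding the number of cancelled tasks that this caller can leave in the queue to PURGE_INTERVAL.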


> CacheTransactionManager suspend/resume can leak memory for 30 minutes
> ---------------------------------------------------------------------
>
>                 Key: GEODE-2485
>                 URL: https://issues.apache.org/jira/browse/GEODE-2485
>             Project: Geode
>          Issue Type: Bug
>          Components: transactions
>            Reporter: Darrel Schneider
>            Assignee: Darrel Schneider
>
> Each time you suspend/resume a transaction it leaves about 80 bytes of heap 
> allocated for 30 minutes. If you are doing a high rate of suspend/resume 
> calls, this could cause you to run out of memory within that 30-minute window.
> As a workaround you can set -Dgemfire.suspendedTxTimeout to a value as small 
> as 1 (which would cause the memory to be freed up after 1 minute instead of 
> 30 minutes).
> One fix for this is to periodically call cache.getCCPTimer().timerPurge() 
> after a certain number of resume calls have been done (for example 1000). 
> Currently resume calls cancel on the TimerTask, but that leaves the task 
> in the SystemTimer queue until it expires. Calling timerPurge in addition to 
> cancel will fix this bug. Calling timerPurge for every cancel may cause the 
> resume method to take too long. Also keep in mind that getCCPTimer is used by 
> other things, so the SystemTimer queue being purged will contain more than 
> just the tasks for suspended txs.
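
The behaviour described in the issue (a cancelled task stays queued until it expires or the queue is purged) can be reproduced with a plain java.util.Timer. The sketch below is only an illustration under that assumption: it does not touch Geode's SystemTimer, and the class and method names are hypothetical.

```java
import java.util.Timer;
import java.util.TimerTask;

/** Hypothetical demo (not Geode code) of cancelled tasks lingering in a java.util.Timer queue. */
class CancelledTaskDemo {

  /**
   * Schedules `count` tasks with a long delay, cancels each one immediately,
   * and returns how many cancelled entries purge() had to remove from the queue.
   */
  static int cancelThenPurge(int count, long delayMillis) {
    Timer timer = new Timer(true); // daemon thread so the JVM can exit freely
    try {
      for (int i = 0; i < count; i++) {
        TimerTask task = new TimerTask() {
          @Override
          public void run() {
            // never runs in this demo: the task is cancelled before its delay elapses
          }
        };
        timer.schedule(task, delayMillis);
        task.cancel(); // marks the task cancelled but leaves it in the queue
      }
      return timer.purge(); // removes the cancelled entries and reports how many
    } finally {
      timer.cancel();
    }
  }
}
```

Calling CancelledTaskDemo.cancelThenPurge(5, 30L * 60 * 1000) returns 5, showing that all five cancelled tasks were still held by the timer until the purge.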



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
