[
https://issues.apache.org/jira/browse/OAK-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894387#comment-15894387
]
Stefan Eissing edited comment on OAK-4780 at 3/3/17 1:51 PM:
-------------------------------------------------------------
Updated my GitHub clone with the following:
* configure {{maxIterations}}, the number of iterations a gc run is allowed to
make (default 0 == no limit)
* configure {{maxDuration}}, the time a gc run may take (default 0 == no limit)
* configure {{batchDelay}}, the time the gc sleeps between modification batches
(default 0 == no delay)
Also added a test case for cleanup in iterations.
The idea for how to use these configuration parameters is:
* use {{maxIterations}} only in test setups where one wants to check the
immediate results
* use {{maxDuration}} when the gc runs in a daily (weekly?) maintenance
window, e.g. during the night, and should stop iterating when working hours
resume
* use {{batchDelay}} when the gc should run during busy times or all the time,
e.g. on 24/7 systems. A small delay should prevent the gc from monopolizing
the write locks (on db/table/index), depending on the database used.
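To make the interplay of the three limits concrete, here is a minimal sketch of a bounded deletion loop. It is not the code in the clone and not the Oak API: the class {{BoundedGc}}, its {{run}} method, the {{batchSize}} argument and the millisecond units are all assumptions made for illustration, and for simplicity one iteration deletes exactly one batch.

```java
import java.util.Deque;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; NOT the Oak VersionGarbageCollector API. */
class BoundedGc {
    // A value of 0 means "no limit" / "no delay", mirroring the defaults above.
    private final int maxIterations;
    private final long maxDurationMs; // unit (ms) is an assumption
    private final long batchDelayMs;  // unit (ms) is an assumption

    BoundedGc(int maxIterations, long maxDurationMs, long batchDelayMs) {
        this.maxIterations = maxIterations;
        this.maxDurationMs = maxDurationMs;
        this.batchDelayMs = batchDelayMs;
    }

    /**
     * Removes candidates in batches of at most batchSize until the queue is
     * empty or one of the configured limits is reached.
     * Returns the number of iterations actually performed.
     */
    int run(Deque<String> candidates, int batchSize) {
        long start = System.nanoTime();
        int iterations = 0;
        while (!candidates.isEmpty()) {
            if (maxIterations > 0 && iterations >= maxIterations) {
                break; // iteration budget used up
            }
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            if (maxDurationMs > 0 && elapsedMs >= maxDurationMs) {
                break; // maintenance window is over
            }
            // One iteration: "delete" a single batch (stand-in for the real work).
            for (int i = 0; i < batchSize && !candidates.isEmpty(); i++) {
                candidates.poll();
            }
            iterations++;
            // Pause between batches so that other writers can get at the locks.
            if (batchDelayMs > 0 && !candidates.isEmpty()) {
                try {
                    Thread.sleep(batchDelayMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag
                    break;
                }
            }
        }
        return iterations;
    }
}
```

In this sketch the limits are only checked between batches, so a run can exceed {{maxDuration}} by up to the time one batch (plus one delay) takes.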
> VersionGarbageCollector should be able to run incrementally
> -----------------------------------------------------------
>
> Key: OAK-4780
> URL: https://issues.apache.org/jira/browse/OAK-4780
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: core, documentmk
> Reporter: Julian Reschke
> Attachments: leafnodes.diff, leafnodes-v2.diff, leafnodes-v3.diff
>
>
> Right now, the documentmk's version garbage collection runs in several phases.
> It first collects the paths of candidate nodes, and only once this has been
> successfully finished, starts actually deleting nodes.
> This can be a problem when the regularly scheduled garbage collection is
> interrupted during the path collection phase, maybe due to other maintenance
> tasks. On the next run, the number of paths to be collected will be even
> bigger, thus making it even more likely to fail.
> We should think about a change in the logic that would allow the GC to run in
> chunks; maybe by partitioning the path space by top level directory.