[ https://issues.apache.org/jira/browse/OAK-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15690347#comment-15690347 ]
Michael Dürig commented on OAK-5058:
------------------------------------
The more severe problem with 2. is that it will defer running gc on big
repositories for too long. Technically I really think 1. is the right solution
as we want to run gc once a certain amount of garbage has piled up. The
absolute growth of the repository should be a good (and simple) estimator here.
3. is more about users' expectations re. gc running regularly. Without 3. gc
would probably be skipped a couple of times on a fresh instance. Not a problem
(rather a feature IMO) but still I think users would complain...
> Improve GC estimation strategy based on both absolute size and relative
> percentage
> ----------------------------------------------------------------------------------
>
> Key: OAK-5058
> URL: https://issues.apache.org/jira/browse/OAK-5058
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: segment-tar
> Affects Versions: 1.5.12
> Reporter: Andrei Dulceanu
> Assignee: Andrei Dulceanu
> Priority: Minor
> Fix For: 1.6, 1.5.15
>
> Attachments: OAK-5058-01.patch
>
>
> A better way of deciding whether GC should run or not might be by looking at
> the numbers computed in {{SizeDeltaGcEstimation}} from both an absolute size
> and relative percentage point of view. For example it would make sense to
> run compaction only if at least one criterion is met: "run if there is > 50%
> increase or more than 10GB".
> Since the absolute threshold is already implemented (see
> {{SegmentGCOptions.SIZE_DELTA_ESTIMATION_DEFAULT}}), it would be nice to add
> also something like {{SegmentGCOptions.SIZE_PERCENTAGE_ESTIMATION_DEFAULT}}
> and use it in making the decision.
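
The "at least one criterion" rule from the description above could be
sketched roughly as follows. This is only an illustration, not the code from
the attached patch: the class name, the method name and both constants are
hypothetical, the threshold values are simply the 50% and 10GB figures from
the example, and treating an unknown previous size as "relative criterion not
applicable" is an assumption made here.

```java
final class GcEstimationSketch {
    // Hypothetical thresholds taken from the example in the description;
    // these are NOT the actual SegmentGCOptions defaults.
    static final long SIZE_DELTA_THRESHOLD_BYTES = 10L * 1024 * 1024 * 1024; // 10GB
    static final double SIZE_PERCENTAGE_THRESHOLD = 50.0;

    /**
     * Returns true if gc should run: the repository grew by more than the
     * absolute threshold OR by more than the relative threshold since the
     * last compaction.
     */
    static boolean gcNeeded(long previousSize, long currentSize) {
        long delta = currentSize - previousSize;
        if (delta <= 0) {
            return false; // no growth, nothing to collect
        }
        if (delta > SIZE_DELTA_THRESHOLD_BYTES) {
            return true; // absolute criterion met
        }
        if (previousSize <= 0) {
            // Assumption: without a previous size the relative growth is
            // undefined, so only the absolute criterion applies.
            return false;
        }
        double growthPercent = 100.0 * delta / previousSize;
        return growthPercent > SIZE_PERCENTAGE_THRESHOLD; // relative criterion
    }
}
```

With these values a 100GB repository that grows to 111GB triggers gc through
the absolute criterion (an 11GB delta, although only 11% growth), while a
small repository that grows by 60% triggers it through the relative one.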