Michael Dürig created OAK-5278:
----------------------------------
Summary: Improved compaction estimator
Key: OAK-5278
URL: https://issues.apache.org/jira/browse/OAK-5278
Project: Jackrabbit Oak
Issue Type: Improvement
Components: segment-tar
Reporter: Michael Dürig
Fix For: 1.8
OAK-4293 introduced a new approach for estimating whether we actually want to
run or skip a gc cycle. That approach is purely based on the absolute growth of
the repository's on-disk footprint.
I think this can be refined further: with the {{GCJournal}} we can
extrapolate the amount of garbage at a given point in time from the history
of previous gc cycles. E.g. let {{S_n}} be the size of the repository and
{{G_n}} the percentage of garbage right before the {{n}}-th gc cycle. We can
then linearly extrapolate the garbage {{G_n+1}} for the {{n+1}}-th gc cycle
along the repository sizes:
{code}
G_n+1 = G_n * (S_n+1 - S_n)/(S_n - S_n-1)
{code}
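As a rough illustration of the extrapolation, here is a minimal Python sketch. It assumes the sizes in the ratio are those recorded right before cycles {{n-1}}, {{n}} and {{n+1}}. The function name, its parameters and the handling of non-positive growth are hypothetical and not part of Oak or the {{GCJournal}} API.

```python
def extrapolate_garbage(prev_garbage, size_before_prev, size_prev, size_now):
    """Linearly extrapolate the garbage for the next gc cycle.

    prev_garbage     -- G_n, garbage measured right before the n-th cycle
    size_before_prev -- S_n-1, repository size before cycle n-1
    size_prev        -- S_n, repository size before cycle n
    size_now         -- S_n+1, current repository size
    """
    prev_growth = size_prev - size_before_prev  # S_n - S_n-1
    if prev_growth <= 0:
        # Assumption: without positive growth in the previous interval
        # the ratio is undefined, so no estimate is returned.
        return None
    curr_growth = size_now - size_prev  # S_n+1 - S_n
    return prev_garbage * curr_growth / prev_growth


# Example with made-up numbers: the repository grew by 50 units between the
# two previous cycles and by 25 units since, with 20% garbage last time.
estimate = extrapolate_garbage(20.0, 100, 150, 175)
print(estimate)
```

With these made-up inputs the growth since the last cycle is half the growth of the previous interval, so the extrapolated garbage is half of {{G_n}}.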