[ 
https://issues.apache.org/jira/browse/OAK-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249867#comment-15249867
 ] 

Michael Dürig commented on OAK-3362:
------------------------------------

Moved this to {{oak-segment-next}}. However, I am not sure whether we should
invest much here. I would actually propose removing the current estimation
approach entirely, as it is expensive with respect to IO, CPU and cache
coherence. If we want to keep an estimation step we need, IMO, to come up
with a cheap way of doing it (at least 2 orders of magnitude cheaper than
compaction).

> Estimate compaction based on diff to previous compacted head state
> ------------------------------------------------------------------
>
>                 Key: OAK-3362
>                 URL: https://issues.apache.org/jira/browse/OAK-3362
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: segment-next
>            Reporter: Alex Parvulescu
>            Assignee: Alex Parvulescu
>            Priority: Minor
>              Labels: compaction, gc
>             Fix For: 1.6
>
>
> Food for thought: try to base the compaction estimation on a diff between the 
> latest compacted state and the current state.
> Pros
> * estimation duration would be proportional to number of changes on the 
> current head state
> * using the size on disk as a reference, we could actually stop the 
> estimation early when we go over the gc threshold.
> * data collected during this diff could in theory be passed as input to the 
> compactor so it could focus on compacting a specific subtree
> Cons
> * need to keep a reference to a previous compacted state. post-startup and 
> pre-compaction this might prove difficult (except maybe if we only persist 
> the revision similar to what the async indexer is doing currently)
> * coming up with a threshold for running compaction might prove difficult
> * diff might be costly, but still cheaper than the current full diff
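
To make the proposal in the quoted description concrete, here is a minimal
sketch of a diff-based estimation with early termination. It does not use
Oak's API: the Node class, the DiffEstimator name and the thresholdBytes
parameter are all hypothetical, and treating the size of superseded nodes as
a proxy for reclaimable garbage is an assumption made only for illustration.
Unchanged subtrees are assumed to be shared by reference between the two
states, so they can be skipped without being traversed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DiffEstimator {

    /** Hypothetical stand-in for an immutable node: own size plus named children. */
    static final class Node {
        final long size;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(long size) { this.size = size; }

        /** Builder-style helper, used only while constructing a tree. */
        Node child(String name, Node node) {
            children.put(name, node);
            return this;
        }
    }

    /** Outcome of one estimation run. */
    static final class Estimate {
        long garbageBytes;          // bytes of superseded nodes seen so far
        boolean thresholdExceeded;  // true if the walk stopped early
    }

    /** Diffs the last compacted state against the current head state. */
    static Estimate estimate(Node compacted, Node head, long thresholdBytes) {
        Estimate e = new Estimate();
        diff(compacted, head, thresholdBytes, e);
        return e;
    }

    /** Returns false as soon as the threshold is crossed, which aborts the walk. */
    private static boolean diff(Node before, Node after, long threshold, Estimate e) {
        if (before == after) {
            return true;  // subtree shared by reference: unchanged, skip it
        }
        if (before == null) {
            return true;  // pure addition: nothing was superseded
        }
        // Assumption: a changed or removed node leaves its old version behind.
        e.garbageBytes += before.size;
        if (e.garbageBytes > threshold) {
            e.thresholdExceeded = true;
            return false;
        }
        for (Map.Entry<String, Node> c : before.children.entrySet()) {
            Node next = (after == null) ? null : after.children.get(c.getKey());
            if (!diff(c.getValue(), next, threshold, e)) {
                return false;
            }
        }
        return true;
    }
}
```

In this sketch the cost of the walk grows with the number of changed nodes
rather than with the size of the repository, and the walk ends as soon as the
threshold is crossed. Whether such a walk is cheap enough relative to
compaction itself is the open question raised in the comment above.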



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
