[ https://issues.apache.org/jira/browse/OAK-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Parvulescu updated OAK-3362:
---------------------------------
    Fix Version/s:     (was: 1.3.7)
                   1.3.8

> Estimate compaction based on diff to previous compacted head state
> ------------------------------------------------------------------
>
>                 Key: OAK-3362
>                 URL: https://issues.apache.org/jira/browse/OAK-3362
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: segmentmk
>            Reporter: Alex Parvulescu
>            Priority: Minor
>              Labels: compaction, gc
>             Fix For: 1.3.8
>
>
> Food for thought: try to base the compaction estimation on a diff between the
> latest compacted state and the current state.
> Pros
> * estimation duration would be proportional to number of changes on the
> current head state
> * using the size on disk as a reference, we could actually stop the
> estimation early when we go over the gc threshold.
> * data collected during this diff could in theory be passed as input to the
> compactor so it could focus on compacting a specific subtree
> Cons
> * need to keep a reference to a previous compacted state. post-startup and
> pre-compaction this might prove difficult (except maybe if we only persist
> the revision similar to what the async indexer is doing currently)
> * coming up with a threshold for running compaction might prove difficult
> * diff might be costly, but still cheaper than the current full diff



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
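
The estimation idea quoted in the issue description (diff the current head against the last compacted state, and stop early once the gc threshold is exceeded) could be sketched roughly as below. This is only an illustration under stated assumptions, not Oak code: the class `CompactionEstimateSketch`, the `Estimate` type, and the flat path-to-size maps are all hypothetical stand-ins for node states. A real tree diff would skip unchanged subtrees, which is where the "proportional to number of changes" benefit would come from; the flat maps here do not show that part.

```java
import java.util.Map;

// Hypothetical sketch: none of these names are Oak APIs.
final class CompactionEstimateSketch {

    /** Result of one estimation run. */
    static final class Estimate {
        final long changedBytes;     // bytes counted in changed entries before stopping
        final int entriesVisited;    // changed entries visited before stopping
        final boolean overThreshold; // true if the run stopped early at the threshold

        Estimate(long changedBytes, int entriesVisited, boolean overThreshold) {
            this.changedBytes = changedBytes;
            this.entriesVisited = entriesVisited;
            this.overThreshold = overThreshold;
        }
    }

    /**
     * Compares the last compacted state with the current head state.
     * Both maps go from a path to an assumed on-disk size in bytes.
     * Returns as soon as the accumulated size exceeds gcThresholdBytes.
     */
    static Estimate estimate(Map<String, Long> compacted,
                             Map<String, Long> head,
                             long gcThresholdBytes) {
        long changed = 0;
        int visited = 0;

        // Entries added or modified since the last compaction.
        for (Map.Entry<String, Long> e : head.entrySet()) {
            Long before = compacted.get(e.getKey());
            if (before == null || !before.equals(e.getValue())) {
                changed += e.getValue();
                visited++;
                if (changed > gcThresholdBytes) {
                    return new Estimate(changed, visited, true); // early stop
                }
            }
        }

        // Entries removed since the last compaction.
        for (Map.Entry<String, Long> e : compacted.entrySet()) {
            if (!head.containsKey(e.getKey())) {
                changed += e.getValue();
                visited++;
                if (changed > gcThresholdBytes) {
                    return new Estimate(changed, visited, true); // early stop
                }
            }
        }

        return new Estimate(changed, visited, false);
    }
}
```

With a low threshold the call returns after the first few changed entries, which is the early-stop behaviour listed under Pros. The paths visited before stopping could in principle be collected and handed to a compactor, as the third Pros item suggests, but that part is left out of this sketch.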