[ 
https://issues.apache.org/jira/browse/OAK-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278870#comment-14278870
 ] 

Michael Dürig commented on OAK-2408:
------------------------------------

All the following ideas need the ability to rewrite (parts of) the revision 
history. For this we need to be able to relink in-memory references to records 
(old RecordId to new RecordId). We also need finer-grained reference graphs 
(i.e. on the record level rather than on the segment level, as we currently have).
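
To make the relinking idea concrete, here is a minimal Java sketch of an 
old-to-new record id table. It is an illustration only: {{RecId}} and 
{{RelinkTable}} are hypothetical names, not existing Oak classes, and the 
sketch assumes a record id is simply a segment id plus an offset.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a record id (segment id + offset); not Oak's RecordId. */
final class RecId {
    final String segmentId;
    final int offset;

    RecId(String segmentId, int offset) {
        this.segmentId = segmentId;
        this.offset = offset;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RecId)) {
            return false;
        }
        RecId other = (RecId) o;
        return segmentId.equals(other.segmentId) && offset == other.offset;
    }

    @Override
    public int hashCode() {
        return 31 * segmentId.hashCode() + offset;
    }

    @Override
    public String toString() {
        return segmentId + ":" + offset;
    }
}

/** Hypothetical table of old -> new record ids, filled while records are rewritten. */
final class RelinkTable {
    private final Map<RecId, RecId> moved = new HashMap<>();

    /** Remembers that the record at oldId was rewritten to newId. */
    void recordMove(RecId oldId, RecId newId) {
        if (oldId.equals(newId)) {
            throw new IllegalArgumentException("record cannot move onto itself: " + oldId);
        }
        moved.put(oldId, newId);
    }

    /** Follows the chain of moves; ids that were never moved resolve to themselves. */
    RecId resolve(RecId id) {
        RecId current = id;
        RecId next;
        int steps = 0;
        while ((next = moved.get(current)) != null) {
            // A chain without cycles can never be longer than the table itself.
            if (++steps > moved.size()) {
                throw new IllegalStateException("cycle in relink table at " + current);
            }
            current = next;
        }
        return current;
    }
}
```

A record rewritten twice (first from {{s1:8}} to {{s2:16}}, later from 
{{s2:16}} to {{s3:24}}) would resolve from either older id to {{s3:24}}. 
Whether such a table lives in memory only or is persisted is left open here.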

Given this we could try to:
* rewrite all still-referenced records in old segments (those that time out in 
our current approach).
* rewrite references to old root node states. Probably rebase them in reverse 
order on top of the new compacted head node state.
* squash the history tail by rewriting it into a new compacted node state. 
* follow up on the approach proposed by [~chetanm] on OAK-2045, relinking to 
the current head state.

To get us there we need to refactor the TarMk code such that:
* {{Record}} instances could be relinked to a new underlying persisted record. 
This could be achieved by making the various {{Record}} implementations (e.g. 
{{SegmentPropertyState}}) 'has-a' record instead of 'is-a' record. 
* Segments could be removed from disk even though some records are still 
referenced. Such records would simply be rewritten and the original segment 
would be replaced with a 'proxy segment' that just forwards to the new 
segment/records. 
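
The two refactorings above could look roughly like the following Java sketch. 
All names ({{PersistedRecord}}, {{PropertyView}}, {{ProxySegment}}) are 
hypothetical and chosen for illustration; this is not the existing TarMk code, 
and it assumes a record can be modelled as a segment id, an offset and a value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical persisted record: where it lives and what it contains. */
final class PersistedRecord {
    final String segmentId;
    final int offset;
    final String value;

    PersistedRecord(String segmentId, int offset, String value) {
        this.segmentId = segmentId;
        this.offset = offset;
        this.value = value;
    }
}

/** 'has-a' style: the in-memory object holds a swappable reference to its record. */
final class PropertyView {
    private final String name;
    private final AtomicReference<PersistedRecord> record;

    PropertyView(String name, PersistedRecord record) {
        this.name = name;
        this.record = new AtomicReference<>(record);
    }

    String getName() {
        return name;
    }

    String getValue() {
        return record.get().value;
    }

    String getSegmentId() {
        return record.get().segmentId;
    }

    /** Points this view at a rewritten copy of the underlying record. */
    void relink(PersistedRecord rewritten) {
        record.set(rewritten);
    }
}

/** Hypothetical proxy left in place of a removed segment; forwards old offsets. */
final class ProxySegment {
    private final Map<Integer, PersistedRecord> forwards = new HashMap<>();

    /** Registers the rewritten copy of the record formerly stored at oldOffset. */
    void forward(int oldOffset, PersistedRecord rewritten) {
        forwards.put(oldOffset, rewritten);
    }

    /** Returns the rewritten record for an offset in the removed segment. */
    PersistedRecord read(int oldOffset) {
        PersistedRecord target = forwards.get(oldOffset);
        if (target == null) {
            throw new IllegalArgumentException("no live record at offset " + oldOffset);
        }
        return target;
    }
}
```

With an 'is-a' design the view and its record location are the same object, 
so it cannot be repointed; holding the record behind a reference is what 
makes {{relink}} possible without the caller noticing.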



> Investigate ways to make revision gc more precise 
> --------------------------------------------------
>
>                 Key: OAK-2408
>                 URL: https://issues.apache.org/jira/browse/OAK-2408
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: segmentmk
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: gc
>
> Current approaches to revision garbage collection tend to be too conservative 
> (too little space reclaimed, e.g. OAK-2045) or too aggressive (removing 
> segments still being used, e.g. OAK-2384). 
> This issue is to explore ways to make revision gc on TarMk more precise. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
