[ https://issues.apache.org/jira/browse/OAK-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171763#comment-15171763 ]
Marcel Reutegger commented on OAK-3287:
---------------------------------------
I have been thinking further about how to implement the remaining GC functionality.
The core problem is how we can clean up the commit markers on the commit root
document. Related issues in this epic are OAK-3716, OAK-3711 and OAK-3712. So
far we discussed two basic approaches:
- Rewrite the changes on documents so that they no longer depend on the commit
root. Once this is done, the commit markers on the commit root can be
cleaned up.
- Ensure there are no uncommitted changes present up to a RevisionVector and
assume all revisions seen from that vector are committed.
There are pros and cons for each of the two approaches. Rewriting the changes
is a simple concept and does not require changing the existing model. On the
other hand, this approach would probably also require us to rewrite split
documents, which we currently consider immutable. It also means each change
will be written twice, doubling the write operations on the DocumentStore.
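
As a rough illustration of the rewrite approach and its write cost, here is a
minimal Java sketch. It is not Oak code: the Doc class, its two collections, and
the methods commit, rewrite, removeRootMarker and isCommitted are hypothetical
simplifications. Split documents and branch commits are ignored, and copying
the commit marker onto the changed document is only one possible way to read
"rewriting a change".

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the rewrite approach; not the Oak API. */
final class RewriteSketch {

    /** A document reduced to the two pieces of state relevant here. */
    static final class Doc {
        // revision -> id of the commit root document holding the commit marker
        final Map<String, String> commitRootRefs = new HashMap<>();
        // revisions for which this document itself holds a commit marker
        final Set<String> commitMarkers = new HashSet<>();
    }

    private final Map<String, Doc> store = new HashMap<>();
    private int writes = 0;

    private Doc doc(String id) {
        return store.computeIfAbsent(id, k -> new Doc());
    }

    /** Initial commit: dependents point to the commit root, which gets the marker. */
    void commit(String revision, String rootId, String... dependentIds) {
        for (String id : dependentIds) {
            doc(id).commitRootRefs.put(revision, rootId);
            writes++;
        }
        doc(rootId).commitMarkers.add(revision);
        writes++;
    }

    /** Rewrite: each dependent change gets its own marker (a second write per change). */
    void rewrite(String revision) {
        for (Doc d : store.values()) {
            String rootId = d.commitRootRefs.get(revision);
            if (rootId == null) {
                continue;
            }
            Doc root = store.get(rootId);
            if (root != null && root.commitMarkers.contains(revision)) {
                d.commitMarkers.add(revision);
                d.commitRootRefs.remove(revision);
                writes++;
            }
        }
    }

    /** Cleanup of the marker on the commit root; the root's own change is ignored here. */
    void removeRootMarker(String rootId, String revision) {
        Doc root = store.get(rootId);
        if (root != null && root.commitMarkers.remove(revision)) {
            writes++;
        }
    }

    boolean isCommitted(String docId, String revision) {
        Doc d = store.get(docId);
        if (d == null) {
            return false;
        }
        if (d.commitMarkers.contains(revision)) {
            return true; // marker is local, no lookup on the commit root needed
        }
        String rootId = d.commitRootRefs.get(revision);
        Doc root = rootId == null ? null : store.get(rootId);
        return root != null && root.commitMarkers.contains(revision);
    }

    int writes() {
        return writes;
    }
}
```

In this toy model a commit that touches two documents besides the commit root
costs three writes, and the rewrite adds two more, one per dependent change.
Only after the rewrite can the marker on the commit root be removed without
the dependent changes appearing uncommitted.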
Cleaning up uncommitted changes and maintaining a safe RevisionVector requires
fewer write operations, but we'd need to change the model slightly. However,
with this approach we'd have to rewrite branch commits, because the merge
revision is only available on the commit root document.
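
The idea behind the safe RevisionVector can be sketched in a few lines of Java.
This is only an illustration under stated assumptions, not Oak code: the class
SafeVectorSketch and its methods are hypothetical, a revision is reduced to a
(clusterId, timestamp) pair, the vector is modelled as a map from cluster node
id to timestamp, and branch commits are not modelled at all. The sketch also
assumes that GC has already removed every uncommitted change up to the
timestamp before the vector is advanced.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical, simplified model of the safe vector approach; not the Oak API. */
final class SafeVectorSketch {

    // clusterId -> newest timestamp up to which no uncommitted changes remain
    private final Map<Integer, Long> safeVector = new HashMap<>();

    // clusterId -> timestamps of revisions that still have an explicit commit marker
    private final Map<Integer, TreeSet<Long>> commitMarkers = new HashMap<>();

    void markCommitted(int clusterId, long timestamp) {
        commitMarkers.computeIfAbsent(clusterId, k -> new TreeSet<>()).add(timestamp);
    }

    /**
     * Advances the safe vector for one cluster node. The caller must have
     * removed all uncommitted changes up to the timestamp beforehand.
     */
    void advance(int clusterId, long timestamp) {
        safeVector.merge(clusterId, timestamp, Long::max);
        TreeSet<Long> markers = commitMarkers.get(clusterId);
        if (markers != null) {
            // markers covered by the vector are redundant and can be dropped
            markers.headSet(timestamp, true).clear();
        }
    }

    boolean isCommitted(int clusterId, long timestamp) {
        Long safe = safeVector.get(clusterId);
        if (safe != null && timestamp <= safe) {
            return true; // covered by the vector: assumed committed
        }
        TreeSet<Long> markers = commitMarkers.get(clusterId);
        return markers != null && markers.contains(timestamp);
    }

    int markerCount() {
        int count = 0;
        for (TreeSet<Long> markers : commitMarkers.values()) {
            count += markers.size();
        }
        return count;
    }
}
```

The only extra write in this model is the update of the vector itself. Every
revision at or below the vector is treated as committed without consulting a
commit marker, which is why uncommitted changes must be cleaned up first.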
My preference currently is to implement the second approach, mainly because it
requires fewer writes and is IMO easier to introduce on existing data.
> DocumentMK revision GC
> ----------------------
>
> Key: OAK-3287
> URL: https://issues.apache.org/jira/browse/OAK-3287
> Project: Jackrabbit Oak
> Issue Type: Epic
> Components: documentmk, mongomk, rdbmk
> Reporter: Michael Marth
> Assignee: Marcel Reutegger
> Fix For: 1.6
>
>
> Collector for various tasks on implementing DocMK revision GC