[
https://issues.apache.org/jira/browse/OAK-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532559#comment-14532559
]
Michael Dürig commented on OAK-2849:
------------------------------------
While we focused on banning mixed segments earlier (i.e. OAK-2192) we still
have the situation that segments generated after a compaction might start
referencing segments from before the compaction. {{SegmentWriter.writeNode()}}
tries to avoid that by checking whether a node state or its base state has been
compacted and uncompacts it if so. However, it turns out that there are other
situations where such cross generation references can be introduced.
https://github.com/mduerig/jackrabbit-oak/commit/e6188c2732dba51648a01a35706a3e4597bd5cc6
adds a generation number to the {{SegmentId}}, which should help in spotting
cross generation references. Even generation numbers correspond to segments
created through normal operation. Odd ones correspond to segments generated by
compaction.
Note that this mechanism is not entirely exact as it relies on the segment ids
to stay in memory once generated. To make it more exact we would need to attach
the generation number to the segment and persist it along with the segment
(e.g. in its header). The mechanism however serves its purpose for now, which
is identifying who introduces cross segment references.
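The even/odd convention described above can be sketched as follows. This is a
minimal illustration, not the code from the linked commit: the class
{{GenSegmentId}}, its fields and the helper {{isCrossGeneration}} are
hypothetical names, and the sketch assumes that a reference counts as cross
generation whenever the two generation numbers differ. The actual commit may
apply a different rule.

```java
// Hypothetical sketch; names are illustrative and not taken from Oak.
public class GenerationSketch {

    /** Minimal stand-in for a segment id that carries a generation number. */
    static final class GenSegmentId {
        final long msb;
        final long lsb;
        final int generation;

        GenSegmentId(long msb, long lsb, int generation) {
            this.msb = msb;
            this.lsb = lsb;
            this.generation = generation;
        }

        /** Odd generations denote segments generated by compaction. */
        boolean isCompacted() {
            return (generation & 1) == 1;
        }
    }

    /**
     * Assumption: a reference is "cross generation" when the referencing
     * segment and the referenced segment carry different generation numbers.
     */
    static boolean isCrossGeneration(GenSegmentId from, GenSegmentId to) {
        return from.generation != to.generation;
    }

    public static void main(String[] args) {
        GenSegmentId beforeCompaction = new GenSegmentId(1L, 1L, 0);
        GenSegmentId compacted = new GenSegmentId(2L, 2L, 1);
        GenSegmentId afterCompaction = new GenSegmentId(3L, 3L, 2);

        System.out.println(compacted.isCompacted());
        System.out.println(isCrossGeneration(afterCompaction, beforeCompaction));
        System.out.println(isCrossGeneration(afterCompaction, afterCompaction));
    }
}
```

Because the generation lives only on the in-memory id object in this sketch, it
shares the limitation noted above: the number is lost once the id is no longer
held in memory.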
https://github.com/mduerig/jackrabbit-oak/commit/e201203e92d5968b771b082e860f9fbe3dfa8a51
logs cross segment references at the point the segment reference is written to
a segment. It also logs the stack trace, which is helpful for determining the
root cause. Furthermore it fixes (POC style) the places that have been
identified through this method.
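A rough sketch of the logging idea, for illustration only. The method
{{checkReference}} and its parameters are hypothetical and do not come from the
linked commit, and {{java.util.logging}} is used here only to keep the example
free of dependencies. As above, it assumes that differing generation numbers
indicate a cross generation reference.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; names are illustrative and not taken from Oak.
public class CrossReferenceLogSketch {

    private static final Logger LOG =
            Logger.getLogger(CrossReferenceLogSketch.class.getName());

    /**
     * Hypothetical hook, assumed to be called at the point where a reference
     * from one segment to another is written. Returns true if a warning was
     * logged.
     */
    static boolean checkReference(String fromId, int fromGeneration,
                                  String toId, int toGeneration) {
        if (fromGeneration == toGeneration) {
            return false;
        }
        // Passing a Throwable makes the logger include the current stack
        // trace, which shows who introduced the reference.
        LOG.log(Level.WARNING,
                "Cross generation reference from " + fromId
                        + " (generation " + fromGeneration + ") to " + toId
                        + " (generation " + toGeneration + ")",
                new Exception("Reference written here"));
        return true;
    }

    public static void main(String[] args) {
        checkReference("segment-a", 2, "segment-b", 0); // logs a warning
        checkReference("segment-a", 2, "segment-c", 2); // logs nothing
    }
}
```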
> Improve revision gc on SegmentMK
> --------------------------------
>
> Key: OAK-2849
> URL: https://issues.apache.org/jira/browse/OAK-2849
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: segmentmk
> Reporter: Michael Dürig
> Assignee: Michael Dürig
> Labels: compaction, gc
> Fix For: 1.3.0
>
>
> This is a container issue for the ongoing effort to improve revision gc of
> the SegmentMK.
> I'm exploring
> * ways to make the reference graph as exact as possible and necessary: it
> should not contain segments that are not referenceable any more but must
> contain all segments that are referenceable.
> * ways to segregate the reference graph reducing dependencies between certain
> sets of segments as much as possible.
> * Reducing the number of in memory references and their impact on gc as much
> as possible.
> Work in progress is in my private [Github
> fork|https://github.com/mduerig/jackrabbit-oak]. As soon as something is
> promising enough to make it into Oak, I spawn off an issue and make it a
> subtask of this task.