[ 
https://issues.apache.org/jira/browse/OAK-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532559#comment-14532559
 ] 

Michael Dürig commented on OAK-2849:
------------------------------------

While we focused on banning mixed segments earlier (i.e. OAK-2192), we still 
have the situation that segments generated after a compaction might start 
referencing segments from before the compaction. {{SegmentWriter.writeNode()}} 
tries to avoid that by checking whether a node state or its base state has been 
compacted and uncompacts it if so. However, it turns out that there are other 
situations where such cross generation references can be introduced. 

https://github.com/mduerig/jackrabbit-oak/commit/e6188c2732dba51648a01a35706a3e4597bd5cc6
 adds a generation number to the {{SegmentId}}, which should help in spotting 
cross generation references. Even generation numbers correspond to segments 
created through normal operation, odd ones to segments generated by compaction. 
Note that this mechanism is not entirely exact, as it relies on the segment ids 
staying in memory once generated. To make it more exact we would need to attach 
the generation number to the segment and persist it along with the segment 
(e.g. in its header). The mechanism nevertheless serves its purpose for now, 
which is identifying who introduces cross segment references. 
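
To illustrate the even/odd convention, here is a minimal sketch. It is not the 
code from the commit: the class {{TaggedSegmentId}} and its methods are 
hypothetical, and the rule that a reference is suspicious when its target has a 
lower generation than the referencing segment is an assumption made for this 
example. 

```java
import java.util.UUID;

// Hypothetical stand-in for a segment id that carries a generation number.
// Class and method names are illustrative only, not Oak's actual API.
final class TaggedSegmentId {
    final UUID uuid;
    final int generation;

    TaggedSegmentId(UUID uuid, int generation) {
        this.uuid = uuid;
        this.generation = generation;
    }

    // Odd generations mark segments generated by compaction,
    // even generations mark segments created through normal operation.
    boolean isFromCompaction() {
        return (generation & 1) == 1;
    }

    // Assumed rule for this sketch: a reference is flagged when its target
    // belongs to an older generation than the referencing segment.
    static boolean isCrossGeneration(TaggedSegmentId from, TaggedSegmentId to) {
        return to.generation < from.generation;
    }
}
```

Because the generation lives only on the in-memory id object in this sketch, it 
is lost as soon as the id is garbage collected and recreated, which is the 
inexactness described above. 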

https://github.com/mduerig/jackrabbit-oak/commit/e201203e92d5968b771b082e860f9fbe3dfa8a51
 logs cross segment references at the point the segment reference is written to 
a segment. It also logs the stack trace, which is helpful for determining the 
root cause. Furthermore, it fixes (POC style) the places that have been 
identified through this method.
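
The logging approach could look roughly like the following sketch. It is not 
the code from the commit: {{ReferenceAudit}} and {{checkReference}} are 
hypothetical names, it uses {{java.util.logging}} only to stay self-contained, 
and it assumes that a reference to a lower generation is the case worth 
reporting. 

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, called at the point where a segment reference
// would be written to a segment. Not part of Oak.
final class ReferenceAudit {
    private static final Logger LOG =
            Logger.getLogger(ReferenceAudit.class.getName());

    // Returns true if the reference was reported.
    static boolean checkReference(int fromGeneration, int toGeneration) {
        if (toGeneration >= fromGeneration) {
            return false; // same or newer generation: nothing to report
        }
        // Passing a Throwable makes the logger print the current stack
        // trace, which shows who introduced the reference.
        LOG.log(Level.WARNING,
                "Reference from generation " + fromGeneration
                        + " to generation " + toGeneration,
                new Exception("reference written here"));
        return true;
    }
}
```

The exception is never thrown; it only serves to capture the call stack at the 
time the reference is written. 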

> Improve revision gc on SegmentMK
> --------------------------------
>
>                 Key: OAK-2849
>                 URL: https://issues.apache.org/jira/browse/OAK-2849
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: segmentmk
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: compaction, gc
>             Fix For: 1.3.0
>
>
> This is a container issue for the ongoing effort to improve revision gc of 
> the SegmentMK. 
> I'm exploring 
> * ways to make the reference graph as exact as possible and necessary: it 
> should not contain segments that are not referenceable any more but must 
> contain all segments that are referenceable. 
> * ways to segregate the reference graph, reducing dependencies between certain 
> sets of segments as much as possible. 
> * Reducing the number of in memory references and their impact on gc as much 
> as possible.
> Work in progress is in my private [Github 
> fork|https://github.com/mduerig/jackrabbit-oak]. As soon as something is 
> promising enough to make it into Oak, I spawn off an issue and make it a 
> subtask of this task. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
