[ 
https://issues.apache.org/jira/browse/OAK-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090206#comment-15090206
 ] 

Michael Dürig commented on OAK-3348:
------------------------------------

At https://github.com/mduerig/jackrabbit-oak/commits/OAK-3348 I started 
implementing a POC of the approach above for 2):

* Prevent back references by flushing segment node builders into two sets of 
segments: free and merged. A segment is free if it has been created by a 
builder and only references free segments. Otherwise the segment is merged. 

* When rebasing a builder during merge:
** Link to records in free segments and mark those segments as merged.
** Clone all records in cross gc merged segments before linking to them. 
(Ideally there would be no such records, i.e. all references would point 
into free segments.) Note: if this builder contains references to records 
in segments of other builders, those segments would also become merged, along 
with all segments referencing them. 
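
The free/merged bookkeeping described above can be sketched roughly as follows. This is only an illustrative model and is not taken from the POC branch: the names SegmentSets, Segment, markMerged and linkDuringRebase are hypothetical and not Oak APIs, "cross gc" is assumed to mean that a segment's gc generation is older than the current one, and cloning is reduced to creating a fresh segment in the current generation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of the free/merged bookkeeping; none of these names are Oak APIs.
final class SegmentSets {
    enum State { FREE, MERGED }

    static final class Segment {
        final String id;
        final int gcGeneration; // assumed: the gc generation the segment was written in
        State state;
        // Segments that hold a reference to this one.
        final Set<Segment> referrers = new LinkedHashSet<>();

        Segment(String id, int gcGeneration, State state) {
            this.id = id;
            this.gcGeneration = gcGeneration;
            this.state = state;
        }

        // Record that this segment references 'target'. A segment that
        // references a merged segment can no longer be free.
        void addReference(Segment target) {
            target.referrers.add(this);
            if (target.state == State.MERGED) {
                markMerged(this);
            }
        }
    }

    // Mark a segment as merged, along with all segments that reference it
    // directly or transitively.
    static void markMerged(Segment start) {
        Deque<Segment> pending = new ArrayDeque<>();
        Set<Segment> seen = new HashSet<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            Segment s = pending.pop();
            if (!seen.add(s)) {
                continue; // already handled
            }
            s.state = State.MERGED;
            pending.addAll(s.referrers);
        }
    }

    // Decide what a builder links to while it is rebased during merge.
    static Segment linkDuringRebase(Segment target, int currentGeneration) {
        if (target.state == State.FREE) {
            // Link to the free segment and mark it as merged.
            markMerged(target);
            return target;
        }
        if (target.gcGeneration < currentGeneration) {
            // Cross gc merged segment: clone before linking. A real clone
            // would copy the records; here it is just a new segment.
            return new Segment(target.id + "-clone", currentGeneration, State.MERGED);
        }
        // Merged segment of the current generation: link directly.
        return target;
    }

    public static void main(String[] args) {
        Segment a = new Segment("a", 1, State.FREE);
        Segment b = new Segment("b", 1, State.FREE);
        b.addReference(a); // b belongs to another builder and points at a
        Segment linked = linkDuringRebase(a, 1);
        System.out.println(linked.id + " " + a.state + " " + b.state);
    }
}
```

In this model, merging a free segment also flips every segment that references it, which mirrors the note about segments of other builders becoming merged as well.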

I structured the commits such that it should be relatively easy to follow. See 
FIXME tags for what is still missing and what needs cleaning up.

cc [~frm], [~alex.parvulescu]



> Cross gc sessions might introduce references to pre-compacted segments
> ----------------------------------------------------------------------
>
>                 Key: OAK-3348
>                 URL: https://issues.apache.org/jira/browse/OAK-3348
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segmentmk
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: candidate_oak_1_0, candidate_oak_1_2, cleanup, 
> compaction, gc
>             Fix For: 1.4
>
>         Attachments: OAK-3348-1.patch, OAK-3348-2.patch, OAK-3348.patch, 
> cross-gc-refs.pdf, image.png
>
>
> I suspect that certain write operations during compaction can cause 
> references from compacted segments to pre-compacted ones. This would 
> effectively prevent the pre-compacted segments from getting evicted in 
> subsequent cleanup phases. 
> The scenario is as follows:
> * A session is opened and a lot of content is written to it such that the 
> update limit is exceeded. This causes the changes to be written to disk. 
> * Revision gc runs causing a new, compacted root node state to be written to 
> disk.
> * The session saves its changes. This causes rebasing of its changes onto the 
> current root (the compacted one). At this point any node that has been added 
> will be added again in the sub-tree rooted at the current root. Such nodes 
> however might have been written to disk *before* revision gc ran and might 
> thus be contained in pre-compacted segments. As I suspect the node-add 
> operation in the rebasing process *not* to create a deep copy of such nodes 
> but to rather create a *reference* to them, a reference to a pre-compacted 
> segment is introduced here. 
> Going forward we need to validate the above hypothesis, assess its impact 
> and, if necessary, come up with a solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
