[ 
https://issues.apache.org/jira/browse/OAK-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782201#comment-17782201
 ] 

Stefan Egli edited comment on OAK-10526 at 11/2/23 3:49 PM:
------------------------------------------------------------

As a fix for this, I'd suggest setting _sdMaxRevTime to the timestamp at which 
the split document is created. That would violate the meaning of that property, 
but it would ensure the split document gets deleted only a certain time (e.g. 
24h) after any possible checkpoint or head revision existing at that time. 
Hence it would fix this. (Renaming this property would break backwards 
compatibility and seems to be an unnecessarily costly change.)
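
To illustrate the effect of the suggestion, here is a minimal sketch in Java. 
It is not Oak code: the class, the method names and the way the GC cutoff is 
computed (the older of "now minus 24h" and the oldest checkpoint) are 
assumptions made for this example only.

```java
class SplitDocFixSketch {

    static final long HOUR = 3_600_000L;

    /** Hypothetical: value stored as _sdMaxRevTime under the current or the proposed rule. */
    static long sdMaxRevTime(long newestRevMillis, long splitCreatedMillis, boolean useCreationTime) {
        return useCreationTime ? splitCreatedMillis : newestRevMillis;
    }

    /**
     * Assumed GC rule: a split doc is a candidate when its _sdMaxRevTime is older
     * than both the max-age cutoff and the oldest existing checkpoint.
     * Pass Long.MAX_VALUE as oldestCheckpointMillis when no checkpoint exists.
     */
    static boolean isGcCandidate(long sdMaxRevTimeMillis, long nowMillis,
                                 long maxAgeMillis, long oldestCheckpointMillis) {
        long cutoff = Math.min(nowMillis - maxAgeMillis, oldestCheckpointMillis);
        return sdMaxRevTimeMillis < cutoff;
    }

    public static void main(String[] args) {
        long newestRev = 10 * HOUR;     // newest revision inside the split doc
        long checkpoint = 90 * HOUR;    // checkpoint that still reads that revision
        long splitCreated = 95 * HOUR;  // split doc is created after the checkpoint
        long now = 100 * HOUR;

        long current = sdMaxRevTime(newestRev, splitCreated, false);
        long proposed = sdMaxRevTime(newestRev, splitCreated, true);

        // Current rule: the split doc looks old enough although it is still referenced.
        System.out.println(isGcCandidate(current, now, 24 * HOUR, checkpoint));   // true
        // Proposed rule: protected while the checkpoint exists and for 24h after creation.
        System.out.println(isGcCandidate(proposed, now, 24 * HOUR, checkpoint));  // false
        // Much later, with the checkpoint released, it becomes collectable again.
        System.out.println(isGcCandidate(proposed, 200 * HOUR, 24 * HOUR, Long.MAX_VALUE)); // true
    }
}
```

With the creation timestamp, any checkpoint or head revision that existed when 
the split document was created is never newer than _sdMaxRevTime, so the 
existing comparison in GC would no longer remove the document too early.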


> split doc can contain still referenced revisions without _sdMaxRevTime 
> indicating so
> ------------------------------------------------------------------------------------
>
>                 Key: OAK-10526
>                 URL: https://issues.apache.org/jira/browse/OAK-10526
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: documentmk
>    Affects Versions: 1.58.0
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>            Priority: Major
>
> When a document grows too large, part of it is split off into previous 
> documents. These, also called split documents, are marked with _sdMaxRevTime, 
> reflecting the newest (max) revision timestamp the document contains. GC can 
> later delete split documents whose _sdMaxRevTime is older than 24h or any 
> existing checkpoint. This is based on the assumption that _sdMaxRevTime can be 
> compared to "older than 24h or any existing checkpoint", while _sdMaxRevTime 
> only indicates the newest revision contained within. There can thus be a 
> situation where a split document contains a revision that is still referenced 
> by a current (not older than 24h) head revision or a checkpoint, but 
> _sdMaxRevTime is old enough for GC to remove that split doc.
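
The situation described above can be reproduced with a small, simplified model. 
This is only a sketch, not Oak's implementation: the class and method names are 
hypothetical, and reading "older than 24h or any existing checkpoint" as "older 
than both the 24h cutoff and the oldest checkpoint" is an assumption.

```java
import java.util.concurrent.TimeUnit;

class SplitDocGcSketch {

    /**
     * Assumed GC rule: compare _sdMaxRevTime (newest revision in the split doc)
     * against the older of "now - maxAge" and the oldest existing checkpoint.
     */
    static boolean isGcCandidate(long sdMaxRevTimeMillis, long nowMillis,
                                 long maxAgeMillis, long oldestCheckpointMillis) {
        long cutoff = Math.min(nowMillis - maxAgeMillis, oldestCheckpointMillis);
        return sdMaxRevTimeMillis < cutoff;
    }

    public static void main(String[] args) {
        long hour = TimeUnit.HOURS.toMillis(1);
        long now = 100 * hour;

        // Assume the newest revision in the split doc was written at t=10h ...
        long sdMaxRevTime = 10 * hour;
        // ... and a checkpoint taken at t=90h still resolves to that revision.
        long checkpoint = 90 * hour;

        // The timestamps alone say "old enough", although the revision is still referenced.
        System.out.println("GC candidate: " + isGcCandidate(sdMaxRevTime, now, 24 * hour, checkpoint));
        // prints "GC candidate: true"
    }
}
```

The comparison only sees when the newest contained revision was written, not 
until when that revision remains visible from a checkpoint or head revision.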



--
This message was sent by Atlassian Jira
(v8.20.10#820010)