[ 
https://issues.apache.org/jira/browse/OAK-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036140#comment-15036140
 ] 

Vikas Saurabh edited comment on OAK-3710 at 12/2/15 5:36 PM:
-------------------------------------------------------------

[~mreutegg], I discussed this further with [~chetanm], and it seems that we 
might be able to reduce the number of document writes during 'rewrite commit 
entries (step 3.2)' if we introduce some sort of early document rewrite 
attached to lastRev updates. Chetan had concerns about slowing down the 
background write, so we might want to do this in a separate thread with a 
queue of docs, similar to pending-last-revs.
The idea is that a document whose last-rev is to be updated would also be 
scanned for revisions from the same cluster node that are older than the 
last-rev being updated, i.e. for a last-rev update of r-0-2=r-X-2, we can 
clean properties with revisions r-Y-2 where Y<X, and also rewrite _commitRoot 
and _revision.
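
A rough sketch of the candidate scan described above, written in Python for 
brevity rather than as Oak code. The document layout (a dict mapping each 
property name to a revision-keyed map), the hex revision layout 
'r<timestamp>-<counter>-<clusterId>', and the names parse_revision and 
cleanup_candidates are assumptions made for illustration, not Oak APIs. The 
sketch only identifies candidate entries; what 'cleaning' then does with them 
is left open.

```python
from typing import NamedTuple


class Revision(NamedTuple):
    timestamp: int
    counter: int
    cluster_id: int


def parse_revision(text: str) -> Revision:
    """Parse 'r<timestamp>-<counter>-<clusterId>' with hex fields (assumed layout)."""
    timestamp, counter, cluster_id = text[1:].split("-")
    return Revision(int(timestamp, 16), int(counter, 16), int(cluster_id, 16))


def cleanup_candidates(doc: dict, last_rev: str) -> dict:
    """Return {property: [revision keys]} for entries written by the same
    cluster node as last_rev and strictly older than it (the Y<X condition)."""
    limit = parse_revision(last_rev)
    candidates = {}
    for name, value in doc.items():
        # Skip scalar fields (_id, _modified) and the last-rev map itself.
        if name == "_lastRev" or not isinstance(value, dict):
            continue
        older = []
        for key in value:
            rev = parse_revision(key)
            same_node = rev.cluster_id == limit.cluster_id
            # Compare (timestamp, counter) only; cluster id is checked above.
            if same_node and rev[:2] < limit[:2]:
                older.append(key)
        if older:
            candidates[name] = sorted(older, key=parse_revision)
    return candidates


# Hypothetical document: cluster node 2 gets its last-rev moved to r20-0-2.
doc = {
    "_id": "1:/foo",
    "_modified": 1449077760,
    "_lastRev": {"r0-0-2": "r10-0-2"},
    "prop": {"r10-0-2": "a", "r18-0-1": "c", "r20-0-2": "b", "r30-0-2": "d"},
    "_commitRoot": {"r10-0-2": "0"},
}
print(cleanup_candidates(doc, "r20-0-2"))
```

Entries from other cluster nodes (r18-0-1) and entries at or after the new 
last-rev (r20-0-2, r30-0-2) are left alone, so only r10-0-2 is reported for 
both prop and _commitRoot.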

This won't remove the need to scan all documents (under the correct _modified 
condition), but it should reduce the number of documents that need to be 
updated. I'd open a new issue if this makes sense (this might be similar to 
OAK-3716, btw, except that waiting for candidates to split would leave fewer 
documents that get _commitRoot moved to _revision).



> Continuous revision GC
> ----------------------
>
>                 Key: OAK-3710
>                 URL: https://issues.apache.org/jira/browse/OAK-3710
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: core, documentmk
>            Reporter: Marcel Reutegger
>
> Implement continuous revision GC cleaning up documents older than a given 
> threshold (e.g. one day). This issue is related to OAK-3070 where each GC run 
> starts where the last one finished.
> This will avoid peak load on the system as we see it right now, when GC is 
> triggered once a day.



