[ 
https://issues.apache.org/jira/browse/OAK-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15505862#comment-15505862
 ] 

Chetan Mehrotra edited comment on OAK-4826 at 9/20/16 7:26 AM:
---------------------------------------------------------------

Makes sense. So we should implement this logic in AsyncIndexUpdate. The only 
problem is that it does not have access to any API for listing checkpoints, so 
we would probably need to introduce an API like 
{{org.apache.jackrabbit.oak.checkpoint.Checkpoints}} (currently in oak-run) and 
use it there.

Alternatively, we could introduce this method on the NodeStore API itself, as 
it already has many other checkpoint-related methods.





> Auto removal of orphaned checkpoints
> ------------------------------------
>
>                 Key: OAK-4826
>                 URL: https://issues.apache.org/jira/browse/OAK-4826
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>            Reporter: Chetan Mehrotra
>             Fix For: 1.6
>
>
> Currently, if a running system has orphaned checkpoints, they prevent 
> revision GC (compaction, for segment) from being effective. 
> So far the practice has been to clean them up manually with the 
> {{oak-run checkpoints rm-unreferenced}} command. This was left manual because 
> it was not possible to determine whether a given checkpoint is in use. 
> rm-unreferenced works on the assumption that checkpoints are only created by 
> AsyncIndexUpdate, and hence it can check whether a checkpoint is in use by 
> cross-checking with the {{:async}} state. Doing this automatically is risky, 
> as the {{checkpoint}} API can be used by any module.
> With OAK-2314 we also record some metadata like {{creator}} and {{name}}, 
> which can be used for auto cleanup. For example, in one running system the 
> following checkpoints are listed:
> {noformat}
> Mon Sep 19 18:02:09 EDT 2016  Sun Jun 16 18:02:09 EDT 2019    
> r15744787d0a-1-1        
>  
> creator=AsyncIndexUpdate
> name=fulltext-async
> thread=sling-default-4070-Registered Service.653
>  
> Mon Sep 19 18:02:09 EDT 2016  Sun Jun 16 18:02:09 EDT 2019    
> r15744787d0a-0-1        
>  
> creator=AsyncIndexUpdate
> name=async
> thread=sling-default-4072-Registered Service.656
>  
> Fri Aug 19 18:57:33 EDT 2016  Thu May 16 18:57:33 EDT 2019    
> r156a50612e1-1-1        
>  
> creator=AsyncIndexUpdate
> name=async
> thread=sling-default-10-Registered Service.654
>  
> Wed Aug 10 12:13:20 EDT 2016  Tue May 07 12:25:52 EDT 2019    
> r156753ac38d-0-1        
>  
> creator=AsyncIndexUpdate
> name=async
> thread=sling-default-6041-Registered Service.1966
> {noformat}
> As can be seen, the last 2 checkpoints are orphaned and would prevent 
> revision GC. For auto mode we can use the following heuristic:
> # List all current checkpoints
> # Keep only the latest checkpoint for a given {{creator}} and {{name}} combo. 
> Older entries (by creation time) for the same pair can be considered orphaned 
> and deleted
> This logic can be implemented in 
> {{org.apache.jackrabbit.oak.checkpoint.Checkpoints}} and invoked by the 
> Revision GC logic (in both DocumentNodeStore and SegmentNodeStore) to 
> determine the base revision to keep.
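
The heuristic described in the issue could be sketched roughly as follows. This 
is only an illustration, not the Oak implementation: {{OrphanedCheckpoints}}, 
{{CheckpointInfo}} and {{findOrphaned}} are hypothetical names, and the sketch 
assumes some API already returns each checkpoint's id, creation time, 
{{creator}} and {{name}} (the listing API that the comment above notes is 
missing from AsyncIndexUpdate).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Oak.
final class OrphanedCheckpoints {

    /** Hypothetical value type for one listed checkpoint; not an Oak class. */
    static final class CheckpointInfo {
        final String id;
        final long created;   // creation time; only the ordering matters here
        final String creator; // may be null if no metadata was recorded
        final String name;    // may be null if no metadata was recorded

        CheckpointInfo(String id, long created, String creator, String name) {
            this.id = id;
            this.created = created;
            this.creator = creator;
            this.name = name;
        }
    }

    /**
     * Returns the ids of checkpoints that are older than the latest
     * checkpoint sharing the same (creator, name) pair. Checkpoints without
     * both metadata values are never reported, since their owner is unknown.
     */
    static List<String> findOrphaned(List<CheckpointInfo> checkpoints) {
        // Pass 1: remember the newest checkpoint per (creator, name) pair.
        Map<List<String>, CheckpointInfo> latest = new HashMap<>();
        for (CheckpointInfo cp : checkpoints) {
            if (cp.creator == null || cp.name == null) {
                continue;
            }
            latest.merge(Arrays.asList(cp.creator, cp.name), cp,
                    (old, candidate) -> candidate.created > old.created ? candidate : old);
        }
        // Pass 2: every other checkpoint of a known pair is an orphan candidate.
        List<String> orphaned = new ArrayList<>();
        for (CheckpointInfo cp : checkpoints) {
            if (cp.creator == null || cp.name == null) {
                continue;
            }
            if (latest.get(Arrays.asList(cp.creator, cp.name)) != cp) {
                orphaned.add(cp.id);
            }
        }
        return orphaned;
    }
}
```

Applied to the four checkpoints listed in the issue, this would report the two 
older {{async}} entries and keep the latest {{async}} and {{fulltext-async}} 
ones. Skipping entries without {{creator}} and {{name}} is a deliberately 
conservative choice in this sketch, since the {{checkpoint}} API can be used by 
any module.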



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
