Chetan Mehrotra created OAK-4826:

             Summary: Auto removal of orphaned checkpoints
                 Key: OAK-4826
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: core
            Reporter: Chetan Mehrotra
             Fix For: 1.6

Currently, if a running system has orphaned checkpoints, they prevent 
revision gc (compaction for segment) from being effective.

So far the practice has been to use the {{oak-run checkpoints rm-unreferenced}} 
command to clean them up manually. This was kept manual because it was not 
possible to determine whether a given checkpoint is in use or not. 
rm-unreferenced works on the assumption that checkpoints are only created by 
AsyncIndexUpdate and hence can check whether a checkpoint is in use by 
cross-checking with the {{:async}} state. Doing this automatically is risky, as 
the {{checkpoint}} api can be used by any module.

With OAK-2314 we also record some metadata like {{creator}} and {{name}}. This 
can be used for auto cleanup. For example, in one running system the following 
checkpoints are listed:


Mon Sep 19 18:02:09 EDT 2016    Sun Jun 16 18:02:09 EDT 2019    thread=sling-default-4070-Registered Service.653
Mon Sep 19 18:02:09 EDT 2016    Sun Jun 16 18:02:09 EDT 2019    thread=sling-default-4072-Registered Service.656
Fri Aug 19 18:57:33 EDT 2016    Thu May 16 18:57:33 EDT 2019    thread=sling-default-10-Registered Service.654
Wed Aug 10 12:13:20 EDT 2016    Tue May 07 12:25:52 EDT 2019    thread=sling-default-6041-Registered Service.1966

As can be seen, the last 2 checkpoints are orphaned and would prevent 
revision gc. For auto mode we can use the following heuristic:

# List all current checkpoints
# Only keep the latest checkpoint for a given {{creator}} and {{name}} combo. 
Other entries for the same pair which are older (by creation time) can be 
considered orphaned and deleted
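
The heuristic above can be sketched roughly as follows. This is only an illustration, not existing Oak code: {{OrphanedCheckpoints}}, {{CheckpointInfo}} and {{findOrphans}} are hypothetical names, and skipping checkpoints that carry no {{creator}}/{{name}} metadata is an assumption made here to stay on the safe side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrphanedCheckpoints {

    /** Hypothetical value holder for one checkpoint; not an Oak class. */
    public static final class CheckpointInfo {
        final String id;
        final long created;   // creation time in milliseconds
        final String creator; // metadata recorded since OAK-2314, may be null
        final String name;    // metadata recorded since OAK-2314, may be null

        public CheckpointInfo(String id, long created, String creator, String name) {
            this.id = id;
            this.created = created;
            this.creator = creator;
            this.name = name;
        }

        boolean hasMetadata() {
            return creator != null && name != null;
        }
    }

    /**
     * Returns the ids of checkpoints considered orphaned: for every
     * (creator, name) pair only the most recently created checkpoint is kept.
     * Checkpoints without metadata are never reported (assumption).
     */
    public static List<String> findOrphans(List<CheckpointInfo> checkpoints) {
        // Pass 1: remember the latest checkpoint per (creator, name) pair.
        // On equal creation times the first one seen is kept.
        Map<List<String>, CheckpointInfo> latest = new HashMap<>();
        for (CheckpointInfo cp : checkpoints) {
            if (!cp.hasMetadata()) {
                continue;
            }
            List<String> key = Arrays.asList(cp.creator, cp.name);
            CheckpointInfo current = latest.get(key);
            if (current == null || cp.created > current.created) {
                latest.put(key, cp);
            }
        }

        // Pass 2: everything else sharing a pair with a kept checkpoint is an orphan.
        List<String> orphans = new ArrayList<>();
        for (CheckpointInfo cp : checkpoints) {
            if (!cp.hasMetadata()) {
                continue;
            }
            if (latest.get(Arrays.asList(cp.creator, cp.name)) != cp) {
                orphans.add(cp.id);
            }
        }
        return orphans;
    }
}
```

Revision GC could then treat the oldest checkpoint that is not in the returned list as the base revision to keep. Whether checkpoints lacking metadata should also be cleaned up is left open in this sketch.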

This logic can be implemented in 
{{org.apache.jackrabbit.oak.checkpoint.Checkpoints}} and can be invoked by 
the Revision GC logic (both in DocumentNodeStore and SegmentNodeStore) to 
determine the base revision to keep.
