Thomas Mueller created OAK-1341:
-----------------------------------

             Summary: MongoMK: Implement garbage collection
                 Key: OAK-1341
                 URL: https://issues.apache.org/jira/browse/OAK-1341
             Project: Jackrabbit Oak
          Issue Type: Sub-task
          Components: mongomk
    Affects Versions: 0.16
            Reporter: Thomas Mueller
            Assignee: Thomas Mueller
            Priority: Minor


For the MongoMK (as well as for other storage engines that are based on it), 
garbage collection is most easily implemented by iterating over all documents 
and removing unused entries (either whole documents, or data within the 
document). 

Iteration can be done in parallel (for example one process per shard), and it 
can be done in any order. 
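
The parallel variant could look roughly like the following sketch. It uses one
worker thread per shard rather than one process per shard, purely to keep the
example short; the shard layout (one dict per shard) and the is_unused
callback are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def sweep_shard(shard, is_unused):
    """Remove unused documents from a single shard; return how many were removed."""
    removed = 0
    for doc_id in list(shard):            # iteration order does not matter
        if is_unused(shard[doc_id]):
            del shard[doc_id]
            removed += 1
    return removed


def sweep_all_shards(shards, is_unused):
    """Sweep every shard concurrently, one worker per shard.

    Each worker touches only its own shard, so no locking is needed here.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        counts = pool.map(lambda shard: sweep_shard(shard, is_unused), shards)
        return sum(counts)


shards = [
    {"1:/a": {"unused": True}, "1:/b": {"unused": False}},
    {"1:/c": {"unused": True}},
]
total_removed = sweep_all_shards(shards, lambda doc: doc["unused"])
```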

The most efficient order is probably the id order; however, it might be better
to iterate only over documents that were changed since the previous run, but
not very recently, by using the index on the "_modified" property. That way we
don't need to iterate over the whole repository over and over again, but just
over those documents that were actually changed.
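
One possible reading of the "_modified" approach is sketched below: pick the
documents changed since the previous run, but skip those changed within a
grace period. The sorted list stands in for the index on "_modified"; the
parameter names (last_gc_time, now, min_age) and the time window itself are
assumptions, not something the issue specifies.

```python
import bisect


def gc_candidates(modified_index, last_gc_time, now, min_age):
    """Return ids of documents with last_gc_time <= _modified <= now - min_age.

    modified_index: list of (_modified, doc_id) tuples sorted by _modified,
                    a stand-in for a range scan on the "_modified" index.
    """
    times = [modified for modified, _ in modified_index]
    lo = bisect.bisect_left(times, last_gc_time)      # first doc changed since last run
    hi = bisect.bisect_right(times, now - min_age)    # skip docs changed too recently
    return [doc_id for _, doc_id in modified_index[lo:hi]]


index = [(100, "1:/a"), (250, "1:/b"), (400, "1:/c"), (590, "1:/d")]
candidates = gc_candidates(index, last_gc_time=200, now=600, min_age=60)
```

Here "1:/a" is skipped because it has not changed since the previous run, and
"1:/d" is skipped because it changed within the last 60 time units.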



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
