[ 
https://issues.apache.org/jira/browse/OAK-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-1341:
--------------------------------

    Fix Version/s:     (was: 0.15)
                   0.16

> MongoMK: Implement garbage collection
> -------------------------------------
>
>                 Key: OAK-1341
>                 URL: https://issues.apache.org/jira/browse/OAK-1341
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: mongomk
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Minor
>             Fix For: 0.16
>
>
> For the MongoMK (as well as for other storage engines that are based on it), 
> garbage collection is most easily implemented by iterating over all documents 
> and removing unused entries (either whole documents, or data within the 
> document). 
> Iteration can be done in parallel (for example one process per shard), and it 
> can be done in any order. 
> The most efficient order is probably the id order; however, it might be 
> better to iterate only over documents that were not changed recently, by 
> using the index on the "_modified" property. That way we don't need to 
> iterate over the whole repository over and over again, but just over those 
> documents that were actually changed.
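
A minimal sketch of one such pass, in Python over an in-memory dictionary
rather than a real MongoDB collection. It is not the MongoMK implementation.
The document layout (an "entries" map per document), the is_unused callback,
and the [modified_from, modified_to) window are assumptions made for
illustration; only the "_modified" property comes from the description above.
The sorted filter stands in for a range scan on the "_modified" index.

```python
from typing import Callable, Dict, List


def collect_garbage(
    docs: Dict[str, dict],
    modified_from: int,
    modified_to: int,
    is_unused: Callable[[str, str], bool],
) -> List[str]:
    """Run one GC pass over documents whose "_modified" value lies in
    [modified_from, modified_to). Returns the ids of removed documents."""
    # Stand-in for an index range scan on "_modified" (hypothetical layout).
    candidates = sorted(
        (d for d in docs.values()
         if modified_from <= d["_modified"] < modified_to),
        key=lambda d: d["_modified"],
    )
    removed = []
    for doc in candidates:
        entries = doc["entries"]
        # Remove unused data within the document.
        for key in [k for k in entries if is_unused(doc["_id"], k)]:
            del entries[key]
        # Remove the whole document once nothing is left in it.
        if not entries:
            del docs[doc["_id"]]
            removed.append(doc["_id"])
    return removed


# Hypothetical sample data: ids, timestamps, and entry keys are made up.
docs = {
    "1:/a": {"_id": "1:/a", "_modified": 100, "entries": {"r1": "x", "r2": "y"}},
    "1:/b": {"_id": "1:/b", "_modified": 150, "entries": {"r1": "x"}},
    "1:/c": {"_id": "1:/c", "_modified": 900, "entries": {"r1": "x"}},
}

# Treat every "r1" entry as unused; only look at documents modified in [0, 500).
removed = collect_garbage(docs, 0, 500, lambda doc_id, key: key == "r1")
```

Because each pass only needs a range of "_modified" values, separate ranges
(or separate shards) could be handled by independent processes, and a later
pass could start from the previous upper bound instead of rescanning
everything.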



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
