Thomas Mueller created OAK-1341:
-----------------------------------
Summary: MongoMK: Implement garbage collection
Key: OAK-1341
URL: https://issues.apache.org/jira/browse/OAK-1341
Project: Jackrabbit Oak
Issue Type: Sub-task
Components: mongomk
Affects Versions: 0.16
Reporter: Thomas Mueller
Assignee: Thomas Mueller
Priority: Minor
For the MongoMK (as well as for other storage engines that are based on it),
garbage collection is most easily implemented by iterating over all documents
and removing unused entries (either whole documents, or data within the
document).
Iteration can be done in parallel (for example one process per shard), and it
can be done in any order.
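
As a rough illustration of the iteration described above, here is a minimal
in-memory sketch, not MongoMK code. Each shard is modelled as a plain dict of
document id to entries, and the names collect_shard, collect_all and
is_unused are hypothetical. Threads stand in for the "one process per shard"
idea; because each worker touches only its own shard, the order of processing
does not matter.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_shard(shard, is_unused):
    """Remove unused entries from every document in one shard.

    `shard` maps document id -> {entry key -> value} (an assumed toy model).
    Documents left without any entries are removed as a whole.
    Returns the number of documents removed.
    """
    removed_docs = 0
    # Copy the ids first so documents can be deleted while iterating.
    for doc_id in list(shard):
        doc = shard[doc_id]
        for key in [k for k in doc if is_unused(doc_id, k)]:
            del doc[key]          # remove data within the document
        if not doc:
            del shard[doc_id]     # remove the whole document
            removed_docs += 1
    return removed_docs


def collect_all(shards, is_unused):
    """Run collect_shard on every shard in parallel, one worker per shard."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        return sum(pool.map(lambda s: collect_shard(s, is_unused), shards))


# Usage: two shards, with entries keyed by a made-up revision id.
shards = [
    {"1:/a": {"r1": "x", "r2": "y"}, "1:/b": {"r1": "z"}},
    {"1:/c": {"r3": "w"}},
]
unused = {"r1"}
removed = collect_all(shards, lambda doc_id, key: key in unused)
```

How an entry is recognized as unused is left open here; the predicate is
simply passed in.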
The most efficient order is probably the id order; however, it might be better
to use the index on the "_modified" property and iterate only over documents
that were changed since the previous run, but not changed recently. That way
we don't need to iterate over the whole repository over and over again, but
just over those documents that were actually changed.
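
A small sketch of how such a selection could look, under stated assumptions:
the index on "_modified" is modelled as a sorted list of (timestamp, id)
pairs, timestamps are integers, and the paragraph above is read as "visit
documents modified after the previous run, but at least min_age ago". That
reading and the names build_modified_index, gc_candidates, last_run and
min_age are illustrative assumptions, not part of MongoMK.

```python
import bisect


def build_modified_index(docs):
    """Toy stand-in for an index on "_modified": sorted (timestamp, id) pairs."""
    return sorted((doc["_modified"], doc_id) for doc_id, doc in docs.items())


def gc_candidates(index, last_run, now, min_age):
    """Return ids with last_run < _modified <= now - min_age, in index order.

    Timestamps are assumed to be integers, so "greater than t" is the same
    as "at least t + 1". A 1-tuple sorts before any (t, id) pair with the
    same t, which makes bisect_left land on the first pair at that time.
    """
    lo = bisect.bisect_left(index, (last_run + 1,))
    hi = bisect.bisect_left(index, (now - min_age + 1,))
    return [doc_id for _, doc_id in index[lo:hi]]


# Usage: only "1:/b" was changed after the previous run and is old enough.
docs = {
    "1:/a": {"_modified": 100},   # unchanged since the previous run
    "1:/b": {"_modified": 250},   # changed, and not recently
    "1:/c": {"_modified": 400},   # changed too recently
    "1:/d": {"_modified": 390},   # changed too recently
}
index = build_modified_index(docs)
candidates = gc_candidates(index, last_run=200, now=400, min_age=60)
```

Only the candidates then need to be checked for unused entries, instead of
every document in the repository.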