[ https://issues.apache.org/jira/browse/OAK-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924499#comment-17924499 ]
José Andrés Cordero Benítez commented on OAK-11444:
---------------------------------------------------
+1 from me.
Reading single documents from the SETTINGS collection, as Sweep2 and
VersionGarbageCollector do, shouldn't affect performance because they are read
by document _id, which is indexed at the Mongo level. A problem would arise
only if something we are not thinking about traverses all the documents in the
SETTINGS collection. But I don't think there is such a use case, since this
collection sees very little usage. In a normal environment there are only
around 5-6 documents in the SETTINGS collection.
SETTINGS/bin sounds good as a name for those nodes.
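To illustrate the point-read argument, here is a minimal in-memory sketch. It is
not Oak or MongoDB driver code: BinSketch, BinEntry, log, findById and
purgeExpired are hypothetical names, the ids like "1:/a" are only illustrative,
and the map keyed by id merely stands in for the _id index. It also emulates the
TTL-style cleanup mentioned in the issue description.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the proposed bin entries; not actual Oak code.
class BinSketch {
    // One entry per removal: document id, removed property names, time logged.
    static final class BinEntry {
        final String docId;
        final List<String> removedProperties;
        final Instant removedAt;

        BinEntry(String docId, List<String> removedProperties, Instant removedAt) {
            this.docId = docId;
            this.removedProperties = List.copyOf(removedProperties);
            this.removedAt = removedAt;
        }
    }

    // A map keyed by id models the _id index: a point read visits no other entry.
    private final Map<String, BinEntry> byId = new HashMap<>();

    void log(String docId, List<String> removedProperties, Instant now) {
        byId.put(docId, new BinEntry(docId, removedProperties, now));
    }

    Optional<BinEntry> findById(String docId) {
        return Optional.ofNullable(byId.get(docId));
    }

    // Emulates TTL-style cleanup: drops entries older than ttl, returns the count removed.
    int purgeExpired(Instant now, Duration ttl) {
        int before = byId.size();
        byId.values().removeIf(e -> e.removedAt.plus(ttl).isBefore(now));
        return before - byId.size();
    }

    int size() {
        return byId.size();
    }
}
```

The only operation here that touches every entry is the cleanup pass, which in a
real deployment would presumably be left to a TTL index or an external job
rather than to the readers.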
> [full-gc] Save document id and empty properties names before deletion
> ----------------------------------------------------------------------
>
> Key: OAK-11444
> URL: https://issues.apache.org/jira/browse/OAK-11444
> Project: Jackrabbit Oak
> Issue Type: Story
> Components: mongomk
> Reporter: Daniel Iancu
> Priority: Major
>
> Store document IDs and empty property names in a dedicated *_bin* collection
> before physically deleting them from the Mongo nodes collection during full
> GC.
> The motivation behind this change is that, if data that should not have been
> deleted (not garbage) is deleted accidentally, this `log` of removed
> documents and properties will help with complete restoration from backup.
> A separate collection was preferred over logging to files because it is
> more reliable. Logs usually need to be exported to a platform like Splunk,
> and that process does not guarantee that all logs are saved.
> The data saved in the *_bin* collection is temporary; cleanup can be done
> by setting a document TTL or by using an external job to remove it.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)