[ 
https://issues.apache.org/jira/browse/OAK-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924042#comment-17924042
 ] 

Horia Poradici commented on OAK-11444:
--------------------------------------

From MongoDB's documentation it appears that a TTL can be set:
 * Adding an automatic TTL index to the new sub-collection (links from the 
MongoDB website):
 ** The index can be created programmatically when the collection is 
created. An indexed field of type 'Date' is required; this can be the 
"deletedAt" field, which should be added to the collection.
 ** The expired documents will be removed by a background thread.
 ** https://www.mongodb.com/docs/v6.0/core/index-ttl/
 * A tutorial with steps to follow can be found here: 
https://www.mongodb.com/docs/manual/tutorial/expire-data/
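
As a rough illustration of the points above, the index creation could look 
something like the sketch below. It is written in Python against a 
pymongo-style collection only to keep it short; it is not Oak code. The 
30-day retention, the helper names and the "documentId" / "emptyProperties" 
field names are assumptions made for the example; only "deletedAt" comes 
from the discussion.

```python
from datetime import datetime, timezone

# Hypothetical values for illustration; not taken from the Oak code base.
TTL_FIELD = "deletedAt"
TTL_SECONDS = 30 * 24 * 60 * 60  # assumed 30-day retention


def ensure_ttl_index(collection, ttl_seconds=TTL_SECONDS):
    """Create a single-field TTL index on the Date field.

    `collection` is expected to behave like a pymongo Collection, whose
    create_index() accepts the expireAfterSeconds index option.
    """
    return collection.create_index(
        [(TTL_FIELD, 1)],  # 1 = ascending
        expireAfterSeconds=ttl_seconds,
    )


def bin_entry(doc_id, empty_properties, now=None):
    """Build the record to store before full GC removes a document.

    The TTL field must hold a date value, otherwise the document
    never expires.
    """
    return {
        "documentId": doc_id,                      # assumed field name
        "emptyProperties": list(empty_properties),  # assumed field name
        TTL_FIELD: now or datetime.now(timezone.utc),
    }
```

With a real driver this would be called once when the collection is set up, 
for example ensure_ttl_index(db["bin"]), where the collection name is again 
only a placeholder. Removal is not immediate: MongoDB's TTL background task 
runs periodically, so expired entries can stay around a little longer than 
the configured time.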



We also need the "deletedAt" field for possible recovery of documents.

And indeed, as [~reschke] mentioned, there may be other use cases in Oak for 
this TTL feature.

> [full-gc] Save document id and empty properties names before deletion 
> ----------------------------------------------------------------------
>
>                 Key: OAK-11444
>                 URL: https://issues.apache.org/jira/browse/OAK-11444
>             Project: Jackrabbit Oak
>          Issue Type: Story
>          Components: mongomk
>            Reporter: Daniel Iancu
>            Priority: Major
>
> Store the document ID and empty property names in a dedicated *_bin* 
> collection before physical deletion from the Mongo nodes collection during 
> full GC.
> The motivation behind this change is that, if data that should not have 
> been deleted (not garbage) is deleted accidentally, this `log` of removed 
> documents and properties will help the complete restoration from backup.
> A separate collection was preferred over logging to files because it is 
> more reliable. Logs usually need to be exported to a platform like Splunk, 
> and that process does not guarantee that all logs are saved. 
> The data saved in the *_bin* collection is temporary; it can be cleaned up 
> by setting a document TTL or by using an external job to remove it. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
