[ 
https://issues.apache.org/jira/browse/OAK-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929766#comment-17929766
 ] 

José Andrés Cordero Benítez commented on OAK-11444:
---------------------------------------------------

The class `Collection`, where the 4 collections are defined (5 in fact, if we 
consider the BLOBS one), is used by all the DocumentNodeStore implementations. 
For example, [here is the RDB 
one|https://github.com/apache/jackrabbit-oak/blob/839c982efaefcc21d631eb3f71398872e1a5a4f7/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/rdb/RDBDocumentStore.java#L670].

If we define the new collection there, we will need to implement it in all the 
other DocumentNodeStore implementations, although it could be just an 
[unsupported one like this 
one|https://github.com/apache/jackrabbit-oak/blob/839c982efaefcc21d631eb3f71398872e1a5a4f7/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Collection.java#L93].

So the approach [~diancu] is proposing should work: we would have a new 
collection, but it would not be defined in `Collection`, so it would be 
"invisible" to Oak.
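
To illustrate the trade-off, here is a minimal sketch using a simplified, 
hypothetical model rather than the real Oak classes. `Coll`, `tableFor`, 
`logBeforeDelete` and the `nodes_bin` name are all assumptions made for the 
example, and a plain map stands in for the database. It shows why a collection 
defined in the shared class has to be handled by every store implementation, 
while a collection written to directly by name does not.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of the idea; these are NOT the real Oak classes.
class BinCollectionSketch {

    // Stand-in for Oak's Collection class: the fixed set of collections Oak knows about.
    static abstract class Coll {
        static final Coll NODES = new Coll("nodes") {
            @Override String newDocument() { return "new nodes document"; }
        };
        // Defined, but creating documents through the store API is unsupported.
        static final Coll BLOBS = new Coll("blobs") {
            @Override String newDocument() { throw new UnsupportedOperationException(); }
        };

        final String name;
        Coll(String name) { this.name = name; }
        abstract String newDocument();
    }

    // Stand-in for a store implementation: each one must map every defined collection.
    static String tableFor(Coll c) {
        if (c == Coll.NODES) return "NODES";
        if (c == Coll.BLOBS) return "DATASTORE";
        throw new IllegalArgumentException("Unknown collection: " + c.name);
    }

    // Stand-in for the underlying database: collection name -> stored entries.
    static final Map<String, List<String>> database = new HashMap<>();

    // Hypothetical full GC helper: writes to a raw "bin" collection by name,
    // without going through Coll, so no store implementation needs to change.
    static void logBeforeDelete(String documentId) {
        database.computeIfAbsent("nodes_bin", k -> new ArrayList<>()).add(documentId);
    }

    public static void main(String[] args) {
        logBeforeDelete("1:/content/foo");
        System.out.println(database.get("nodes_bin"));
    }
}
```

In this model, adding a new constant to `Coll` would force a change in every 
`tableFor`-style mapping, whereas `logBeforeDelete` touches only the one 
backend that needs it.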

> [full-gc] Save document id and empty properties names before deletion 
> ----------------------------------------------------------------------
>
>                 Key: OAK-11444
>                 URL: https://issues.apache.org/jira/browse/OAK-11444
>             Project: Jackrabbit Oak
>          Issue Type: Story
>          Components: mongomk
>            Reporter: Daniel Iancu
>            Priority: Major
>
> Store the document ID and the names of empty properties in a dedicated *_bin* 
> collection
> before they are physically deleted from the Mongo nodes collection during full GC.
> The motivation behind this change is that, if data that should not have been 
> deleted (not garbage) is accidentally deleted, this `log` of removed 
> documents and properties will help with complete restoration from backup.
> A separate collection was preferred over logging to files because it is 
> more reliable. Logs usually need to be exported to a platform like Splunk, and 
> that process does not guarantee that all logs are saved. 
> The data saved in the *_bin* collection is temporary; it can be cleaned up 
> by setting a document TTL or by using an external job to remove it. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
