mikerhodes commented on issue #1508: Selective database archiving
URL: https://github.com/apache/couchdb/issues/1508#issuecomment-412562014
 
 
   To be more positive, FWIW, I had some thoughts on this a while ago, and 
thought you could fold this kind of feature into a set of document lifecycle 
rules based on Mango selectors. In addition to archiving, you could also have 
delete as an action for TTL, so I wanted to capture the idea of combining 
potentially different actions and conditions into rules.
   
   What I thought was something that works like this. There's a set of one or 
more rules in a ddoc section which define actions and conditions using a 
selector. The action is `archive`, `delete` and so on. The conditions are 
things like comparing a `doc_expiry` field in a document with "now":
   
   ```json
   {
     "_id": "_design/lifecycle-example",
     "_rev": "19-4426420428e8744fcfb67763cedd1ea8",
     "doc-lifecycle-rules": [
       {
         "action": "archive",
         "condition": [ { "doc_expiry": { "$lt": "$$now" }, "type": "vital_entry" } ]
       },
       {
         "action": "delete",
         "condition": [ { "doc_expiry": { "$lt": "$$now" }, "type": "spurious_entry" } ]
       }
     ]
   }
   ```
   
   Here:
   
   - `$$now` uses a new special variable syntax; there would be a few 
predefined variables like `$$now` so that rules can be dynamic.
   - `doc-lifecycle-rules` contains an array of (action,selector) pairs.
       - Maybe there's an `options` field which would take e.g., archiving 
destination.
   - Periodically, say once per minute, the (action,selector) pairs are 
evaluated against the documents in the database.
       - I'd expect an index would be auto-created from the condition selector 
to make this cheap, so the selector may need to be heavily restricted such that 
it can be evaluated from an index alone to identify affected documents.
   - At evaluation time, if more than one (action,selector) pair matches a 
document, only one of the matching actions will be run.
       - It's undefined which action happens, but only one will happen before 
the conditions are re-evaluated. Also, we don't guarantee an ordering, as that 
would need to be a total ordering over all (action,selector) pairs in all ddocs 
in the database.
   - `action` is drawn from a fixed set of atoms rather than freeform text. 
Broadly, the database supports a few key workloads rather than being 
arbitrarily extensible.
   - The format allows more actions to gradually be added.
   - We obviously allow for multiple ddocs to have these rules in them.
   - Actions are evaluated asynchronously, so there's no guarantee that a 
document that should already have expired won't still appear in 
search/view/query results, that you can't still GET it, etc.
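
To make the evaluation step concrete, here is a rough Python sketch of a single pass over the rules. It is not CouchDB or Mango code. The helper names (`resolve_variables`, `matches`, `plan_actions`) are made up for illustration, and the matcher only understands plain equality and `$lt`. It also assumes `doc_expiry` and `$$now` are ISO 8601 UTC strings in the same format, so that string comparison orders them correctly, and it scans documents directly rather than reading from an auto-created index. Since the proposal leaves the choice between competing actions undefined, the sketch arbitrarily lets the first matching rule win.

```python
def resolve_variables(value, now):
    """Return a copy of a condition with "$$now" replaced by the given timestamp."""
    if isinstance(value, dict):
        return {k: resolve_variables(v, now) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_variables(v, now) for v in value]
    if value == "$$now":
        return now
    return value


def matches(doc, selector):
    """Match a doc against a tiny selector subset: equality and {"$lt": x}."""
    for field, expected in selector.items():
        if field not in doc:
            return False
        actual = doc[field]
        if isinstance(expected, dict):
            if set(expected) != {"$lt"}:
                raise ValueError(f"unsupported operator in {expected!r}")
            if not actual < expected["$lt"]:
                return False
        elif actual != expected:
            return False
    return True


def plan_actions(docs, rules, now):
    """Pick at most one action per document for one evaluation pass."""
    planned = {}
    for rule in rules:
        # "condition" is a list in the proposal; here every entry must match.
        selectors = resolve_variables(rule["condition"], now)
        for doc in docs:
            if doc["_id"] in planned:
                continue  # only one action per document per pass
            if all(matches(doc, s) for s in selectors):
                # First matching rule wins: an arbitrary choice for this sketch.
                planned[doc["_id"]] = rule["action"]
    return planned


rules = [
    {"action": "archive",
     "condition": [{"doc_expiry": {"$lt": "$$now"}, "type": "vital_entry"}]},
    {"action": "delete",
     "condition": [{"doc_expiry": {"$lt": "$$now"}, "type": "spurious_entry"}]},
]
docs = [
    {"_id": "a", "type": "vital_entry", "doc_expiry": "2018-01-01T00:00:00Z"},
    {"_id": "b", "type": "spurious_entry", "doc_expiry": "2018-01-01T00:00:00Z"},
    {"_id": "c", "type": "vital_entry", "doc_expiry": "2030-01-01T00:00:00Z"},
]
plan = plan_actions(docs, rules, now="2018-08-13T00:00:00Z")
print(plan)
```

With the sample data, document `c` has not expired yet and so gets no action. A real implementation would presumably run a pass like this on a timer and apply the planned actions asynchronously.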
