jcoglan opened a new pull request, #5889:
URL: https://github.com/apache/couchdb/pull/5889

   This PR builds atop https://github.com/apache/couchdb/pull/5858 to add a 
`$data` operator to Mango, which is particularly useful for VDUs as it allows 
comparisons between different fields in the input, and thus allows some 
authorisation logic to be implemented in Mango instead of JS.
   
   The syntax of the operator is `{ "$data": Path }` where `Path` has the same 
syntax as the compound field accessor, i.e. it consists of one or more property 
names separated by `.`. Thus this expression:
   
   ```json
   { "a": { "$gt": { "$data": "b.c" } } }
   ```
   
   means the doc's `a` property must be greater than its `b.c` property. This 
would accept `{ a: 2, b: { c: 1 } }` and reject `{ a: 2, b: { c: 3 } }`. It 
would also reject docs where the referent of the `$data` operator is missing, 
e.g. `{ a: 2, b: {} }`.
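
As an informal illustration of these semantics, here is a small Python model. It is not the Erlang code in this PR: the names `MISSING`, `get_path` and `matches_gt` are hypothetical, and it compares plain numbers rather than using Mango's collation rules.

```python
MISSING = object()  # sentinel: the requested path does not exist in the doc


def get_path(doc, path):
    """Follow a dot-separated path such as "b.c" through nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return MISSING
        current = current[key]
    return current


def matches_gt(doc, field, operand):
    """Model { field: { "$gt": operand } }, where operand may be a $data reference."""
    if isinstance(operand, dict) and set(operand) == {"$data"}:
        # Replace the reference with the value it points at in the same doc.
        operand = get_path(doc, operand["$data"])
    value = get_path(doc, field)
    if value is MISSING or operand is MISSING:
        return False  # a missing referent rejects the doc
    return value > operand


ref = {"$data": "b.c"}
print(matches_gt({"a": 2, "b": {"c": 1}}, "a", ref))  # True
print(matches_gt({"a": 2, "b": {"c": 3}}, "a", ref))  # False
print(matches_gt({"a": 2, "b": {}}, "a", ref))        # False
```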
   
   The `Path` value has one additional bit of syntax which is that it can be 
prefixed with one or more `.` characters to indicate relative paths. For 
example: `{ "a": { "$gt": { "$data": ".b" } } }` means the `a` field must be 
greater than the `b` field that is next to the `a` field. A path of `..b` would 
denote the `b` field in the parent object relative to the `a` field, and so on.
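
One way to picture relative paths is as a lookup against the list of objects that enclose the field being matched. The Python sketch below is only a model under that assumption: `resolve` and the `stack` argument are hypothetical names, escaped dots in field names are ignored, and missing keys or paths that climb above the root are not handled.

```python
def resolve(stack, path):
    """Resolve a $data path.

    `stack` lists the objects enclosing the field being matched, outermost
    (the root doc) first. Leading dots in `path` select how far up to start.
    """
    stripped = path.lstrip(".")
    levels = len(path) - len(stripped)  # number of leading "." characters
    # No leading dot: absolute path from the root doc.
    # One dot: the object holding the field; two dots: its parent; and so on.
    current = stack[0] if levels == 0 else stack[-levels]
    for key in stripped.split("."):
        current = current[key]
    return current


doc = {"b": 10, "x": {"a": 5, "b": 1}}
stack_for_x_a = [doc, doc["x"]]  # objects traversed to reach the field x.a

print(resolve(stack_for_x_a, ".b"))   # 1, the `b` next to `a`
print(resolve(stack_for_x_a, "..b"))  # 10, the `b` in the parent object
print(resolve(stack_for_x_a, "b"))    # 10, absolute path from the root
```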
   
   The `$data` operator may only be used with operators that expect a literal 
value as input, and it cannot be used with combinators like `$and`, `$or`, 
`$not`, `$allMatch`, `$elemMatch`, and so on. This is because such uses would 
allow an input doc to inject its own selectors and use them to bypass the 
intended logic.
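
To make the restriction concrete, the following Python sketch models an allow-list check of this kind. It is an illustration only: `LITERAL_OPS` is an assumed subset of the operators that take a literal, and `check_selector` is a hypothetical name rather than a function from this PR.

```python
# Assumed subset of operators whose argument is a literal value.
LITERAL_OPS = {"$eq", "$ne", "$lt", "$lte", "$gt", "$gte"}


def is_data_ref(value):
    return isinstance(value, dict) and set(value) == {"$data"}


def check_selector(selector, parent_key=None):
    """Raise ValueError if a $data reference appears where it is not allowed."""
    if is_data_ref(selector):
        # Allowed directly under a field name (implicit $eq) or a literal operator.
        allowed = parent_key is not None and (
            not parent_key.startswith("$") or parent_key in LITERAL_OPS
        )
        if not allowed:
            raise ValueError(f"$data is not allowed as an argument of {parent_key}")
        return
    if isinstance(selector, dict):
        for key, arg in selector.items():
            check_selector(arg, key)
    elif isinstance(selector, list):
        # Items of a combinator's list are still arguments of that combinator.
        for item in selector:
            check_selector(item, parent_key)


check_selector({"a": {"$gt": {"$data": "b.c"}}})  # accepted
try:
    # Rejected: the doc's own `rules` field would otherwise act as a selector.
    check_selector({"$and": {"$data": "rules"}})
except ValueError as err:
    print(err)
```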
   
   This is implemented as follows:
   
   - During normalization, `{ "$data": Path }` expressions are pre-parsed so 
that `Path` is broken into its constituent pieces, e.g. `<<"a.b">>` becomes 
`[<<"a">>, <<"b">>]`. Relative paths produce a special token at the front to 
say how many levels to navigate back up the doc before following the path, e.g. 
`<<"..a.b">>` parses to `[{[{<<"parent">>, 2}]}, <<"a">>, <<"b">>]`. This 
structure is chosen so that the "parent" token is a valid ejson value.
   - If `$data` is used incorrectly, i.e. with an operator that does not 
allow it, the selector is rejected at normalization so it cannot be used to 
evaluate any docs.
   - To support relative paths, variants of `mango_doc:get_field` are added 
that record the stack of objects traversed while accessing the requested field, 
and retrieve an item from a given depth in the stack. The `#ctx` record gains a 
`stack` property which tracks the stack of objects that were traversed to 
reach the current point.
   - During matching, expressions of the form `{ Op: { $data: Path } }` have 
the data reference evaluated first, to turn them into `{ Op: Value }` before 
continuing. For consistency with existing semantics, `{ Field: { $data: Path } 
}` is considered to mean `{ Field: { $eq: { $data: Path } } }`.
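
The normalization steps above can be modelled roughly as follows. This Python sketch mirrors the described shapes with lists and dicts in place of Erlang terms. `parse_path` and `normalize_field_condition` are hypothetical names, and keeping the parsed path under the `$data` key is an assumption about the normalized form.

```python
def parse_path(path):
    """Split a $data path into pieces, with a leading parent token if relative."""
    stripped = path.lstrip(".")
    levels = len(path) - len(stripped)
    token = [{"parent": levels}] if levels else []
    return token + stripped.split(".")


def is_data_ref(value):
    return isinstance(value, dict) and set(value) == {"$data"}


def normalize_field_condition(cond):
    """Normalize the condition attached to a field name."""
    if is_data_ref(cond):
        # { Field: { $data: Path } } is shorthand for an $eq comparison.
        cond = {"$eq": cond}
    return {
        op: {"$data": parse_path(arg["$data"])} if is_data_ref(arg) else arg
        for op, arg in cond.items()
    }


print(parse_path("a.b"))    # ['a', 'b']
print(parse_path("..a.b"))  # [{'parent': 2}, 'a', 'b']
print(normalize_field_condition({"$data": ".b"}))
# {'$eq': {'$data': [{'parent': 1}, 'b']}}
```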
   
   If this is accepted, we should add `userCtx` and `securityObj` (names TBC) 
to the structure passed to VDU expressions so that they can implement auth 
logic. The tests added here use the `$data` operator to create a rule where 
updates to a doc cannot remove any of the existing items from the `tags` field.
   
   ## Testing recommendations
   
   Some unit tests for normalisation and evaluation are included, but I think 
they could be more thorough. I'd welcome suggestions for edge cases I've 
overlooked.
   
   ## Related Issues or Pull Requests
   
   - https://github.com/apache/couchdb/pull/5792
   - https://github.com/apache/couchdb/pull/5839
   - https://github.com/apache/couchdb/pull/5858
   
   ## Checklist
   
   - [ ] This is my own work, I did not use AI, LLM's or similar technology
   - [ ] Code is written and works correctly
   - [ ] Changes are covered by tests
   - [ ] Any new configurable parameters are documented in 
`rel/overlay/etc/default.ini`
   - [ ] Documentation changes were made in the `src/docs` folder
   - [ ] Documentation changes were backported (separated PR) to affected 
branches
   

