jcoglan opened a new pull request, #5889: URL: https://github.com/apache/couchdb/pull/5889
This PR builds atop https://github.com/apache/couchdb/pull/5858 to add a `$data` operator to Mango. It is particularly useful for VDUs because it allows comparisons between different fields in the input, and thus allows some authorisation logic to be implemented in Mango instead of JS.

The syntax of the operator is `{ "$data": Path }`, where `Path` has the same syntax as the compound field accessor, i.e. it consists of one or more property names separated by `.`. Thus this expression:

```json
{ "a": { "$gt": { "$data": "b.c" } } }
```

means the doc's `a` property must be greater than its `b.c` property. This would accept `{ a: 2, b: { c: 1 } }` and reject `{ a: 2, b: { c: 3 } }`. It would also reject docs where the referent of the `$data` operator is missing, e.g. `{ a: 2, b: {} }`.

The `Path` value has one additional bit of syntax: it can be prefixed with one or more `.` characters to indicate a relative path. For example, `{ "a": { "$gt": { "$data": ".b" } } }` means the `a` field must be greater than the `b` field that is next to the `a` field. A path of `..b` would denote the `b` field in the parent object relative to the `a` field, and so on.

The `$data` operator may only be used with operators that expect a literal value as input; it cannot be used with combinators like `$and`, `$or`, `$not`, `$allMatch`, `$elemMatch`, and so on. This is because such uses would allow an input doc to inject its own selectors and use them to bypass the intended logic.

This is implemented as follows:

- During normalization, `{ "$data": Path }` expressions are pre-parsed so that `Path` is broken into its constituent pieces, i.e. `<<"a.b">>` becomes `[<<"a">>, <<"b">>]`. Relative paths produce a special token at the front to say how many levels to navigate back up the doc before following the path, e.g. `<<"..a.b">>` parses to `[{[{<<"parent">>, 2}]}, <<"a">>, <<"b">>]`. This structure is chosen so that the "parent" token is a valid ejson value.
- If `$data` is used incorrectly, i.e. with an operator that does not allow it, it is rejected at normalization so it cannot be used to evaluate any docs.
- To support relative paths, variants of `mango_doc:get_field` are added that record the stack of objects traversed while accessing the requested field, and retrieve an item from a given depth in the stack. The `#ctx` record gains a `stack` property which tracks the stack of objects that were traversed to reach the current point.
- During matching, expressions of the form `{ Op: { $data: Path } }` have the data reference evaluated first, to turn them into `{ Op: Value }` before continuing. For consistency with existing semantics, `{ Field: { $data: Path } }` is considered to mean `{ Field: { $eq: { $data: Path } } }`.

If this is accepted, we should add `userCtx` and `securityObj` (names TBC) to the structure passed to VDU expressions so that they can implement auth logic.

The tests added here use the `$data` operator to create a rule where updates to a doc cannot remove any of the existing items from the `tags` field.

## Testing recommendations

Some unit tests for normalisation and evaluation are included, but I think they could be more thorough. I'd welcome suggestions for edge cases I've overlooked.

## Related Issues or Pull Requests

- https://github.com/apache/couchdb/pull/5792
- https://github.com/apache/couchdb/pull/5839
- https://github.com/apache/couchdb/pull/5858

## Checklist

- [ ] This is my own work, I did not use AI, LLMs or similar technology
- [ ] Code is written and works correctly
- [ ] Changes are covered by tests
- [ ] Any new configurable parameters are documented in `rel/overlay/etc/default.ini`
- [ ] Documentation changes were made in the `src/docs` folder
- [ ] Documentation changes were backported (separated PR) to affected branches
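
## Illustrative sketch of path resolution

To make the absolute and relative `$data` path rules in the description above concrete, here is a minimal Python sketch. It is not the Erlang code in this PR: the function names (`parse_data_path`, `get_field_with_stack`, `resolve_data`, `field_gt_data`) are hypothetical, the stack is assumed to hold the objects traversed to reach the field (outermost first), escaped dots in field names are ignored, and the comparison uses plain Python `>` rather than CouchDB collation.

```python
MISSING = object()  # sentinel for "field not present", distinct from JSON null


def parse_data_path(path):
    """Split a $data path into (levels_up, keys).

    Leading dots count how many levels to climb before following the keys,
    loosely mirroring the "parent" token described above (assumption).
    """
    stripped = path.lstrip(".")
    levels_up = len(path) - len(stripped)
    keys = stripped.split(".") if stripped else []
    return levels_up, keys


def get_field_with_stack(doc, field):
    """Follow a dotted field path, recording each object passed through."""
    stack, current = [], doc
    for key in field.split("."):
        if not isinstance(current, dict) or key not in current:
            return MISSING, stack
        stack.append(current)
        current = current[key]
    return current, stack


def resolve_data(doc, stack, path):
    """Resolve a $data path from the doc root, or from `levels_up` objects
    back in the stack when the path is relative."""
    levels_up, keys = parse_data_path(path)
    if levels_up > len(stack):
        return MISSING
    current = doc if levels_up == 0 else stack[-levels_up]
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return MISSING
        current = current[key]
    return current


def field_gt_data(doc, field, data_path):
    """Evaluate { field: { "$gt": { "$data": data_path } } } against doc.

    A missing field or missing referent rejects the doc.
    """
    value, stack = get_field_with_stack(doc, field)
    referent = resolve_data(doc, stack, data_path)
    if value is MISSING or referent is MISSING:
        return False
    return value > referent  # plain Python comparison, not CouchDB collation


if __name__ == "__main__":
    print(field_gt_data({"a": 2, "b": {"c": 1}}, "a", "b.c"))  # True
    print(field_gt_data({"a": 2, "b": {"c": 3}}, "a", "b.c"))  # False
    print(field_gt_data({"a": 2, "b": {}}, "a", "b.c"))        # False
```

Under these assumptions, `.b` resolves against the object that contains the field being tested and `..b` against that object's parent, which matches the relative path behaviour described in the PR text.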
