MgjLLL opened a new pull request, #8136:
URL: https://github.com/apache/paimon/pull/8136
### Purpose
Adds query-auth support to the Python client so it honors the row-level
filter and column masking rules returned by a REST catalog, matching the
existing JVM client behavior.
When the new option `query-auth.enabled` is set to `true`, the client calls
`POST /v1/.../databases/{db}/tables/{tb}/auth` with the projected fields before
producing a `Plan`, receives `{ filter, columnMasking }`, and applies both on
the read path:
- `RESTApi.auth_table_query` issues the call (new request/response models
`AuthTableQueryRequest` / `AuthTableQueryResponse`, new path in
`ResourcePaths.auth_table`).
- `TableQueryAuth` / `TableQueryAuthResult` (`catalog/table_query_auth.py`)
wrap the result and convert each split to a `QueryAuthSplit`.
- `predicate_json_parser` (`common/predicate_json_parser.py`) parses Paimon
predicate JSON into a PyArrow compute filter
(EQ/NEQ/LT/LTEQ/GT/GTEQ/IS_NULL/IS_NOT_NULL/IN/NOT_IN/STARTS_WITH/ENDS_WITH/CONTAINS/AND/OR/NOT).
- `AuthFilterReader` / `AuthMaskingReader` / `ColumnProjectReader`
(`read/reader/auth_masking_reader.py`) implement row filtering, column masking
transforms (`NULL`, `FIELD_REF`, `CAST`, `UPPER`, `LOWER`, `CONCAT`,
`CONCAT_WS`) and final projection back to the user's requested columns.
- `read_builder` / `stream_read_builder` / `table_read` / `table_scan` /
`file_store_table` / `catalog_environment` / `rest_catalog` are wired to invoke
the auth call and pull extra fields required only by the auth filter.
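
For reviewers unfamiliar with the filter path, the idea behind the predicate parser can be sketched roughly as below. This is an illustration only: the JSON shape (`op`, `field`, `literals`, `children`) is an assumed stand-in, not the actual Paimon predicate JSON, and the sketch evaluates against plain dict rows instead of building a PyArrow compute expression as the PR does. `evaluate` and `referenced_fields` are hypothetical names.

```python
import json
from typing import Any, Callable, Dict, List, Set

Row = Dict[str, Any]

# Leaf operators listed in the PR description. Assumption: a comparison
# against a null value is treated as False (two-valued logic, for brevity).
_LEAF_OPS: Dict[str, Callable[[Any, List[Any]], bool]] = {
    "EQ": lambda v, lit: v is not None and v == lit[0],
    "NEQ": lambda v, lit: v is not None and v != lit[0],
    "LT": lambda v, lit: v is not None and v < lit[0],
    "LTEQ": lambda v, lit: v is not None and v <= lit[0],
    "GT": lambda v, lit: v is not None and v > lit[0],
    "GTEQ": lambda v, lit: v is not None and v >= lit[0],
    "IS_NULL": lambda v, lit: v is None,
    "IS_NOT_NULL": lambda v, lit: v is not None,
    "IN": lambda v, lit: v is not None and v in lit,
    "NOT_IN": lambda v, lit: v is not None and v not in lit,
    "STARTS_WITH": lambda v, lit: isinstance(v, str) and v.startswith(lit[0]),
    "ENDS_WITH": lambda v, lit: isinstance(v, str) and v.endswith(lit[0]),
    "CONTAINS": lambda v, lit: isinstance(v, str) and lit[0] in v,
}


def evaluate(pred: Dict[str, Any], row: Row) -> bool:
    """Recursively evaluate a (hypothetical) predicate JSON node on one row."""
    op = pred["op"]
    if op == "AND":
        return all(evaluate(child, row) for child in pred["children"])
    if op == "OR":
        return any(evaluate(child, row) for child in pred["children"])
    if op == "NOT":
        return not evaluate(pred["children"][0], row)
    if op not in _LEAF_OPS:
        raise ValueError(f"unsupported predicate op: {op}")
    return _LEAF_OPS[op](row.get(pred["field"]), pred.get("literals", []))


def referenced_fields(pred: Dict[str, Any]) -> Set[str]:
    """Collect the field names a predicate needs, so they can be read too."""
    if "children" in pred:
        names: Set[str] = set()
        for child in pred["children"]:
            names |= referenced_fields(child)
        return names
    return {pred["field"]}


predicate = json.loads(
    '{"op": "AND", "children": ['
    '{"op": "EQ", "field": "region", "literals": ["eu"]},'
    '{"op": "NOT", "children": [{"op": "IS_NULL", "field": "owner"}]}]}'
)
rows = [
    {"id": 1, "region": "eu", "owner": "a"},
    {"id": 2, "region": "us", "owner": "b"},
    {"id": 3, "region": "eu", "owner": None},
]
visible_ids = [r["id"] for r in rows if evaluate(predicate, r)]
needed = referenced_fields(predicate)
```

Collecting the referenced fields up front is what lets the read path fetch columns the filter needs even when the user did not project them.
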
Behavior is gated by the new `CoreOptions.QUERY_AUTH_ENABLED`
(`query-auth.enabled`, default `false`), so existing users see no change.
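
A default-off gate of this kind might look like the following sketch. The helper name and the string parsing are assumptions for illustration; the PR itself reads the value through `CoreOptions.QUERY_AUTH_ENABLED`.

```python
from typing import Any, Mapping

QUERY_AUTH_ENABLED_KEY = "query-auth.enabled"


def query_auth_enabled(options: Mapping[str, Any]) -> bool:
    """Return True only when the option is explicitly set to true.

    Hypothetical helper: a missing key falls back to "false", so callers
    that never set the option keep the old read path.
    """
    raw = options.get(QUERY_AUTH_ENABLED_KEY, "false")
    return str(raw).strip().lower() == "true"


default_on = query_auth_enabled({})
explicit_on = query_auth_enabled({"query-auth.enabled": "true"})
```
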
### Tests
Three new test files (994+ lines, all passing locally under `pytest`):
- `paimon-python/pypaimon/tests/predicate_json_parser_test.py` — covers each
predicate kind, nested AND/OR/NOT, type coercion, null handling, and
`extract_referenced_fields`.
- `paimon-python/pypaimon/tests/auth_masking_reader_test.py` — covers each
masking transform, missing-field validation, and projection back to the
user-requested columns.
- `paimon-python/pypaimon/tests/table_query_auth_test.py` — end-to-end
coverage: REST catalog calls `auth_table_query`, the result is plumbed into the
plan, splits become `QueryAuthSplit`, and reads return filtered + masked rows.
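
The kind of case the masking tests are described as covering can be illustrated on plain dict rows. This is a sketch under assumptions, not the code in `auth_masking_reader.py`: the transform JSON shape (`type`, `field`, `input`, `inputs`) is invented for the example, only a subset of the transforms is shown (`CAST` and `CONCAT_WS` are left out), and `apply_transform` / `mask_and_project` are hypothetical names.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def apply_transform(transform: Dict[str, Any], row: Row) -> Any:
    """Evaluate one (hypothetical) masking transform against a row."""
    kind = transform["type"]
    if kind == "NULL":
        return None
    if kind == "FIELD_REF":
        name = transform["field"]
        if name not in row:
            # Missing-field validation: fail loudly instead of returning None.
            raise KeyError(f"masking references missing field: {name}")
        return row[name]
    if kind in ("UPPER", "LOWER"):
        value = apply_transform(transform["input"], row)
        if value is None:
            return None
        return value.upper() if kind == "UPPER" else value.lower()
    if kind == "CONCAT":
        parts = [apply_transform(t, row) for t in transform["inputs"]]
        # Assumption: any null input makes the whole result null.
        if any(p is None for p in parts):
            return None
        return "".join(str(p) for p in parts)
    raise ValueError(f"unsupported transform: {kind}")


def mask_and_project(
    rows: List[Row], column_masking: Dict[str, Any], requested: List[str]
) -> List[Row]:
    """Mask columns, then keep only the columns the user asked for."""
    result = []
    for row in rows:
        masked = dict(row)
        for column, transform in column_masking.items():
            # Each mask reads the unmasked row, so masks do not chain.
            masked[column] = apply_transform(transform, row)
        result.append({name: masked[name] for name in requested})
    return result


rows = [{"name": "alice", "ssn": "123-45", "email": "A@X.io"}]
masking = {
    "ssn": {"type": "NULL"},
    "name": {"type": "UPPER", "input": {"type": "FIELD_REF", "field": "name"}},
}
masked_rows = mask_and_project(rows, masking, requested=["name", "ssn"])
```

Here `email` is read but dropped by the final projection, mirroring the "projection back to the user-requested columns" step.
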
Local check:
```
cd paimon-python
python -m pytest pypaimon/tests/predicate_json_parser_test.py \
pypaimon/tests/auth_masking_reader_test.py \
pypaimon/tests/table_query_auth_test.py -q
flake8 --config dev/cfg.ini pypaimon/ # already passes for the changed files
```
### API and Format
- New catalog option: `query-auth.enabled` (boolean, default `false`).
- New REST endpoint consumed by the client:
`POST /v1/{prefix}/databases/{db}/tables/{tb}/auth`. Request `{ "select": [...] }`,
response `{ "filter": [<predicate-json>...], "columnMasking": { <col>: <transform-json>, ... } }`.
The contract follows the existing Java client; no server-side change is
required for catalogs that already implement query auth.
- No change to existing user-facing Python APIs. New types
(`AuthTableQueryRequest`, `AuthTableQueryResponse`, `TableQueryAuth`,
`TableQueryAuthResult`, `QueryAuthSplit`, `AuthFilterReader`,
`AuthMaskingReader`, `ColumnProjectReader`) are additive and live under
existing modules.
- File format / on-disk layout: unchanged.
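
To make the wire contract above concrete, here is a minimal sketch of building the path and request body and parsing the response. The class and function names are hypothetical and deliberately differ from the PR's `AuthTableQueryRequest` / `AuthTableQueryResponse`; percent-encoding each path segment and treating absent response keys as empty are assumptions.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List
from urllib.parse import quote


def auth_table_path(prefix: str, database: str, table: str) -> str:
    """Build the auth endpoint path; each segment is percent-encoded."""
    return "/v1/{}/databases/{}/tables/{}/auth".format(
        quote(prefix, safe=""), quote(database, safe=""), quote(table, safe="")
    )


@dataclass
class AuthRequestSketch:
    select: List[str]

    def to_json(self) -> str:
        # Request body: { "select": [...] }
        return json.dumps({"select": self.select})


@dataclass
class AuthResponseSketch:
    filter: List[Any] = field(default_factory=list)
    column_masking: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_json(cls, body: str) -> "AuthResponseSketch":
        data = json.loads(body)
        # Assumption: a missing or null key means "no filter" / "no masking".
        return cls(
            filter=data.get("filter") or [],
            column_masking=data.get("columnMasking") or {},
        )


path = auth_table_path("p", "my db", "orders")
body = AuthRequestSketch(select=["id", "name"]).to_json()
response = AuthResponseSketch.from_json(
    '{"filter": [], "columnMasking": {"name": {"type": "NULL"}}}'
)
```
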
### Documentation
The new option `query-auth.enabled` should be reflected in the Python
configuration reference. Happy to add the docs entry in this PR or in a
follow-up — please advise.
This closes #8135