JingsongLi commented on PR #8136: URL: https://github.com/apache/paimon/pull/8136#issuecomment-4637972034
I found a few correctness issues in the query-auth paths introduced here:

1. `paimon-python/pypaimon/read/datasource/split_provider.py:127` constructs `ReadBuilder(self._ensure_table())` directly. That bypasses `FileStoreTable.new_read_builder()`, which is where the REST query auth is injected. As a result, `pypaimon.ray.read_paimon(...)` can read REST tables without applying server-side row filters or column masking. I think this should either call `self._ensure_table().new_read_builder()` or explicitly pass the table's query auth into the builder, with a Ray regression test for row filtering/masking.

2. `paimon-python/pypaimon/read/stream_read_builder.py:117` stores `_query_auth`, but `new_streaming_scan()` does not pass it into `AsyncStreamingTableScan`. The plans returned from `streaming_table_scan.py:322` and `streaming_table_scan.py:386` also do not go through `auth_result.convert_plan()`. So `table.new_stream_read_builder()` skips row filters and column masking for both the initial scan and later delta/changelog scans. The streaming scan should preserve and apply query auth before returning each plan.

3. The auth reader wrappers currently assume the inner reader supports `read_arrow_batch()` (`paimon-python/pypaimon/read/reader/auth_masking_reader.py:38` and `:66`). For primary-key tables with non-raw-convertible splits, `TableRead` can create a `MergeFileSplitRead`, whose `create_reader()` returns the normal row `RecordReader` path rather than a `RecordBatchReader`. Wrapping that in `AuthFilterReader`/`AuthMaskingReader` will fail with `AttributeError` when query auth is enabled. This needs either a row-reader auth path, conversion to a batch-capable reader before wrapping, or routing/rejecting these splits explicitly.
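To make the failure mode in point 3 concrete, here is a minimal sketch using stand-in classes. `RowReader`, `BatchReader`, `NaiveAuthWrapper`, `wrap_with_auth`, and `naive_failure_type` are hypothetical names for illustration only; they are not the real pypaimon classes and only mimic the shape of the readers described above. The guard shown corresponds to the "reject these splits explicitly" option, not to a row-reader auth path.

```python
# Stand-in classes only: these mimic the shape of the readers discussed in
# point 3 and are NOT the real pypaimon implementations.


class RowReader:
    """Hypothetical row-oriented reader with no read_arrow_batch()."""

    def __init__(self, rows):
        self._rows = list(rows)

    def read_rows(self):
        return iter(self._rows)


class BatchReader:
    """Hypothetical batch-capable reader."""

    def __init__(self, batches):
        self._batches = list(batches)

    def read_arrow_batch(self):
        # Return the next batch, or None once all batches are consumed.
        return self._batches.pop(0) if self._batches else None


class NaiveAuthWrapper:
    """Assumes the inner reader is batch-capable, without checking."""

    def __init__(self, inner):
        self._inner = inner

    def read_arrow_batch(self):
        # Fails with AttributeError if the inner reader is row-oriented.
        return self._inner.read_arrow_batch()


def wrap_with_auth(inner):
    """Reject readers that cannot produce batches instead of failing later."""
    if not callable(getattr(inner, "read_arrow_batch", None)):
        raise TypeError(
            f"{type(inner).__name__} is not batch-capable; "
            "query auth needs a row-reader path or a conversion step"
        )
    return NaiveAuthWrapper(inner)


def naive_failure_type():
    """Return the name of the exception the naive wrapper raises on a row reader."""
    try:
        NaiveAuthWrapper(RowReader([{"id": 1}])).read_arrow_batch()
    except AttributeError as exc:
        return type(exc).__name__
    return None
```

With a check like this at wrap time, the error surfaces when the reader is constructed rather than on the first read, which should make a regression test for primary-key tables with non-raw-convertible splits easier to write.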
