JingsongLi commented on PR #8136:
URL: https://github.com/apache/paimon/pull/8136#issuecomment-4637972034

   I found a few correctness issues in the query-auth paths introduced here:
   
   1. `paimon-python/pypaimon/read/datasource/split_provider.py:127` constructs 
`ReadBuilder(self._ensure_table())` directly. That bypasses 
`FileStoreTable.new_read_builder()`, which is where the REST query auth is 
injected. As a result, `pypaimon.ray.read_paimon(...)` can read REST tables 
without applying server-side row filters or column masking. I think this should 
either call `self._ensure_table().new_read_builder()` or explicitly pass the 
table's query auth into the builder, with a Ray regression test for row 
filtering/masking.
   
   2. `paimon-python/pypaimon/read/stream_read_builder.py:117` stores 
`_query_auth`, but `new_streaming_scan()` does not pass it into 
`AsyncStreamingTableScan`. The plans returned from 
`streaming_table_scan.py:322` and `streaming_table_scan.py:386` also do not go 
through `auth_result.convert_plan()`. So `table.new_stream_read_builder()` 
skips row filters and column masking for both the initial scan and later 
delta/changelog scans. The streaming scan should preserve and apply query auth 
before returning each plan.
   
   3. The auth reader wrappers currently assume the inner reader supports 
`read_arrow_batch()` 
(`paimon-python/pypaimon/read/reader/auth_masking_reader.py:38` and `:66`). For 
primary-key tables with non-raw-convertible splits, `TableRead` can create a 
`MergeFileSplitRead`, whose `create_reader()` returns a row-based 
`RecordReader` rather than a `RecordBatchReader`. Wrapping that in 
`AuthFilterReader`/`AuthMaskingReader` will fail with `AttributeError` when 
query auth is enabled. This needs one of: a row-reader auth path, conversion 
to a batch-capable reader before wrapping, or explicit routing or rejection 
of these splits.
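
To illustrate the second option in item 3 (converting to a batch-capable reader before wrapping), here is a rough sketch using stand-in classes rather than the real pypaimon readers. `RowReader`, `BatchReader`, `RowToBatchAdapter`, `AuthFilterSketch`, and `read_row()` are all hypothetical names made up for this example, and a "batch" is a plain list of dicts instead of an Arrow record batch. It only shows the shape of the fix, not how it would have to be wired into `TableRead`.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]
Batch = List[Row]  # stand-in for an Arrow record batch (assumption)


class BatchReader:
    """Hypothetical batch-capable reader: exposes read_arrow_batch()."""

    def __init__(self, batches: Iterable[Batch]):
        self._it = iter(batches)

    def read_arrow_batch(self) -> Optional[Batch]:
        return next(self._it, None)


class RowReader:
    """Hypothetical row-only reader: has read_row() but no read_arrow_batch()."""

    def __init__(self, rows: Iterable[Row]):
        self._it = iter(rows)

    def read_row(self) -> Optional[Row]:
        return next(self._it, None)


class RowToBatchAdapter:
    """Groups rows from a row-only reader into batches of at most batch_size."""

    def __init__(self, inner: RowReader, batch_size: int = 1024):
        self._inner = inner
        self._batch_size = batch_size

    def read_arrow_batch(self) -> Optional[Batch]:
        batch: Batch = []
        while len(batch) < self._batch_size:
            row = self._inner.read_row()
            if row is None:  # inner reader is exhausted
                break
            batch.append(row)
        return batch or None


class AuthFilterSketch:
    """Applies a row filter per batch; adapts row-only readers before wrapping."""

    def __init__(self, inner, predicate: Callable[[Row], bool]):
        # Adapt instead of failing later with AttributeError.
        if not hasattr(inner, "read_arrow_batch"):
            inner = RowToBatchAdapter(inner)
        self._inner = inner
        self._predicate = predicate

    def read_arrow_batch(self) -> Optional[Batch]:
        while True:
            batch = self._inner.read_arrow_batch()
            if batch is None:
                return None
            kept = [row for row in batch if self._predicate(row)]
            if kept:  # skip batches that the filter empties completely
                return kept
```

The explicit-rejection option would replace the adapter branch in `__init__` with a clear error raised at construction time, which at least fails before any unfiltered data is returned.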
   

