QuakeWang opened a new pull request, #28:
URL: https://github.com/apache/paimon-mosaic/pull/28
## Problem
Python `read_table(..., columns=...)` loses the requested projection when
reading a Mosaic file with zero row groups.

`MosaicReader.read_all()` normally takes its schema from the first returned
batch, so projected reads of files that have row groups return the expected
schema. A file with zero row groups yields no batch, so the fallback used the
full reader schema instead. As a result, both `columns=[]` and reads that
project specific columns returned an empty table with the full file schema.
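
The failure mode described above can be modelled in a few lines. This is only a toy sketch, not the real `MosaicReader` code: a "schema" is a plain list of column names, a "batch" is a dict keyed by the projected column names, and `read_all_schema_old` and `FULL_SCHEMA` are hypothetical names introduced for illustration.

```python
# Toy model (assumption): a schema is a list of column names and a batch is
# a dict mapping the projected column names to lists of values.
FULL_SCHEMA = ["id", "name", "score"]


def read_all_schema_old(batches, full_schema):
    """Hypothetical model of the pre-fix schema selection in read_all()."""
    if batches:
        # Normal path: the first batch already carries the projected schema.
        return list(batches[0].keys())
    # Zero row groups: there is no batch to consult, so the full schema is used.
    return list(full_schema)


# A projected read of a file with row groups keeps the projection ...
with_rows = read_all_schema_old([{"name": ["a", "b"]}], FULL_SCHEMA)
# ... but the same projection on a file with zero row groups loses it.
no_rows = read_all_schema_old([], FULL_SCHEMA)
```

In this model the requested projection never reaches the fallback branch, which is why the result does not depend on what was passed as `columns`.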
## Fix
Keep the core/FFI behavior unchanged and make the Python fallback
projection-aware:
* Cache the projected schema in `MosaicReader.project()` after native
projection succeeds.
* Use the cached projected schema in `read_all()` when there are no
batches.
* Preserve `reader.schema` as the full file schema.
* Match the native handling of duplicate projected columns by keeping only
the first occurrence.
* Add Python coverage for zero-row-group reads with empty, single-column,
and duplicate projections.
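
The list above could be sketched roughly as follows. This is a simplified stand-in under stated assumptions, not the code in this PR: `ToyReader` and `_projected_schema` are hypothetical names, schemas are plain lists of column names, and a membership check takes the place of the native projection call.

```python
class ToyReader:
    """Hypothetical stand-in for MosaicReader; schemas are lists of names."""

    def __init__(self, full_schema, batches=()):
        self.schema = list(full_schema)  # always the full file schema
        self._batches = list(batches)    # dicts keyed by projected column names
        self._projected_schema = None    # set only after project() succeeds

    def project(self, columns):
        # Stand-in for the native projection: reject unknown columns first.
        unknown = [c for c in columns if c not in self.schema]
        if unknown:
            raise KeyError(f"unknown columns: {unknown}")
        # dict.fromkeys keeps insertion order, so the first occurrence wins.
        self._projected_schema = list(dict.fromkeys(columns))
        return self

    def read_all(self):
        if self._batches:
            # Normal path: take the schema from the first batch.
            names = list(self._batches[0].keys())
        elif self._projected_schema is not None:
            # Zero row groups: fall back to the cached projected schema.
            names = self._projected_schema
        else:
            # No projection was requested: the full schema is correct.
            names = self.schema
        table = {name: [] for name in names}
        for batch in self._batches:
            for name in names:
                table[name].extend(batch[name])
        return table


full = ["id", "name", "score"]
empty_projection = ToyReader(full).project([]).read_all()
single_column = ToyReader(full).project(["name"]).read_all()
duplicates = ToyReader(full).project(["name", "id", "name"]).read_all()
```

Caching the projection only after the check passes mirrors the "after native projection succeeds" ordering, so a failed `project()` call leaves the fallback untouched, and `schema` is never overwritten.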