QuakeWang opened a new pull request, #28:
URL: https://github.com/apache/paimon-mosaic/pull/28

   ## Problem
   
   Python `read_table(..., columns=...)` loses the requested projection when 
reading a Mosaic file with zero row groups.
   
   `MosaicReader.read_all()` normally takes its result schema from the first 
returned batch, so projected reads of files that have row groups return the 
expected schema. A file with zero row groups yields no batch, and the fallback 
used the full reader schema instead. As a result, both `columns=[]` and 
projected column reads returned an empty table carrying the full file schema.
   
   ## Fix
   
   Keep the core/FFI behavior unchanged and make the Python fallback 
projection-aware:
   
     * Cache the projected schema in `MosaicReader.project()` after native 
projection succeeds.
     * Use the cached projected schema in `read_all()` when there are no 
batches.
     * Preserve `reader.schema` as the full file schema.
     * Match the native handling of duplicate projected columns by keeping 
only the first occurrence of each.
     * Add Python coverage for zero-row-group reads with empty, single-column, 
and duplicate projections.
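
   The fallback described in the list above can be sketched in plain Python. This is a minimal illustration, not the code in this PR: `SketchReader`, its `row_groups` and `_projected` attributes, and the `(name, type)` tuples standing in for Arrow fields are all hypothetical, and the real `MosaicReader` delegates projection to the native layer.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional, Tuple

Field = Tuple[str, str]  # (column name, type name): stand-in for an Arrow field
Row = dict


@dataclass
class SketchReader:
    """Hypothetical stand-in for MosaicReader; not the real class."""

    schema: List[Field]  # full file schema, never mutated by project()
    row_groups: List[List[Row]] = field(default_factory=list)
    _projected: Optional[List[Field]] = None  # cached by project()

    def project(self, columns: Iterable[str]) -> "SketchReader":
        types = dict(self.schema)
        projected: List[Field] = []
        seen = set()
        for name in columns:
            if name not in types:
                raise KeyError(f"unknown column: {name}")
            if name in seen:
                continue  # duplicate: keep the first occurrence only
            seen.add(name)
            projected.append((name, types[name]))
        # Cache only once the whole projection has been validated.
        self._projected = projected
        return self

    def _batches(self) -> Iterator[Tuple[List[Field], List[Row]]]:
        # One (schema, rows) batch per row group, already projected.
        out = self.schema if self._projected is None else self._projected
        names = [name for name, _ in out]
        for group in self.row_groups:
            yield out, [{n: row[n] for n in names} for row in group]

    def read_all(self) -> Tuple[List[Field], List[Row]]:
        batches = list(self._batches())
        if batches:
            result_schema = batches[0][0]  # usual path: first batch schema
        elif self._projected is not None:
            result_schema = self._projected  # an empty list is a valid projection
        else:
            result_schema = self.schema  # no projection requested
        rows = [row for _, batch in batches for row in batch]
        return result_schema, rows


FULL = [("id", "int64"), ("name", "string")]

# Zero row groups in every case below.
unprojected = SketchReader(schema=list(FULL))
no_columns = SketchReader(schema=list(FULL)).project([]).read_all()
one_column = SketchReader(schema=list(FULL)).project(["name"]).read_all()
duplicates = SketchReader(schema=list(FULL)).project(["name", "id", "name"]).read_all()
```

   Note the `is not None` check in `read_all()`: testing the cached projection for truthiness would treat `columns=[]` as "no projection" and fall back to the full schema again.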

