QuakeWang opened a new pull request, #8017:
URL: https://github.com/apache/paimon/pull/8017

   ### Purpose
   
   close: #7998 
   
   Daft's Paimon reader already chooses internally between native Parquet reads 
and the pypaimon fallback, but that routing decision was not observable from 
the public Paimon Daft API. `ReadBuilder.explain()` only describes the Paimon 
scan plan, so users could not diagnose whether a slow scan was caused by PK 
merge, deletion vectors, BLOB columns, a non-Parquet format, or pushdown 
behavior.
   
   This PR adds a structured Daft-side scan explain API:
   
     - `explain_paimon_scan(...)`
     - `PaimonTable.explain_scan(...)`
   
   The result includes the underlying Paimon scan explain plus Daft reader 
routing details: the number of splits and files routed to the native and 
fallback readers, fallback reasons, pushed and remaining filters, projection 
and limit pushdown status, and an optional per-split reader mode.
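
As a rough illustration of the routing summary described above, the following self-contained sketch classifies splits and aggregates counts and fallback reasons. Every name here (`SplitInfo`, `fallback_reasons`, `summarize_routing`, the reason strings) is hypothetical and is not taken from this PR or from pypaimon. The rule that each listed condition forces a fallback is also an assumption made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SplitInfo:
    """Hypothetical per-split facts; not pypaimon's real split type."""
    file_format: str
    file_count: int
    needs_pk_merge: bool = False
    has_deletion_vectors: bool = False
    has_blob_columns: bool = False


def fallback_reasons(split: SplitInfo) -> List[str]:
    """Return the (assumed) reasons a split cannot use the native reader."""
    reasons = []
    if split.needs_pk_merge:
        reasons.append("pk_merge")
    if split.has_deletion_vectors:
        reasons.append("deletion_vectors")
    if split.has_blob_columns:
        reasons.append("blob_columns")
    if split.file_format.lower() != "parquet":
        reasons.append("non_parquet_format")
    return reasons


def summarize_routing(splits: List[SplitInfo]) -> Dict[str, object]:
    """Aggregate split counts, file counts, and reasons per reader mode."""
    summary = {
        "native_splits": 0,
        "native_files": 0,
        "fallback_splits": 0,
        "fallback_files": 0,
        "fallback_reasons": Counter(),
        "per_split_mode": [],
    }
    for split in splits:
        reasons = fallback_reasons(split)
        # A split with no fallback reason is assumed to be read natively.
        mode = "fallback" if reasons else "native"
        summary[f"{mode}_splits"] += 1
        summary[f"{mode}_files"] += split.file_count
        summary["fallback_reasons"].update(reasons)
        summary["per_split_mode"].append(mode)
    return summary
```

A report shaped like this would let a user see at a glance how much of a scan left the native path and why, which is the diagnostic gap the PR describes.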
   
   The implementation reuses the same scan builder, partition filtering, and 
native/fallback routing helpers used by `PaimonDataSource.get_tasks()` to avoid 
divergence between diagnostics and actual execution.
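
The design choice above, one routing helper shared by execution and diagnostics, can be sketched as follows. This is a minimal illustration under stated assumptions: `route_files`, `plan_tasks`, and `explain_routing` are hypothetical stand-ins, not the helpers in this PR, and the "Parquet means native" rule is a placeholder for the real routing logic.

```python
from typing import Dict, Iterable, List, Tuple


def route_files(files: Iterable[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Single source of truth for routing (placeholder rule: Parquet is native)."""
    return [
        (f["path"], "native" if f["format"] == "parquet" else "fallback")
        for f in files
    ]


def plan_tasks(files: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Execution path: one hypothetical task descriptor per file."""
    return [{"path": path, "reader": mode} for path, mode in route_files(files)]


def explain_routing(files: Iterable[Dict[str, str]]) -> Dict[str, int]:
    """Diagnostics path: counts per reader mode, derived from the same helper."""
    counts = {"native": 0, "fallback": 0}
    for _, mode in route_files(files):
        counts[mode] += 1
    return counts
```

Because both paths call the same function, a change to the routing rule shows up in the explain output and in the planned tasks at the same time.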
   
   ### Tests
   
     - `pytest paimon-python/pypaimon/tests/daft/daft_explain_test.py -q`
     - `pytest paimon-python/pypaimon/tests/daft/daft_data_test.py paimon-python/pypaimon/tests/daft/daft_sink_test.py -q`

