QuakeWang opened a new pull request, #8017:
URL: https://github.com/apache/paimon/pull/8017
### Purpose
close: #7998
Daft's Paimon reader already chooses between native Parquet reads and
pypaimon fallback internally, but that routing decision was not observable from
the public Paimon Daft API. `ReadBuilder.explain()` only describes the Paimon
scan plan, so users could not diagnose whether a slow scan was caused by PK
merge, deletion vectors, BLOB columns, a non-Parquet format, or pushdown behavior.
This PR adds a structured Daft-side scan explain API:
- `explain_paimon_scan(...)`
- `PaimonTable.explain_scan(...)`
The result includes the underlying Paimon scan explain plus Daft reader
routing details: split and file counts for the native and fallback paths,
fallback reasons, pushed and remaining filters, projection and limit pushdown
status, and an optional per-split reader mode.
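
To make the shape of the result easier to picture, here is a minimal sketch of a container holding the fields listed above. The class name `ScanExplainSketch`, every field name, and the `summary()` helper are assumptions made for illustration; they are not taken from this PR, and the actual return type may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ScanExplainSketch:
    """Hypothetical stand-in for the structured explain result (names are assumptions)."""

    paimon_plan: str                      # text of the underlying Paimon scan explain
    native_splits: int = 0                # splits routed to native Parquet reads
    fallback_splits: int = 0              # splits routed to the pypaimon fallback
    native_files: int = 0
    fallback_files: int = 0
    fallback_reasons: Dict[str, int] = field(default_factory=dict)
    pushed_filters: List[str] = field(default_factory=list)
    remaining_filters: List[str] = field(default_factory=list)
    projection_pushed: bool = False
    limit_pushed: bool = False
    split_modes: Optional[List[str]] = None  # optional per-split reader mode

    def summary(self) -> str:
        """One-line overview of how the scan was routed."""
        total = self.native_splits + self.fallback_splits
        reasons = ", ".join(
            f"{name}={count}" for name, count in sorted(self.fallback_reasons.items())
        ) or "none"
        return f"{self.native_splits}/{total} splits native; fallback reasons: {reasons}"


# Illustrative values only; nothing here comes from a real table.
example = ScanExplainSketch(
    paimon_plan="<output of ReadBuilder.explain()>",
    native_splits=3,
    fallback_splits=1,
    native_files=6,
    fallback_files=2,
    fallback_reasons={"deletion_vectors": 1},
    pushed_filters=["dt = '2024-01-01'"],
    remaining_filters=["length(name) > 3"],
    projection_pushed=True,
)
print(example.summary())
```

A summary line of this kind is what lets a user see at a glance whether a slow scan is dominated by fallback reads and why.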
The implementation reuses the same scan builder, partition filtering, and
native/fallback routing helpers used by `PaimonDataSource.get_tasks()` to avoid
divergence between diagnostics and actual execution.
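
The idea of sharing one routing helper between execution and diagnostics can be sketched roughly as follows. This is not the PR's code: `SplitSketch`, `route_split`, `plan_tasks`, and `explain_routing` are hypothetical names, and the fallback conditions are simply the ones named in the Purpose section.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SplitSketch:
    """Hypothetical, simplified view of a split (not the real Paimon split type)."""

    file_format: str
    needs_pk_merge: bool = False
    has_deletion_vectors: bool = False
    has_blob_columns: bool = False


def route_split(split: SplitSketch) -> Tuple[str, Optional[str]]:
    """Single source of truth: return (reader mode, fallback reason or None)."""
    if split.needs_pk_merge:
        return "fallback", "pk_merge"
    if split.has_deletion_vectors:
        return "fallback", "deletion_vectors"
    if split.has_blob_columns:
        return "fallback", "blob_columns"
    if split.file_format != "parquet":
        return "fallback", "non_parquet_format"
    return "native", None


def plan_tasks(splits: List[SplitSketch]) -> List[str]:
    """Execution path: decide the reader mode for each split."""
    return [route_split(s)[0] for s in splits]


def explain_routing(splits: List[SplitSketch]) -> Dict[str, Dict[str, int]]:
    """Diagnostics path: calls the same helper, so it cannot drift from execution."""
    counts = {"native": 0, "fallback": 0}
    reasons: Dict[str, int] = {}
    for s in splits:
        mode, reason = route_split(s)
        counts[mode] += 1
        if reason is not None:
            reasons[reason] = reasons.get(reason, 0) + 1
    return {"counts": counts, "reasons": reasons}


splits = [
    SplitSketch("parquet"),
    SplitSketch("parquet", has_deletion_vectors=True),
    SplitSketch("orc"),
]
print(plan_tasks(splits))
print(explain_routing(splits))
```

Because both functions go through `route_split`, a change to the routing rules is reflected in the explain output without a second code path to keep in sync.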
### Tests
- `pytest paimon-python/pypaimon/tests/daft/daft_explain_test.py -q`
- `pytest paimon-python/pypaimon/tests/daft/daft_data_test.py paimon-python/pypaimon/tests/daft/daft_sink_test.py -q`