alamb commented on issue #9371:
URL: https://github.com/apache/datafusion/issues/9371#issuecomment-2624551049
Some suggestions for anyone who wants to implement this feature:

Add a configuration option such as `datafusion.explain.format`, defaulting to
`'indent'`, that controls how the explain plan is displayed.

For example:
```sql
> set datafusion.explain.format = 'pretty'
> EXPLAIN .....
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Limit: skip=0, fetch=25 |
|               |   Sort: l DESC NULLS FIRST, fetch=25 |
|               |     Projection: regexp_replace(hits.parquet.Referer,Utf8("^https?://(?:www\.)?([^/]+)/.*$"),Utf8("\1")) AS k, AVG(character_length(hits.parquet.Referer)) AS l, COUNT(*) AS c, MIN(hits.parquet.Referer) |
|               |       Filter: COUNT(*) > Int64(100000) |
|               |         Aggregate: groupBy=[[regexp_replace(hits.parquet.Referer, Utf8("^https?://(?:www\.)?([^/]+)/.*$"), Utf8("\1"))]], aggr=[[AVG(CAST(character_length(hits.parquet.Referer) AS Float64)), COUNT(UInt8(1)) AS COUNT(*), MIN(hits.parquet.Referer)]] |
|               |           Filter: hits.parquet.Referer != Utf8("") |
|               |             TableScan: hits.parquet projection=[Referer], partial_filters=[hits.parquet.Referer != Utf8("")] |
| physical_plan |
┌─────────────────────────────┐
│┌───────────────────────────┐│
││ Physical Plan ││
│└───────────────────────────┘│
└─────────────────────────────┘
┌───────────────────────────┐
│ TOP_N │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ Top 25 │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ avg(length(hits.Referer)) │
│ DESC │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ PROJECTION │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ 0 │
│ l │
│ c │
│ min(Referer) │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ FILTER │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ (count_star() > 100000) │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ EC: 99997497 │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ HASH_GROUP_BY │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ #0 │
│ count_star() │
│ avg(#1) │
│ min(#2) │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ PROJECTION │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ k │
│ length(Referer) │
│ Referer │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ FILTER │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ (Referer != '') │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ EC: 99997497 │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ PARQUET_SCAN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ Referer │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ EC: 99997497 │
└───────────────────────────┘
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set. Query took 0.045 seconds.
```
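A minimal sketch of how the proposed option value could be parsed, assuming a hypothetical `ExplainFormat` enum (that name and its variants are illustrative and not an existing DataFusion type). It uses only the standard library so the idea can be tried in isolation:

```rust
use std::str::FromStr;

/// Hypothetical enum backing the proposed `datafusion.explain.format` option.
/// Illustrative only; not an existing DataFusion type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum ExplainFormat {
    /// The existing indented text output, used as the default.
    #[default]
    Indent,
    /// The proposed box-drawing tree output.
    Pretty,
}

impl FromStr for ExplainFormat {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept values case-insensitively and ignore surrounding whitespace.
        match s.trim().to_ascii_lowercase().as_str() {
            "indent" => Ok(ExplainFormat::Indent),
            "pretty" => Ok(ExplainFormat::Pretty),
            other => Err(format!(
                "invalid explain format '{}', expected 'indent' or 'pretty'",
                other
            )),
        }
    }
}
```

With this shape, `"pretty".parse::<ExplainFormat>()` selects the new output, and an unset option falls back to `ExplainFormat::default()`, which is `Indent`.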
## Implementation suggestion:
1. Add a new `Pretty` variant to
https://docs.rs/datafusion/latest/datafusion/physical_plan/enum.DisplayFormatType.html
2. Implement it for only a few nodes first (`ProjectionExec`, `FilterExec`,
etc.) and leave the implementation of the other nodes as placeholders.
3. Add a test in `sqllogictests/test_files/explain_pretty.slt` for a basic
query like `SELECT * FROM foo`.
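Steps 1 and 2 could look roughly like the sketch below. It uses simplified, standalone stand-ins for `DisplayFormatType` and the `DisplayAs` trait so that it compiles without the `datafusion` crate. The `ToyFilter` and `ToySort` nodes, the `render` helper, and the placeholder text are all hypothetical:

```rust
use std::fmt;

/// Simplified stand-in for DataFusion's `DisplayFormatType`,
/// with the proposed `Pretty` variant added.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DisplayFormatType {
    Default,
    Verbose,
    /// Proposed new variant.
    Pretty,
}

/// Simplified stand-in for the `DisplayAs` trait implemented by plan nodes.
pub trait DisplayAs {
    fn fmt_as(&self, t: DisplayFormatType, f: &mut fmt::Formatter) -> fmt::Result;
}

/// Hypothetical node that already has a pretty rendering.
pub struct ToyFilter {
    pub predicate: String,
}

impl DisplayAs for ToyFilter {
    fn fmt_as(&self, t: DisplayFormatType, f: &mut fmt::Formatter) -> fmt::Result {
        match t {
            DisplayFormatType::Default | DisplayFormatType::Verbose => {
                write!(f, "FilterExec: {}", self.predicate)
            }
            // Title line followed by the detail line, ready to be boxed.
            DisplayFormatType::Pretty => write!(f, "FILTER\n{}", self.predicate),
        }
    }
}

/// Hypothetical node that has not been ported yet.
pub struct ToySort {
    pub expr: String,
}

impl DisplayAs for ToySort {
    fn fmt_as(&self, t: DisplayFormatType, f: &mut fmt::Formatter) -> fmt::Result {
        match t {
            DisplayFormatType::Default | DisplayFormatType::Verbose => {
                write!(f, "SortExec: expr=[{}]", self.expr)
            }
            // Placeholder so the first PR can stay small.
            DisplayFormatType::Pretty => {
                write!(f, "SortExec (pretty format not yet implemented)")
            }
        }
    }
}

/// Adapter that lets a node be formatted with `to_string()`.
struct Wrapper<'a, T: DisplayAs + 'a> {
    node: &'a T,
    format: DisplayFormatType,
}

impl<'a, T: DisplayAs + 'a> fmt::Display for Wrapper<'a, T> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        self.node.fmt_as(self.format, f)
    }
}

/// Render a single node in the requested format.
pub fn render<T: DisplayAs>(node: &T, format: DisplayFormatType) -> String {
    Wrapper { node, format }.to_string()
}
```

Keeping the placeholder arm explicit in each node means a missing pretty rendering still shows up in `EXPLAIN` output, where it is easy to spot and fill in later.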
Once we get the basic framework in place, we can fill out the implementation
for the rest of the `ExecutionPlan` nodes and figure out how to do the same
for `LogicalPlan`.

This strategy would let us get this project started incrementally, without
having to make one massive PR.
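For a sense of what the basic framework has to do, here is a rough sketch of drawing boxes like the ones in the example for a plan that is a simple linear chain. The `render_chain` function, its input shape, and the fixed 27-column width are assumptions made for illustration. This is not DataFusion's rendering code, and plans with more than one child per node would need a real layout step:

```rust
/// Inner width of each box in characters (the boxes in the example are 27 wide).
const WIDTH: usize = 27;

/// Center `text` within WIDTH columns, truncating it if it is too long.
fn center(text: &str) -> String {
    let truncated: String = text.chars().take(WIDTH).collect();
    let len = truncated.chars().count();
    let left = (WIDTH - len) / 2;
    let right = WIDTH - len - left;
    format!("{}{}{}", " ".repeat(left), truncated, " ".repeat(right))
}

/// Render a linear chain of plan nodes, root first.
/// Each node is a `(title, detail lines)` pair.
pub fn render_chain(nodes: &[(&str, Vec<&str>)]) -> String {
    let half = WIDTH / 2;
    let mut out = String::new();
    for (i, (title, details)) in nodes.iter().enumerate() {
        // Top border: nodes below the root get a connector to their parent.
        if i == 0 {
            out.push_str(&format!("┌{}┐\n", "─".repeat(WIDTH)));
        } else {
            out.push_str(&format!(
                "┌{}┴{}┐\n",
                "─".repeat(half),
                "─".repeat(WIDTH - half - 1)
            ));
        }
        out.push_str(&format!("│{}│\n", center(title)));
        if !details.is_empty() {
            // Dashed separator between the title and the detail lines.
            out.push_str(&format!("│{}│\n", center("─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─")));
            for detail in details {
                out.push_str(&format!("│{}│\n", center(detail)));
            }
        }
        // Bottom border: every node except the last connects to its child.
        if i + 1 == nodes.len() {
            out.push_str(&format!("└{}┘\n", "─".repeat(WIDTH)));
        } else {
            out.push_str(&format!(
                "└{}┬{}┘\n",
                "─".repeat(half),
                "─".repeat(WIDTH - half - 1)
            ));
        }
    }
    out
}
```

Calling `render_chain(&[("FILTER", vec!["(Referer != '')"]), ("PARQUET_SCAN", vec!["Referer"])])` yields two stacked boxes joined by a `┬`/`┴` connector, in the same style as the physical plan in the example.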
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]