adriangb opened a new pull request, #21767:
URL: https://github.com/apache/datafusion/pull/21767

   ## Which issue does this PR close?
   
   - Closes #.
   
   ## Rationale for this change
   
   DataFusion already emits PostgreSQL JSON (pgjson) for logical plans via 
`EXPLAIN (FORMAT pgjson) ...`. This PR extends that support to `EXPLAIN 
ANALYZE` so the physical plan, along with live execution metrics, can be fed 
into pgjson visualizers such as [Dalibo](https://explain.dalibo.com/) and PEV2.
   
   Today, `EXPLAIN ANALYZE FORMAT pgjson` is explicitly rejected in the planner 
with `"EXPLAIN ANALYZE with FORMAT is not supported"`. With this PR the 
restriction is lifted for pgjson.
   
   ## What changes are included in this PR?
   
   - Add a `format: ExplainFormat` field to the logical `Analyze` node and the 
physical `AnalyzeExec` operator, threaded through SQL parsing, logical 
planning, and physical planning.
   - Accept `EXPLAIN ANALYZE FORMAT pgjson <stmt>`. `Tree` and `Graphviz` with 
`ANALYZE` still error with a clear message (out of scope for this PR).
   - Add `DisplayableExecutionPlan::pgjson()` and a new 
`PgJsonExecutionPlanVisitor` that mirror the logical-plan `PgJsonVisitor`. 
Per-node output includes:
     - `Node Type`: `ExecutionPlan::name()`
     - `Details`: the one-line `DisplayAs` rendering for 
`DisplayFormatType::Default`
     - `Actual Rows` / `Actual Total Time`: PG-canonical metric keys populated 
from `output_rows` / `elapsed_compute`. `Actual Total Time` is emitted as float 
milliseconds and reflects DataFusion's compute time, not wall-clock time
     - `Extras`: remaining DataFusion metrics keyed by their native name
     - `Plans`: child nodes
   - Add an optional `set_summary()` builder so `AnalyzeExec` can attach `Total 
Rows` and `Duration` at the root in verbose mode.
   - Honor existing `analyze_level` / `analyze_categories` config exactly as 
`indent()` does.
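
To make the per-node shape concrete, here is a minimal, std-only sketch of how a plan tree could be rendered with the keys listed above. It is not the code in this PR: `Node`, `render`, `ns_to_ms`, `escape`, and `sample_plan` are hypothetical names, the sample `Details` strings and the `bytes_scanned` extra are illustrative only, and omitting absent metrics (rather than emitting nulls) is an assumption.

```rust
/// Hypothetical stand-in for a physical plan node plus its collected metrics.
/// This is NOT DataFusion's `ExecutionPlan`; it only models the fields that
/// the description above lists.
struct Node {
    name: &'static str,
    details: &'static str,
    output_rows: Option<u64>,
    elapsed_compute_ns: Option<u64>,
    extras: Vec<(&'static str, u64)>,
    children: Vec<Node>,
}

/// Nanoseconds to float milliseconds.
fn ns_to_ms(ns: u64) -> f64 {
    ns as f64 / 1_000_000.0
}

/// Minimal JSON string escaping: backslashes and double quotes only.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Renders one node and, recursively, its children as a compact JSON object.
/// Assumption: metrics that were not recorded are simply left out.
fn render(node: &Node) -> String {
    let mut out = format!(
        "{{\"Node Type\":\"{}\",\"Details\":\"{}\"",
        escape(node.name),
        escape(node.details)
    );
    if let Some(rows) = node.output_rows {
        out.push_str(&format!(",\"Actual Rows\":{}", rows));
    }
    if let Some(ns) = node.elapsed_compute_ns {
        // `{:?}` keeps a decimal point so the value reads as a float.
        out.push_str(&format!(",\"Actual Total Time\":{:?}", ns_to_ms(ns)));
    }
    if !node.extras.is_empty() {
        let pairs: Vec<String> = node
            .extras
            .iter()
            .map(|(k, v)| format!("\"{}\":{}", escape(k), v))
            .collect();
        out.push_str(&format!(",\"Extras\":{{{}}}", pairs.join(",")));
    }
    if !node.children.is_empty() {
        let plans: Vec<String> = node.children.iter().map(render).collect();
        out.push_str(&format!(",\"Plans\":[{}]", plans.join(",")));
    }
    out.push('}');
    out
}

/// Illustrative two-node plan; all values are made up for the example.
fn sample_plan() -> Node {
    Node {
        name: "ProjectionExec",
        details: "expr=[a@0 as a]",
        output_rows: Some(3),
        elapsed_compute_ns: Some(1_500_000),
        extras: vec![],
        children: vec![Node {
            name: "DataSourceExec",
            details: "partitions=1",
            output_rows: Some(3),
            elapsed_compute_ns: None,
            extras: vec![("bytes_scanned", 128)],
            children: vec![],
        }],
    }
}
```

Calling `render(&sample_plan())` yields a single nested object in which the child appears under `Plans` and the root's 1,500,000 ns of compute time shows up as `"Actual Total Time":1.5`.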
   
   ## Are these changes tested?
   
   - Unit tests in `datafusion/physical-plan/src/display.rs`:
     - `pgjson_renders_plan_without_metrics`
     - `pgjson_includes_summary_when_set`
     - `pgjson_snapshot_of_sample_plan` (insta snapshot)
   - sqllogictest coverage in 
`datafusion/sqllogictest/test_files/explain_analyze.slt`:
     - Structural golden for `EXPLAIN ANALYZE FORMAT pgjson` with 
`analyze_categories = 'none'`
     - Negative tests for `EXPLAIN ANALYZE FORMAT tree` and `EXPLAIN ANALYZE 
FORMAT graphviz`
   - `cargo clippy --all-targets --all-features -- -D warnings` clean on the 
touched crates; `cargo fmt --all` clean.
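
The accept/reject behavior that the sqllogictest cases exercise can be pictured with the following std-only sketch. The enums and `resolve_analyze_format` are hypothetical stand-ins rather than DataFusion's `ExplainFormat` or planner code, and the error text is an assumption, since the description only says the message is clear.

```rust
/// Formats a user might request with `EXPLAIN ANALYZE FORMAT ...`.
/// Hypothetical enum covering only the formats named in this description.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RequestedFormat {
    PgJson,
    Tree,
    Graphviz,
}

/// Output styles that `EXPLAIN ANALYZE` can produce after this change.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AnalyzeOutput {
    Indent,
    PgJson,
}

/// Maps the optional `FORMAT` clause to an output style.
/// No clause keeps the existing indent output; pgjson is newly accepted;
/// tree and graphviz are rejected (error wording is assumed here).
fn resolve_analyze_format(
    requested: Option<RequestedFormat>,
) -> Result<AnalyzeOutput, String> {
    match requested {
        None => Ok(AnalyzeOutput::Indent),
        Some(RequestedFormat::PgJson) => Ok(AnalyzeOutput::PgJson),
        Some(other) => Err(format!(
            "EXPLAIN ANALYZE does not support FORMAT {:?}",
            other
        )),
    }
}
```

Under these assumptions, `resolve_analyze_format(None)` keeps today's default, while passing `Some(RequestedFormat::Tree)` or `Some(RequestedFormat::Graphviz)` returns an `Err`, mirroring the negative tests.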
   
   ## Are there any user-facing changes?
   
   Yes — new syntax is accepted:
   
   ```sql
   EXPLAIN ANALYZE FORMAT pgjson SELECT count(*) FROM t;
   ```
   
   No existing behavior changes: the default (`EXPLAIN ANALYZE ...` with no 
`FORMAT`) still emits the indent-format plan with metrics, and `EXPLAIN (FORMAT 
pgjson) ...` on the logical plan is unchanged.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

