adriangb opened a new pull request, #21768:
URL: https://github.com/apache/datafusion/pull/21768

   ## Which issue does this PR close?
   
   - Closes #.
   
   (Follow-up to #21160, which introduced per-category metric filtering via 
session config. This PR lets users reach those knobs inline from the EXPLAIN 
statement.)
   
   ## Rationale for this change
   
   #21160 added metric categories (`Rows`, `Bytes`, `Timing`, `Uncategorized`) 
and a verbosity level (`Summary`, `Dev`) to DataFusion's metrics, exposed today 
only via session config:
   
   - `datafusion.explain.analyze_categories`
   - `datafusion.explain.analyze_level`
   
   Users have to `SET` these out-of-band before running `EXPLAIN ANALYZE`, 
which is awkward for ad-hoc debugging. Postgres solves this with its 
parenthesized option list:
   
   ```sql
   EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS, WAL) SELECT ... ;
   ```
   
   This PR adds the same ergonomics to DataFusion, mapping option names to 
DataFusion's existing semantics rather than Postgres's buffer/WAL model.
   
   ## What changes are included in this PR?
   
   **Parser.** On dialects whose `supports_explain_with_utility_options()` 
returns true (the default `GenericDialect`, `PostgreSqlDialect`, 
`DuckDbDialect`, etc.), `DFParser::parse_explain` delegates to sqlparser's 
public `parse_utility_options()` method and feeds the result through a new 
`ExplainStatementOptions::from_utility_options`. The legacy keyword form 
(`EXPLAIN ANALYZE VERBOSE FORMAT tree ...`) is unchanged.
   
   **Normalized option type.** A new `ExplainStatementOptions` in 
`datafusion-common` captures the knobs parsed from either form. Argument 
parsing reuses existing `ExplainFormat::from_str`, 
`ExplainAnalyzeCategories::from_str`, and `MetricType::from_str`.
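
For illustration, `from_str`-style parsing of a `METRICS` argument (`'all'`, `'none'`, or a comma-separated list, per the options table) could look roughly like the sketch below. It is not the PR's code: `Category` and `CategorySet` are hypothetical stand-ins for the real types, and the case-insensitive matching and the error text are assumptions.

```rust
use std::collections::BTreeSet;
use std::str::FromStr;

/// Hypothetical stand-in for the four metric categories named in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Category {
    Rows,
    Bytes,
    Timing,
    Uncategorized,
}

/// Hypothetical stand-in for a parsed METRICS argument.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CategorySet(BTreeSet<Category>);

impl FromStr for CategorySet {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumption: matching ignores case and surrounding whitespace.
        let lowered = s.trim().to_ascii_lowercase();
        let set: BTreeSet<Category> = match lowered.as_str() {
            "all" => [
                Category::Rows,
                Category::Bytes,
                Category::Timing,
                Category::Uncategorized,
            ]
            .iter()
            .copied()
            .collect(),
            "none" => BTreeSet::new(),
            // Anything else is treated as a comma-separated list of names.
            list => list
                .split(',')
                .map(|name| match name.trim() {
                    "rows" => Ok(Category::Rows),
                    "bytes" => Ok(Category::Bytes),
                    "timing" => Ok(Category::Timing),
                    "uncategorized" => Ok(Category::Uncategorized),
                    other => Err(format!("unknown metric category '{}'", other)),
                })
                .collect::<Result<BTreeSet<Category>, String>>()?,
        };
        Ok(CategorySet(set))
    }
}
```

With this sketch, `"rows, timing".parse::<CategorySet>()` yields a two-element set, while a list containing an unrecognized name such as `buffers` returns an error instead of being silently dropped.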
   
   **Options accepted:**
   
   | Option    | Argument        | Effect                                                                  |
   | --------- | --------------- | ----------------------------------------------------------------------- |
   | `ANALYZE` | bool, default T | Same as keyword `ANALYZE`                                               |
   | `VERBOSE` | bool, default T | Same as keyword `VERBOSE`                                               |
   | `FORMAT`  | ident/string    | `indent` / `tree` / `pgjson` / `graphviz`                               |
   | `METRICS` | string          | `'all'`, `'none'`, or comma-separated `rows,bytes,timing,uncategorized` |
   | `LEVEL`   | ident/string    | `summary` or `dev`                                                      |
   | `TIMING`  | bool            | Sugar: toggles inclusion of the `timing` category                       |
   | `SUMMARY` | bool            | Sugar: TRUE → `summary`, FALSE → `dev`                                  |
   | `COSTS`   | bool            | Per-statement `show_statistics` override (not valid with `ANALYZE`)     |
   
   Postgres-only options (`BUFFERS`, `WAL`, `SETTINGS`, `GENERIC_PLAN`, 
`MEMORY`) return a helpful unsupported-option error.
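
The option handling described above can be sketched as follows. This is a hedged illustration, not the PR's implementation: `OptionKind`, `Level`, `classify`, `parse_bool`, and `level_from_summary` are hypothetical names. Treating option names as case-insensitive, accepting `TRUE`/`FALSE` alongside `ON`/`OFF`, and treating every bare boolean option as TRUE (the table only states that default for `ANALYZE` and `VERBOSE`) are assumptions.

```rust
/// Hypothetical stand-in for the verbosity level (`summary` / `dev`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Summary,
    Dev,
}

/// How an option name from the parenthesized list is treated (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum OptionKind {
    Supported,
    PostgresOnly,
    Unknown,
}

/// Classify an option name using the lists from the PR description.
fn classify(name: &str) -> OptionKind {
    // Assumption: option names are matched case-insensitively.
    match name.to_ascii_uppercase().as_str() {
        "ANALYZE" | "VERBOSE" | "FORMAT" | "METRICS" | "LEVEL" | "TIMING" | "SUMMARY"
        | "COSTS" => OptionKind::Supported,
        "BUFFERS" | "WAL" | "SETTINGS" | "GENERIC_PLAN" | "MEMORY" => OptionKind::PostgresOnly,
        _ => OptionKind::Unknown,
    }
}

/// Parse a boolean option argument; `None` models a bare option with no argument.
fn parse_bool(arg: Option<&str>) -> Result<bool, String> {
    match arg.map(|a| a.to_ascii_uppercase()).as_deref() {
        None | Some("ON") | Some("TRUE") => Ok(true),
        Some("OFF") | Some("FALSE") => Ok(false),
        Some(other) => Err(format!("expected a boolean, got '{}'", other)),
    }
}

/// SUMMARY sugar from the table: TRUE selects `summary`, FALSE selects `dev`.
fn level_from_summary(on: bool) -> Level {
    if on {
        Level::Summary
    } else {
        Level::Dev
    }
}
```

Separating `PostgresOnly` from `Unknown` is what would let the parser report a targeted unsupported-option message for names such as `BUFFERS` rather than a generic unknown-option error.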
   
   **Logical plan.** `Analyze` gains `analyze_level: Option<MetricType>` and 
`analyze_categories: Option<ExplainAnalyzeCategories>`. `Explain` gains 
`show_statistics: Option<bool>`. `None` means "fall back to session config" — 
existing callers are unchanged.
   
   **Physical planner.** `handle_analyze` and `handle_explain` prefer 
statement-level overrides over session config before constructing `AnalyzeExec` 
/ `ExplainExec`. `AnalyzeExec` itself needs no change — it already accepts the 
filters from #21160.
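
The "statement override first, session config second" rule reduces to an `Option` fallback. The sketch below shows the idea with hypothetical names (`MetricLevel`, `resolve`); the actual planner code in this PR may be structured differently.

```rust
/// Hypothetical stand-in for the verbosity level carried by the plan node.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MetricLevel {
    Summary,
    Dev,
}

/// Prefer the per-statement override; fall back to the session value when the
/// statement did not set one (`None`).
fn resolve<T>(statement_override: Option<T>, session_value: T) -> T {
    statement_override.unwrap_or(session_value)
}
```

For example, `resolve(Some(MetricLevel::Dev), MetricLevel::Summary)` picks `Dev`, while `resolve(None, MetricLevel::Summary)` keeps the session's `Summary`, which matches the description that plans without overrides behave as before.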
   
   **Proto** (follow-up, see TODOs in 
`datafusion/proto/src/logical_plan/mod.rs`): the new override fields are not 
yet serialized. They default to `None` on the remote side, matching pre-PR 
behavior; round-trip tests still pass.
   
   ## Are these changes tested?
   
   Yes:
   
   - **Unit tests** in `datafusion/sql/src/parser.rs` cover the legacy keyword 
form on the PostgreSQL dialect, each option form (`bare`, `= val`, `ON/OFF`, 
quoted), unknown-option errors, dialect gating (the parenthesized form is 
rejected under a dialect that doesn't enable it), and the error path for 
unsupported Postgres-only options.
   - **Integration tests** in `datafusion/core/tests/sql/explain_analyze.rs` — 
`explain_analyze_paren_metrics_filtering`, 
`explain_analyze_paren_level_overrides_session_config`, 
`explain_analyze_paren_metrics_overrides_session_config`, 
`explain_paren_buffers_rejected`.
   - **sqllogictest** fixtures in 
`datafusion/sqllogictest/test_files/explain.slt` covering the parenthesized 
form, round-trip with the legacy form, and each error path.
   
   Ran `cargo fmt --all` and `cargo clippy --all-targets --all-features -- -D 
warnings` (clean). Two pre-existing test failures on `main` 
(`test_display_pg_json` snapshot and a `pgjson` SLT case at `explain.slt:642`) 
are unrelated to this change — verified by running them against a clean 
checkout of the same base commit.
   
   ## Are there any user-facing changes?
   
   Yes — new syntax. User-facing docs updated at 
`docs/source/user-guide/explain-usage.md` with a new section describing the 
option list and the dialect gate. No breaking changes: the legacy keyword form 
continues to work exactly as before.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

