zhuqi-lucas opened a new pull request, #22733:
URL: https://github.com/apache/datafusion/pull/22733
## Which issue does this PR close?
Closes #22732.
## Rationale for this change
`LogicalPlan` is 320 bytes today, sized by the `Ddl(DdlStatement)` variant —
which is itself 320 bytes because it carries `CreateExternalTable` (312 bytes)
and `CreateFunction` (288 bytes). Every non-DDL variant is at most 176 bytes
(`Join`). Every SELECT query pays the full 320-byte payload on every
`mem::take` / `mem::swap` / `Arc<LogicalPlan>` write during planning, even
though it never instantiates a DDL node.
Profiling `sql_planner` (samply, macOS aarch64) showed ~13% of CPU in
`libsystem_platform.dylib` (memcpy / memmove). Shrinking the most
frequently-moved type directly attacks that pool.
## What changes are included in this PR?
Box the two oversized DDL variants:
```rust
pub enum DdlStatement {
CreateExternalTable(Box<CreateExternalTable>),
// …
CreateFunction(Box<CreateFunction>),
// …
}
```
After this change:
- `DdlStatement` drops from **320 → ~152 bytes** (max variant is now
`CreateIndex` at 144).
- `LogicalPlan` drops from **320 → 176 bytes (–45%)** — the enum
discriminant fits inside `Join`'s alignment padding, so the enum is exactly the
same width as `Join`.
DDL plan construction takes one extra `Box::new(...)` allocation per DDL
statement — negligible because DDL plans are one-shot and not on the per-query
hot path.
The diff is mechanical: construction sites add `Box::new(...)`, the few
field-destructuring patterns are converted to a `let CreateExternalTable { … }
= ce.as_ref();` shape, and the FFI/proto crates need one `*cmd` /
`Box::new(cmd)` adjustment at the type boundaries.
A new `test_size_of_logical_plan` unit test pins `size_of::<LogicalPlan>()
== 176` and asserts `DdlStatement` stays smaller than `Join`, so future variant
growth that would re-balloon the enum trips the test rather than silently
regressing the planning hot path. (Same shape as the existing
`test_size_of_expr` in `expr.rs`.)
## Are these changes tested?
- 711 optimizer + 217 expr + 86 sql + 7 proto unit tests pass.
- SLT `create_external_table` / `create_function` / `ddl` pass.
- `cargo clippy --workspace --all-targets -- -D warnings` clean.
## Benchmarks
Local `cargo bench -p datafusion --bench sql_planner --quick` on macOS
aarch64:
| bench | main | boxed | delta |
|---|---|---|---|
| `optimizer_tpch_all` | 8.61 ms | 8.18 ms | **–5.0%** |
| `optimizer_tpcds_all` | 168.0 ms | 163.5 ms | **–2.7%** |
CI bench on the GKE aarch64 runner will give a tighter signal — requesting
maintainer to trigger `run benchmark sql_planner`.
## Are there any user-facing changes?
Yes (semantic API only). Code that pattern-matches
`DdlStatement::CreateExternalTable(CreateExternalTable { … })` must change to
bind the box and deref. Code that constructs these variants must wrap with
`Box::new(...)`. No behavioral change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]