zhuqi-lucas opened a new pull request, #22733:
URL: https://github.com/apache/datafusion/pull/22733

   ## Which issue does this PR close?
   
   Closes #22732.
   
   ## Rationale for this change
   
   `LogicalPlan` is 320 bytes today, sized by the `Ddl(DdlStatement)` variant — 
which is itself 320 bytes because it carries `CreateExternalTable` (312 bytes) 
and `CreateFunction` (288 bytes). Every non-DDL variant is at most 176 bytes 
(`Join`). Every SELECT query pays the full 320-byte payload on every 
`mem::take` / `mem::swap` / `Arc<LogicalPlan>` write during planning, even 
though it never instantiates a DDL node.
   
   Profiling `sql_planner` (samply, macOS aarch64) showed ~13% of CPU in 
`libsystem_platform.dylib` (memcpy / memmove). Shrinking the most 
frequently-moved type directly attacks that pool.
   
   ## What changes are included in this PR?
   
   Box the two oversized DDL variants:
   
   ```rust
   pub enum DdlStatement {
       CreateExternalTable(Box<CreateExternalTable>),
       // …
       CreateFunction(Box<CreateFunction>),
       // …
   }
   ```
   
   After this change:
   
   - `DdlStatement` drops from **320 → ~152 bytes** (max variant is now 
`CreateIndex` at 144).
   - `LogicalPlan` drops from **320 → 176 bytes (–45%)** — the enum 
discriminant fits inside `Join`'s alignment padding, so the enum is exactly the 
same width as `Join`.
   
   DDL plan construction takes one extra `Box::new(...)` allocation per DDL 
statement — negligible because DDL plans are one-shot and not on the per-query 
hot path.
   
   The diff is mechanical: construction sites add `Box::new(...)`, the few 
field-destructuring patterns are converted to a `let CreateExternalTable { … } 
= ce.as_ref();` shape, and the FFI/proto crates need one `*cmd` / 
`Box::new(cmd)` adjustment at the type boundaries.
   
   A new `test_size_of_logical_plan` unit test pins `size_of::<LogicalPlan>() 
== 176` and asserts `DdlStatement` stays smaller than `Join`, so future variant 
growth that would re-balloon the enum trips the test rather than silently 
regressing the planning hot path. (Same shape as the existing 
`test_size_of_expr` in `expr.rs`.)
   
   ## Are these changes tested?
   
   - 711 optimizer + 217 expr + 86 sql + 7 proto unit tests pass.
   - SLT `create_external_table` / `create_function` / `ddl` pass.
   - `cargo clippy --workspace --all-targets -- -D warnings` clean.
   
   ## Benchmarks
   
   Local `cargo bench -p datafusion --bench sql_planner --quick` on macOS 
aarch64:
   
   | bench | main | boxed | delta |
   |---|---|---|---|
   | `optimizer_tpch_all` | 8.61 ms | 8.18 ms | **–5.0%** |
   | `optimizer_tpcds_all` | 168.0 ms | 163.5 ms | **–2.7%** |
   
   CI bench on the GKE aarch64 runner will give a tighter signal — requesting 
maintainer to trigger `run benchmark sql_planner`.
   
   ## Are there any user-facing changes?
   
   Yes (semantic API only). Code that pattern-matches 
`DdlStatement::CreateExternalTable(CreateExternalTable { … })` must change to 
bind the box and deref. Code that constructs these variants must wrap with 
`Box::new(...)`. No behavioral change.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to