adriangb opened a new pull request, #20341:
URL: https://github.com/apache/datafusion/pull/20341

   ## Summary
   
   Follow-up to #20117, which added the `ExtractLeafExpressions` and `PushDownLeafProjections` optimizer rules for get_field pushdown.
   
   Benchmarking revealed that these rules added 5-31% planning overhead to *all* queries (including those with no struct/get_field expressions). The cause: for every Filter/Sort/Limit/Aggregate/Join node, they unconditionally allocated column HashSets and extractors and walked every expression tree.
   
   This PR adds:
   - **`has_extractable_expr()` pre-scan**: A lightweight check using 
`Expr::exists()` that short-circuits before any expensive allocations when no 
`MoveTowardsLeafNodes` expressions are present
   - **Config option** `datafusion.optimizer.enable_leaf_expression_pushdown` 
to disable the rules entirely
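
The two guards above can be sketched with a toy expression tree. This is a minimal illustration of the short-circuit idea, not the code in this PR: the `Expr` enum, its `exists` method, and `should_run_rule` are hypothetical stand-ins, and `GetField` is assumed here to be the only node kind that counts as extractable.

```rust
// Toy stand-in for an expression tree; this is NOT DataFusion's `Expr`.
#[derive(Debug)]
enum Expr {
    Column(String),
    Literal(i64),
    // Models a struct field access such as `s['a']`.
    GetField { base: Box<Expr>, field: String },
    Binary { left: Box<Expr>, right: Box<Expr> },
}

impl Expr {
    /// Returns true as soon as `pred` matches this node or any descendant.
    /// It allocates nothing and stops at the first match.
    fn exists<F: Fn(&Expr) -> bool>(&self, pred: &F) -> bool {
        if pred(self) {
            return true;
        }
        match self {
            Expr::Column(_) | Expr::Literal(_) => false,
            Expr::GetField { base, .. } => base.exists(pred),
            Expr::Binary { left, right } => left.exists(pred) || right.exists(pred),
        }
    }
}

/// Cheap pre-scan: does any expression contain a node worth extracting?
/// (Assumption: only `GetField` qualifies in this toy model.)
fn has_extractable_expr(exprs: &[Expr]) -> bool {
    exprs
        .iter()
        .any(|e| e.exists(&|n: &Expr| matches!(n, Expr::GetField { .. })))
}

/// Hypothetical guard combining the config flag with the pre-scan.
/// The expensive work (HashSets, extractors) would only start when this is true.
fn should_run_rule(enabled: bool, exprs: &[Expr]) -> bool {
    enabled && has_extractable_expr(exprs)
}
```

With this shape, a query with no field accesses pays only for one read-only walk per node, and a disabled flag skips even that because `&&` evaluates the flag first.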
   
   ### Benchmark Results (vs no-rules baseline)
   
   | Benchmark | Overhead before fix | Overhead after fix |
   |---|---|---|
   | physical_select_aggregates_from_200 | +31.1% | +3.7% |
   | physical_many_self_joins | +12.9% | +2.2% |
   | physical_join_consider_sort | +12.9% | +1.0% |
   | physical_unnest_to_join | +12.5% | +1.4% |
   | physical_select_one_from_700 | +12.2% | +2.6% |
   | physical_theta_join_consider_sort | +8.7% | +0.2% |
   | physical_plan_tpch_q18 | +9.3% | +1.4% |
   | physical_plan_tpch_all | +4.8% | +2.1% |
   | physical_plan_tpcds_all | +5.6% | +2.2% |
   
   ## Test plan
   
   - [x] All 47 `extract_leaf_expressions` unit tests pass
   - [x] Benchmarked with `cargo bench -p datafusion --bench sql_planner`
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

