zhuqi-lucas opened a new issue, #22555:
URL: https://github.com/apache/datafusion/issues/22555

   ### Is your feature request related to a problem or challenge?
   
   Follow-up suggested by @2010YOUY01 in #22521.
   
   `datafusion_physical_plan::with_new_children_if_necessary` already exists 
and is the right way for callers to rebuild a plan from new children — it 
short-circuits via `Arc::ptr_eq` when children are unchanged, skipping the 
(often expensive) `with_new_children` recomputation of schema / equivalence 
properties / output ordering / partitioning.
   
   However, today there is no project-wide enforcement that callers route 
through this helper. Optimizer rules and other call sites can — and frequently 
do — call `plan.with_new_children(children)` directly, missing the 
optimization. #22521 hit this in `ensure_distribution`; the same pattern likely 
exists in other optimizer passes.
   
   ### Describe the solution you'd like
   
   1. **Audit existing `plan.with_new_children(children)` call sites** across 
the project. For each, decide whether the helper would change behavior (it 
should not, by design) and migrate to `with_new_children_if_necessary` where 
appropriate.
   
   2. **Add a clippy lint** (custom lint or `disallowed_methods` config in 
`clippy.toml`) that forbids direct `ExecutionPlan::with_new_children` outside 
of the helper itself and a small allow-list of legitimate uses (the trait 
implementations, tests, and the helper's internal call).
   
   3. **Document the convention** in the contributor guide so new optimizer 
rules adopt the helper by default.
   
   ### Describe alternatives I've considered
   
   - **Leave it as a soft convention** (status quo): relies on reviewer memory; 
same class of regression keeps appearing whenever someone adds a new optimizer 
pass.
   - **Inline the ptr_eq check at every call site**: tried in the original 
draft of #22521; rejected by reviewer in favor of the existing helper.
   
   ### Additional context
   
   Related: #22521 (concrete fix in `ensure_distribution`), #19792 (callee-side 
recomputation avoidance in `PlanProperties`). The clippy enforcement would 
systematically prevent the same regression class instead of catching it 
case-by-case in review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to