2010YOUY01 commented on code in PR #22521:
URL: https://github.com/apache/datafusion/pull/22521#discussion_r3308373184


##########
datafusion/physical-optimizer/src/ensure_requirements/enforce_distribution.rs:
##########
@@ -1333,6 +1333,22 @@ pub fn ensure_distribution(
         .map(|c| Arc::clone(&c.plan))
         .collect::<Vec<_>>();
 
+    // Skip the (often expensive) `with_new_children` rebuild when none of
+    // the children were actually replaced above. For nodes like
+    // `ProjectionExec`, `with_new_children` calls `try_new` and recomputes
+    // schema / equivalence properties / output ordering even when the
+    // input Arcs are identical. Profiling on a representative deep
+    // ProjectionExec stack showed `with_new_children` dominating
+    // `ensure_distribution` time for plans where no distribution change
+    // applies (point queries with no join / aggregate / unmet ordering),
+    // so the rebuild is wasted on the common case.
+    let original_children = plan.children();

Review Comment:
   We could introduce a helper like
   `with_new_children_if_necessary(plan, children_plans)`, and later disallow 
direct `with_new_children()` usage via clippy. This way we could enforce the 
helper project-wide to avoid similar issues.
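
A minimal sketch of what such a helper could look like, written against a toy `Plan` trait rather than DataFusion's real `ExecutionPlan` (the trait, `Leaf`, and `Projection` below are illustrative stand-ins, and the signature is an assumption based on the name suggested above). The idea is to compare the old and new children by pointer identity and only call `with_new_children` when at least one child actually changed:

```rust
use std::sync::Arc;

// Toy stand-in for a physical plan node. This is NOT DataFusion's
// `ExecutionPlan`; it only mirrors the two methods relevant here.
trait Plan {
    fn children(&self) -> Vec<&Arc<dyn Plan>>;
    fn with_new_children(self: Arc<Self>, children: Vec<Arc<dyn Plan>>) -> Arc<dyn Plan>;
}

// Hypothetical leaf node with no inputs.
struct Leaf {
    name: &'static str,
}

impl Plan for Leaf {
    fn children(&self) -> Vec<&Arc<dyn Plan>> {
        Vec::new()
    }
    fn with_new_children(self: Arc<Self>, _children: Vec<Arc<dyn Plan>>) -> Arc<dyn Plan> {
        self
    }
}

// Hypothetical single-input node whose rebuild is assumed to be expensive.
struct Projection {
    input: Arc<dyn Plan>,
}

impl Plan for Projection {
    fn children(&self) -> Vec<&Arc<dyn Plan>> {
        vec![&self.input]
    }
    fn with_new_children(self: Arc<Self>, mut children: Vec<Arc<dyn Plan>>) -> Arc<dyn Plan> {
        // A real node would recompute schema and properties here.
        Arc::new(Projection { input: children.swap_remove(0) })
    }
}

// Returns `plan` untouched when every new child is the same allocation as
// the corresponding existing child; otherwise rebuilds the node.
fn with_new_children_if_necessary(
    plan: Arc<dyn Plan>,
    children: Vec<Arc<dyn Plan>>,
) -> Arc<dyn Plan> {
    let unchanged = {
        let old = plan.children();
        old.len() == children.len()
            && old
                .into_iter()
                .zip(children.iter())
                .all(|(a, b)| Arc::ptr_eq(a, b))
    };
    if unchanged {
        plan
    } else {
        plan.with_new_children(children)
    }
}
```

With this shape, passing back the same child `Arc`s returns the original node (so `Arc::ptr_eq` on the result and the input holds), while passing a different child produces a fresh node. On the enforcement side, clippy's `disallowed_methods` lint looks like a plausible fit for banning direct calls; the exact configuration is left out here.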



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

