wiedld commented on code in PR #9685:
URL: https://github.com/apache/arrow-datafusion/pull/9685#discussion_r1532640935


##########
datafusion/optimizer/src/common_subexpr_eliminate.rs:
##########
@@ -571,66 +584,73 @@ enum VisitRecord {
     /// `usize` is the monotone increasing series number assigned in 
pre_visit().
     /// Starts from 0. Is used to index the identifier array `id_array` in 
post_visit().
     EnterMark(usize),
+    /// the node's children were skipped => jump to f_up on same node

Review Comment:
   For my understanding (please correct here 🙏🏼 ), prior to refactoring the 2 
different visitors (`TreeNodeVisitor` and `TreeNodeRewriter`) had different 
transversal rules, and the semantics of skip vs stop mapped in weird ways (as 
[nicely described 
here](https://github.com/apache/arrow-datafusion/pull/8891#issue-2085713028) in 
the TreeNode refactor's "Rationale for Change").
   
   Prior to the refactor, during the first tree walk (which builds the 
`visited_stack`) the short-circuited nodes:
   1. were 
[skipped](https://github.com/apache/arrow-datafusion/blob/684b4fab75841672aabc4f3c2475e8c232acea20/datafusion/common/src/tree_node.rs#L242-L243),
 without adding to the stack
   1. the old semantics meant [skipping the post-visit (up) 
call](https://github.com/apache/arrow-datafusion/blob/684b4fab75841672aabc4f3c2475e8c232acea20/datafusion/common/src/tree_node.rs#L26-L27)
   1. it never looked in the stack (since no post-visit on the old semantics)
   
   Whereas with the new TreeNode refactor the jump became a 
skip-children-but-call-f_up, which meant:
   1. prior to the fix in this PR, it was using the old control flow. So it 
still didn't add to visited_stack.
   1. refactor means that it now does perform the f_up on jump
   1. f_up checks the stack (in `self.pop_enter_mark()`) and would not find it
   1. the original refactor got around this by removing the panic and adding an 
option
   
   For the changes done in the `TreeNodeVisitor` for the short-circuited nodes, 
I basically made it work with a jump that does the post-visit f_up by:
   * add a placeholder to the id_array so that it exists on f_up
   * push into the stack so it exists on the f_up
   * make a `VisitRecord::JumpMark` to be used in the stack



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to