peter-toth commented on code in PR #13046:
URL: https://github.com/apache/datafusion/pull/13046#discussion_r1815131448


##########
datafusion/common/src/cse.rs:
##########
@@ -131,11 +130,13 @@ enum NodeEvaluation {
 }
 
 /// A map that contains the evaluation stats of [`TreeNode`]s by their 
identifiers.
-type NodeStats<'n, N> = HashMap<Identifier<'n, N>, NodeEvaluation>;
+/// It also contains the position of [`TreeNode`]s in [`CommonNodes`] once a 
node is
+/// found to be common and got extracted.
+type NodeStats<'n, N> = HashMap<Identifier<'n, N>, (NodeEvaluation, 
Option<usize>)>;
 
-/// A map that contains the common [`TreeNode`]s and their alias by their 
identifiers,
-/// extracted during the second, rewriting traversal.
-type CommonNodes<'n, N> = IndexMap<Identifier<'n, N>, (N, String)>;
+/// A list that contains the common [`TreeNode`]s and their alias, extracted 
during the
+/// second, rewriting traversal.
+type CommonNodes<'n, N> = Vec<(N, String)>;

Review Comment:
   The reason for this change of type from `IndexMap` to `Vec` in `CommonNodes` 
is that physical columns works with indexes rather than names. E.g. when we 
repace a common subexpression to a column during rewrite, we need both the name 
and the index of the common subexpression in the intermediate `ProjectionExec` 
node. Storing the index in `NodeStats` and using `Vec` in `CommonNodes` better 
fits this usecase.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to