peter-toth commented on code in PR #13046: URL: https://github.com/apache/datafusion/pull/13046#discussion_r1815131448
########## datafusion/common/src/cse.rs: ########## @@ -131,11 +130,13 @@ enum NodeEvaluation { } /// A map that contains the evaluation stats of [`TreeNode`]s by their identifiers. -type NodeStats<'n, N> = HashMap<Identifier<'n, N>, NodeEvaluation>; +/// It also contains the position of [`TreeNode`]s in [`CommonNodes`] once a node is +/// found to be common and got extracted. +type NodeStats<'n, N> = HashMap<Identifier<'n, N>, (NodeEvaluation, Option<usize>)>; -/// A map that contains the common [`TreeNode`]s and their alias by their identifiers, -/// extracted during the second, rewriting traversal. -type CommonNodes<'n, N> = IndexMap<Identifier<'n, N>, (N, String)>; +/// A list that contains the common [`TreeNode`]s and their alias, extracted during the +/// second, rewriting traversal. +type CommonNodes<'n, N> = Vec<(N, String)>; Review Comment: The reason for this change of type from `IndexMap` to `Vec` in `CommonNodes` is that physical columns works with indexes rather than names. E.g. when we repace a common subexpression to a column during rewrite, we need both the name and the index of the common subexpression in the intermediate `ProjectionExec` node. Storing the index in `NodeStats` and using `Vec` in `CommonNodes` better fits this usecase. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org