tanelk commented on a change in pull request #31538:
URL: https://github.com/apache/spark/pull/31538#discussion_r574765460
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -494,11 +494,14 @@ object RemoveRedundantAliases extends Rule[LogicalPlan] {
 object RemoveNoopOperators extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
     // Eliminate no-op Projects
-    case p @ Project(_, child) if child.sameOutput(p) => child
+    case p @ Project(projectList, child)
+      if projectList.forall(isAttribute) && child.sameOutput(p) => child
Review comment:
`LogicalPlan` contains this comment:
```
/**
 * This method checks if the same `ExprId` refers to an unique attribute in a plan tree.
 * Some plan transformers (e.g., `RemoveNoopOperators`) rewrite logical
 * plans based on this assumption.
 */
def checkIfExprIdsAreGloballyUnique(plan: LogicalPlan): Boolean = {
  checkIfSameExprIdNotReused(plan) && hasUniqueExprIdsForOutput(plan)
}
```
This suggests that `RemoveNoopOperators` is correct as written, and that the
deeper issue is an `ExprId` being reused for more than one attribute.
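
To make the concern concrete, here is a minimal, self-contained sketch of how a reused `ExprId` could fool an output-based no-op check. It does not use Spark's classes: `Attr`, `Alias`, `Relation`, `Project`, `sameOutput`, `removeNoop` and `removeNoopGuarded` are simplified, hypothetical stand-ins. The sketch assumes outputs are compared by `ExprId` only, and it rewrites just the root node rather than doing a full `transformUp`.

```scala
// Simplified stand-ins for Catalyst concepts; NOT Spark's real API.
object ExprIdSketch {
  final case class ExprId(id: Long)

  sealed trait NamedExpr { def name: String; def exprId: ExprId }
  // A plain reference to an existing column.
  final case class Attr(name: String, exprId: ExprId) extends NamedExpr
  // A computed column; the expression is kept as a string for brevity.
  final case class Alias(expr: String, name: String, exprId: ExprId) extends NamedExpr

  sealed trait Plan { def output: Seq[Attr] }
  final case class Relation(output: Seq[Attr]) extends Plan
  final case class Project(projectList: Seq[NamedExpr], child: Plan) extends Plan {
    // A Project exposes one attribute per projected expression.
    def output: Seq[Attr] = projectList.map(e => Attr(e.name, e.exprId))
  }

  // Assumption: two plans have the "same output" when their ExprIds match in order.
  def sameOutput(a: Plan, b: Plan): Boolean =
    a.output.map(_.exprId) == b.output.map(_.exprId)

  // Shape of the rule before the change: trusts ExprIds to be globally unique.
  def removeNoop(plan: Plan): Plan = plan match {
    case p @ Project(_, child) if sameOutput(child, p) => child
    case other => other
  }

  // Shape of the rule after the change: also requires every entry to be a plain attribute.
  def removeNoopGuarded(plan: Plan): Plan = plan match {
    case p @ Project(list, child)
        if list.forall(_.isInstanceOf[Attr]) && sameOutput(child, p) => child
    case other => other
  }
}
```

In this model, take `rel = Relation(Seq(Attr("a", ExprId(1))))` and `bad = Project(Seq(Alias("a + 1", "a", ExprId(1))), rel)`. Then `removeNoop(bad)` returns `rel` and silently drops the `a + 1` computation, while `removeNoopGuarded(bad)` leaves the `Project` in place. The guard hides the symptom, but the plan was already invalid once `ExprId(1)` referred to two different values.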