Zouxxyy commented on code in PR #11212:
URL:
https://github.com/apache/incubator-gluten/pull/11212#discussion_r2572836841
##########
gluten-substrait/src/main/scala/org/apache/gluten/execution/BasicPhysicalOperatorTransformer.scala:
##########
@@ -313,7 +313,8 @@ object FilterHandler extends PredicateHelper {
}
def subtractFilters(left: Seq[Expression], right: Seq[Expression]):
Seq[Expression] = {
- (left.toSet -- right.toSet).toSeq
+ val scanSet = left.map(_.canonicalized).toSet
Review Comment:
Oh, I checked the commit history, and this PR
(https://github.com/apache/incubator-gluten/pull/6296) changed ExpressionSet to
a regular Set. The original ExpressionSet compared expressions based on both
determinism and canonical representation, whereas a regular Set only compares
expressions directly (using their default equals/hashCode).
It seems what we actually need is comparison based on canonicalized forms.
Perhaps we could modify the code like this:
```scala
def subtractFilters(left: Seq[Expression], right: Seq[Expression]):
Seq[Expression] = {
val rightCanonicalSet = right.map(_.canonicalized).toSet
left.filterNot(expr => rightCanonicalSet.contains(expr.canonicalized))
}
def combineFilters(left: Seq[Expression], right: Seq[Expression]):
Seq[Expression] = {
val seen = scala.collection.mutable.Set[Expression]()
val result = scala.collection.mutable.ListBuffer[Expression]()
def tryAdd(expr: Expression): Unit = {
val canon = expr.canonicalized
if (!seen.contains(canon)) {
seen += canon
result += expr
}
}
(left ++ right).foreach(tryAdd)
result.toList
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]