cloud-fan commented on a change in pull request #27073:
[SPARK-29768][SQL][FOLLOW-UP] Improve handling non-deterministic filter of ScanOperation
URL: https://github.com/apache/spark/pull/27073#discussion_r362405814
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
##########
@@ -142,14 +140,21 @@ object ScanOperation extends OperationHelper with PredicateHelper {
case Filter(condition, child) =>
collectProjectsAndFilters(child) match {
case Some((fields, filters, other, aliases)) =>
- // Follow CombineFilters and only keep going if the collected Filters
- // are all deterministic and this filter doesn't have common non-deterministic
+ // Follow CombineFilters and only keep going if 1) the collected Filters
+ // and this filter are all deterministic or 2) if this filter is non-deterministic,
+ // but it's the only one who doesn't have common non-deterministic
// expressions with lower Project.
- if (filters.forall(_.deterministic) &&
- !hasCommonNonDeterministic(Seq(condition), aliases)) {
+ if (filters.nonEmpty && filters.forall(_.deterministic)) {
Review comment:
How about:
```
val canMergeFilters = (filters.nonEmpty && filters.forall(_.deterministic)
&& condition.deterministic) || filters.isEmpty
if (canMergeFilters && !hasCommonNonDeterministic(...)) {
...
}
```
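To make the suggested condition easier to check, here is a minimal, self-contained sketch of its truth table. It is an illustration only, not the Catalyst code: `Expr` and `MergeCheck` are hypothetical stand-ins for Catalyst expressions, and the separate `hasCommonNonDeterministic` check is not modelled.

```scala
// Hypothetical stand-in for a Catalyst expression; only the flag matters here.
final case class Expr(name: String, deterministic: Boolean)

object MergeCheck {
  // Mirrors the suggested condition: merge when no filter has been collected yet,
  // or when every collected filter and the current condition are deterministic.
  def canMergeFilters(filters: Seq[Expr], condition: Expr): Boolean =
    (filters.nonEmpty && filters.forall(_.deterministic) && condition.deterministic) ||
      filters.isEmpty

  def main(args: Array[String]): Unit = {
    val det = Expr("a > 1", deterministic = true)
    val nonDet = Expr("rand() > 0.5", deterministic = false)

    println(canMergeFilters(Seq.empty, nonDet)) // true: it is the only filter
    println(canMergeFilters(Seq(det), det))     // true: everything is deterministic
    println(canMergeFilters(Seq(det), nonDet))  // false: non-deterministic on top of others
    println(canMergeFilters(Seq(nonDet), det))  // false: a collected filter is non-deterministic
  }
}
```

Since `forall` is true on an empty sequence, the `filters.nonEmpty` guard does not change the result here; it mainly makes the two cases explicit.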