peter-toth commented on a change in pull request #29170:
URL: https://github.com/apache/spark/pull/29170#discussion_r518693068



##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -116,7 +116,8 @@ abstract class Optimizer(catalogManager: CatalogManager)
         operatorOptimizationRuleSet.filterNot(_ == InferFiltersFromConstraints)
       Batch("Operator Optimization before Inferring Filters", fixedPoint,
         rulesWithoutInferFiltersFromConstraints: _*) ::
-      Batch("Infer Filters", Once,
+      Batch("Infer Filters", fixedPoint,
+        PushDownPredicates,

Review comment:
       Unfortunately, there is no UT in the commit to see what was the exact 
issue and why `InferFiltersFromConstraints` needs to be separated entirely from 
other rules.
   
   If I get the issue described in 
https://issues.apache.org/jira/browse/SPARK-21652 right, if we combine 
`InferFiltersFromConstraints` with other rules into a batch then we can end up 
in an infinite loop in that batch. But this is only because 
`InferFiltersFromConstraints` can create a new constraint that a subsequent 
rule removes. The example in https://issues.apache.org/jira/browse/SPARK-21652 
required running `ConstantPropagation`, `ConstantFolding` and then 
`BooleanSimplification` to remove such an inferred constraint/filter.
   I think if we combine `InferFiltersFromConstraints` with other rules that 
doesn't reduce constraints (like `PushDownPredicates`) then we are still good. 
But I might be wrong so please share your thoughts on this.
   
   Sidebar:
   I think the source of the above mentioned loop is that `ConstantPropagation` 
doesn't propagate constants into a join condition (it handles filters only). I 
had an old PR to enhance that rule where I also commented why it is not so 
simple to propagate constants into a join: 
https://github.com/apache/spark/pull/24553/files#diff-d43484d56a4d9991066b5c00d12ec2465c75131e055fc02ee7fb6dfd45b5006fR76-R79
 but it is doable if we fix https://issues.apache.org/jira/browse/SPARK-30598 
(https://github.com/apache/spark/pull/27309).
   But I'm not saying that `ConstantPropagation` is the only reduction rule 
that can collide with `InferFiltersFromConstraints`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to