Github user nongli commented on a diff in the pull request:
https://github.com/apache/spark/pull/11665#discussion_r56239773
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -598,50 +598,41 @@ object NullPropagation extends Rule[LogicalPlan] {
}
/**
- * Attempts to eliminate reading (unnecessary) NULL values if they are not required for correctness
- * by inserting isNotNull filters in the query plan. These filters are currently inserted beneath
- * existing Filters and Join operators and are inferred based on their data constraints.
+ * Eliminate reading unnecessary values if they are not required for correctness (and can help in
+ * optimizing the query) by inserting relevant filters in the query plan based on an operator's
+ * data constraints. These filters are currently inserted to the existing conditions in the Filter
+ * operators and on either side of Join operators.
*
* Note: While this optimization is applicable to all types of join, it primarily benefits Inner and
* LeftSemi joins.
*/
-object NullFiltering extends Rule[LogicalPlan] with PredicateHelper {
+object InferFiltersFromConstraints extends Rule[LogicalPlan] with PredicateHelper {
+  // We generate a list of additional filters from the operator's existing constraint but remove
--- End diff ---
This inline comment reads like a clearer version of the first sentence of the object's doc comment. I'd remove it here and use it to replace that first sentence in the comment above.
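
For anyone skimming the thread, here is a rough, hypothetical sketch of what the Filter branch of such a constraint-based rule could look like. The object name `ConstraintFilterSketch` is invented for illustration and this is not the code under review; it only assumes existing Catalyst pieces such as `Rule`, `PredicateHelper.splitConjunctivePredicates`, `QueryPlan.constraints`, `Filter`, and `And`:

    import org.apache.spark.sql.catalyst.expressions.{And, PredicateHelper}
    import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan}
    import org.apache.spark.sql.catalyst.rules.Rule

    // Hypothetical sketch (not the PR's implementation): infer extra predicates for a
    // Filter from its constraints, dropping anything the child already guarantees or
    // that the existing condition already contains.
    object ConstraintFilterSketch extends Rule[LogicalPlan] with PredicateHelper {
      def apply(plan: LogicalPlan): LogicalPlan = plan transform {
        case filter @ Filter(condition, child) =>
          // Constraints implied by this Filter that are not already enforced by the
          // child and are not already conjuncts of the condition itself.
          val newPredicates = filter.constraints --
            (child.constraints ++ splitConjunctivePredicates(condition))
          if (newPredicates.nonEmpty) {
            Filter(And(newPredicates.reduce(And), condition), child)
          } else {
            filter
          }
        // A Join case would be analogous: predicates inferred from the join's
        // constraints that reference only one side can be added to the join condition
        // or pushed to that side as a Filter.
      }
    }

The sketch leaves the plan unchanged when nothing new can be inferred, which keeps it safe to run repeatedly in the optimizer's fixed-point batches.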