JoshRosen commented on a change in pull request #24765: [SPARK-27915][SQL][WIP] Update logical Filter's output nullability based on IsNotNull conditions
URL: https://github.com/apache/spark/pull/24765#discussion_r289614125
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
 ##########
 @@ -119,6 +119,49 @@ trait PredicateHelper {
     case e: Unevaluable => false
     case e => e.children.forall(canEvaluateWithinJoin)
   }
+
+  /**
+   * Given an IsNotNull expression, returns the IDs of expressions whose not-nullness
+   * is implied by the IsNotNull expressions.
+   */
+  protected def getImpliedNotNullExprIds(isNotNullExpr: IsNotNull): Set[ExprId] = {
+    // This logic is a little tricky, so we'll use an example to build some intuition.
+    // Consider the expression IsNotNull(f(g(x), y)). By definition, its child is not null:
+    //    f(g(x), y) is not null
+    // In addition, if `f` is NullIntolerant then it would be null if either child was null:
+    //    g(x) is null => f(g(x), y) is null
+    //    y is null    => f(g(x), y) is null
+    // Via A => B <=> !A || B, we have:
+    //    g(x) is not null || f(g(x), y) is null
+    //    y is not null    || f(g(x), y) is null
+    // Since we know that f(g(x), y) is not null, we must therefore conclude that
+    //    g(x) is not null
+    //    y is not null
+    // By recursively applying this logic, if g is NullIntolerant then x is not null.
+    // However, if g is NOT NullIntolerant (e.g. if g(null) is non-null) then we cannot
+    // conclude anything about x's nullability.
+    def getExprIdIfNamed(expr: Expression): Set[ExprId] = expr match {
+      case ne: NamedExpression => Set(ne.toAttribute.exprId)
 
 Review comment:
   Maybe this should be `AttributeReference`? I couldn't remember offhand how to get `ExprIds` from arbitrary expressions, hence this hack.
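
   To make the reasoning in the diff comment concrete, here is a minimal, self-contained sketch of the recursion on a toy expression tree. It is not the code from this PR and does not use Catalyst: `Expr`, `Attr`, `Call`, and `impliedNotNullIds` are hypothetical names invented for illustration, and only leaf references carry an ID here, which is a simplification of the `NamedExpression` case in the diff.

```scala
// Toy model only: hypothetical stand-ins, NOT Spark's Catalyst classes.
sealed trait Expr { def children: Seq[Expr] }

// A leaf column reference with an ID (loosely analogous to an attribute's ExprId).
final case class Attr(id: Long) extends Expr {
  val children: Seq[Expr] = Nil
}

// A function call. `nullIntolerant = true` means: the result is null
// whenever any child is null (the property the diff comment relies on).
final case class Call(name: String, nullIntolerant: Boolean, children: Seq[Expr])
  extends Expr

object ImpliedNotNull {
  // Precondition: `e` is known to be non-null.
  // Returns the IDs of all attributes that must therefore be non-null too.
  def impliedNotNullIds(e: Expr): Set[Long] = e match {
    case Attr(id) =>
      Set(id)
    case Call(_, isNullIntolerant, kids) =>
      if (isNullIntolerant) {
        // Non-null result implies every child is non-null, so recurse.
        kids.flatMap(impliedNotNullIds).toSet
      } else {
        // The call may map null inputs to a non-null result; stop here.
        Set.empty[Long]
      }
  }
}
```

   With `x = Attr(1L)` and `y = Attr(2L)`, a null-intolerant `f(g(x), y)` yields both IDs when `g` is also null-intolerant, and only `y`'s ID when `g` is not, matching the two cases described in the comment.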
