wangyum commented on a change in pull request #28642:
URL: https://github.com/apache/spark/pull/28642#discussion_r737478319



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
##########
@@ -1215,6 +1215,14 @@ object InferFiltersFromConstraints extends Rule[LogicalPlan]
     }
   }
 
+  // Whether the result of this expression may be null. For example: CAST(strCol AS double)
+  // We will infer an IsNotNull expression for this expression to avoid skew join.
+  private def resultMayBeNull(e: Expression): Boolean = e match {
+    case Cast(child, dataType, _, _) => !Cast.canUpCast(child.dataType, dataType)
+    case _: Coalesce => true
+    case _ => false
+  }

Review comment:
   @cloud-fan @HyukjinKwon This does not infer `IsNotNull` for every equality join key, only for keys whose expression may itself produce null. For example:
   
   Infer | Will not infer
   -- | --
   cast(strCol AS double) = doubleCol | upper(strCol) = upperStrCol
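
To make the distinction in the table concrete, here is a minimal, self-contained sketch of the same pattern match. It is a toy model and does not use Spark's Catalyst classes: `Column`, `Upper`, the two-argument `Cast`, and the simplified `canUpCast` are hypothetical stand-ins, and the real `Cast.canUpCast` covers many more type pairs.

```scala
// Toy expression model (hypothetical; NOT Spark's Catalyst classes).
sealed trait DataType
case object StringType extends DataType
case object IntType extends DataType
case object DoubleType extends DataType

sealed trait Expression { def dataType: DataType }
final case class Column(name: String, dataType: DataType) extends Expression
final case class Cast(child: Expression, dataType: DataType) extends Expression
final case class Upper(child: Expression) extends Expression {
  val dataType: DataType = StringType
}
final case class Coalesce(children: Seq[Expression]) extends Expression {
  def dataType: DataType = children.head.dataType
}

object ResultMayBeNullSketch {
  // Simplified stand-in for Cast.canUpCast (assumption): only identity
  // casts and int -> double are treated as safe up-casts here.
  def canUpCast(from: DataType, to: DataType): Boolean = (from, to) match {
    case (a, b) if a == b      => true
    case (IntType, DoubleType) => true
    case _                     => false
  }

  // Same shape as the rule in the diff: a cast that is not a safe up-cast,
  // or a Coalesce, is treated as possibly producing null.
  def resultMayBeNull(e: Expression): Boolean = e match {
    case Cast(child, dataType) => !canUpCast(child.dataType, dataType)
    case _: Coalesce           => true
    case _                     => false
  }

  def main(args: Array[String]): Unit = {
    val strCol = Column("strCol", StringType)
    val intCol = Column("intCol", IntType)
    println(resultMayBeNull(Cast(strCol, DoubleType))) // true: string -> double is not an up-cast
    println(resultMayBeNull(Upper(strCol)))            // false: falls through to the default case
    println(resultMayBeNull(Cast(intCol, DoubleType))) // false: int -> double is a safe up-cast
  }
}
```

In this sketch, only the first expression would get an inferred `IsNotNull` filter, which matches the "Infer" column above; `upper(strCol)` falls into the default case and is skipped.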
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


