cfmcgrady commented on a change in pull request #32488:
URL: https://github.com/apache/spark/pull/32488#discussion_r643724873



##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala
##########
@@ -121,6 +131,77 @@ object UnwrapCastInBinaryComparison extends 
Rule[LogicalPlan] {
         if canImplicitlyCast(fromExp, toType, literalType) =>
       simplifyNumericComparison(be, fromExp, toType, value)
 
+    // As the analyzer makes sure that the list of In is already of the same 
data type, then the
+    // rule can simply check the first literal in `in.list` can implicitly 
cast to `toType` or not,
+    // and note that:

Review comment:
       > Another idea: We can use AnsiCast(...).eval, which fails if overflow 
happens. Then null literal is the same as other literals that can cast to 
fromExp.dataType, and we don't need to distinguish them.
   
   Good idea,
   
   >The ideal steps in my mind:
   > 1. Call unwrapCast with non-null literals in the list
   > 2. If there is no literal that can cast to fromExp.dataType, return the 
original expression
   > 3. Otherwise, use the literals that can cast to fromExp.dataType and the 
null literals to create a new In expression, with additional falseIfNotNull 
check.
   
   but I think this one is more clear.
   
   Another question for stage 2: we should not return original expression when 
the can not cast literals is non-empty, but return `falseIfNotNull` instead, 
this is because original expression `In(Cast(fromExp, toType), list)` can't be 
optimized by other rules and trigger a filter stage, but `falseIfNotNull` can 
be optimized by `BooleanSimplification`, WDYT?
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to