wangyum commented on PR #44133: URL: https://github.com/apache/spark/pull/44133#issuecomment-1906194854
> > if the value of bigIntCol exceeds the range of int, the result of try_cast(b.bigIntCol AS int) is null, and the result of a.intCol = try_cast(b.bigIntCol AS int) in the join condition is false
>
> This is wrong. The result will be null if one side of the binary comparison is null. In addition, it's very weird to optimize the plan in the planner. Shouldn't we do it in an optimizer rule?

1. This is similar to `ReplaceNullWithFalseInPredicate`: in a predicate, a null result ultimately behaves the same as false, so the row is filtered out either way.
2. We cannot do this in an optimizer rule because, during the optimization phase, we don't yet know whether the rewrite would reduce the number of shuffles.
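The point about null behaving like false in a join predicate can be illustrated with a minimal sketch. This is plain Python simulating SQL three-valued logic and the overflow behavior of `try_cast(... AS int)`; the helper names (`try_cast_to_int`, `join_predicate`) are illustrative, not Spark APIs.

```python
# Assumed 32-bit int bounds, matching Spark SQL's IntegerType.
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def try_cast_to_int(v):
    """Return v as-is, or None (SQL NULL) if it overflows a 32-bit int."""
    return v if INT_MIN <= v <= INT_MAX else None

def join_predicate(int_col, big_int_col):
    """Evaluate a.intCol = try_cast(b.bigIntCol AS int) with SQL semantics."""
    casted = try_cast_to_int(big_int_col)
    if casted is None:
        return None  # three-valued logic: comparing with NULL yields NULL
    return int_col == casted

# In a join condition, a NULL predicate result drops the row exactly as
# FALSE would -- the intuition behind the ReplaceNullWithFalseInPredicate analogy.
left = [1, 2, 3]
right = [1, 2**40]  # 2**40 overflows int, so the try_cast yields NULL
matches = [(a, b) for a in left for b in right
           if join_predicate(a, b) is True]
print(matches)  # [(1, 1)] -- the overflowing row never matches
```

Whether the predicate evaluates to NULL or to FALSE, the joined row is excluded, so rewriting one into the other preserves the join result.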
