Github user guoxiaolongzte commented on a diff in the pull request:
https://github.com/apache/spark/pull/23104#discussion_r236929433
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -459,6 +459,7 @@ object LimitPushDown extends Rule[LogicalPlan] {
    val newJoin = joinType match {
      case RightOuter => join.copy(right = maybePushLocalLimit(exp, right))
      case LeftOuter => join.copy(left = maybePushLocalLimit(exp, left))
+     case Cross => join.copy(left = maybePushLocalLimit(exp, left),
+       right = maybePushLocalLimit(exp, right))
--- End diff ---
There are two tables as follows:
CREATE TABLE `test1`(`id` int, `name` int);
CREATE TABLE `test2`(`id` int, `name` int);
test1 table data:
2,2
1,1
test2 table data:
2,2
3,3
4,4
Executing the SQL `select * from test1 t1 left anti join test2 t2 on t1.id = t2.id limit 1;` gives the result:
1,1
But if we push the limit 1 down to the left side, the result is incorrect: it is empty, because the single row the limit keeps may have a match on the right and be filtered out. If we push the limit 1 down to the right side, the result is also incorrect (empty in my test), because which right-side rows the limit keeps is arbitrary, so the join no longer sees all potential matches. So limit should not be pushed down through a left anti join; by the same logic, left semi join must be excluded as well.
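The effect can be simulated with plain Scala collections. This is only a sketch of the anti-join semantics, not actual Spark code; `antiJoin` is a hypothetical helper, and which row a pushed-down `LocalLimit` keeps is modeled by hand-picking rows:

```scala
// Sketch: simulate a left anti join over plain collections to show why
// pushing a limit below the join changes the result.
object AntiJoinLimitDemo {
  // Left anti join on id: keep left rows with no matching id on the right.
  def antiJoin(left: Seq[(Int, Int)], right: Seq[(Int, Int)]): Seq[(Int, Int)] = {
    val rightIds = right.map(_._1).toSet
    left.filterNot { case (id, _) => rightIds.contains(id) }
  }

  def main(args: Array[String]): Unit = {
    val test1 = Seq((2, 2), (1, 1))
    val test2 = Seq((2, 2), (3, 3), (4, 4))

    // Correct plan: anti join first, then limit 1 -> List((1,1))
    println(antiJoin(test1, test2).take(1))

    // Limit pushed to the left: if it keeps (2,2), that row matches the
    // right side and is filtered out, so the result is empty -> List()
    println(antiJoin(test1.take(1), test2).take(1))

    // Limit pushed to the right: if it keeps only (3,3), then (2,2) no
    // longer finds its match and wrongly appears -> List((2,2))
    println(antiJoin(test1, Seq((3, 3))).take(1))
  }
}
```

In all pushed-down variants the answer differs from the unpushed plan, which is why `LeftAnti` (and `LeftSemi`) must be left out of the `joinType match` above.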
---