Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/19647#discussion_r148710991
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -332,32 +332,18 @@ object LimitPushDown extends Rule[LogicalPlan] {
// pushdown Limit.
case LocalLimit(exp, Union(children)) =>
LocalLimit(exp, Union(children.map(maybePushLocalLimit(exp, _))))
- // Add extra limits below OUTER JOIN. For LEFT OUTER and FULL OUTER JOIN we push limits to the
- // left and right sides, respectively. For FULL OUTER JOIN, we can only push limits to one side
- // because we need to ensure that rows from the limited side still have an opportunity to match
- // against all candidates from the non-limited side. We also need to ensure that this limit
- // pushdown rule will not eventually introduce limits on both sides if it is applied multiple
- // times. Therefore:
+ // Add extra limits below OUTER JOIN. For LEFT OUTER and RIGHT OUTER JOIN we push limits to
+ // the left and right sides, respectively. It's not safe to push limits below FULL OUTER
+ // JOIN in the general case without a more invasive rewrite.
+ // We also need to ensure that this limit pushdown rule will not eventually introduce limits
+ // on both sides if it is applied multiple times. Therefore:
// - If one side is already limited, stack another limit on top if the new limit is smaller.
// The redundant limit will be collapsed by the CombineLimits rule.
// - If neither side is limited, limit the side that is estimated to be bigger.
case LocalLimit(exp, join @ Join(left, right, joinType, _)) =>
val newJoin = joinType match {
case RightOuter => join.copy(right = maybePushLocalLimit(exp, right))
case LeftOuter => join.copy(left = maybePushLocalLimit(exp, left))
- case FullOuter =>
--- End diff --
Thanks for working on it! We should still keep the `FullOuter` case.
Let me fix it based on my original PR:
https://github.com/apache/spark/pull/10454
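
For context, the concern stated in the new code comment in the diff (that a limit below FULL OUTER JOIN is unsafe in the general case) can be illustrated with plain Scala collections. This is only a sketch under stated assumptions: `LimitPushDownSketch`, `row`, `leftOuter` and `fullOuter` are hypothetical helpers that model an equi-join on `Int` keys, not Spark's implementation, and `take(1)` stands in for a naive `LocalLimit(1)` pushed to the left side.

```scala
// Hypothetical model of the joins using plain collections; not Spark code.
object LimitPushDownSketch {
  type Row = (Option[Int], Option[Int])

  // Helper that fixes the element type so both join branches yield Seq[Row].
  def row(a: Option[Int], b: Option[Int]): Row = (a, b)

  // LEFT OUTER equi-join: every left value appears at least once.
  def leftOuter(l: Seq[Int], r: Seq[Int]): Seq[Row] =
    l.flatMap { a =>
      val matches = r.filter(b => b == a)
      if (matches.isEmpty) Seq(row(Some(a), None))
      else matches.map(b => row(Some(a), Some(b)))
    }

  // FULL OUTER equi-join: left outer rows plus unmatched right values.
  def fullOuter(l: Seq[Int], r: Seq[Int]): Seq[Row] =
    leftOuter(l, r) ++ r.filterNot(b => l.contains(b)).map(b => row(None, Some(b)))

  def main(args: Array[String]): Unit = {
    val left = Seq(1, 2)
    val right = Seq(2)
    val limitedLeft = left.take(1) // stands in for a limit pushed to the left side

    // LEFT OUTER: rows produced from the limited input also exist in the full result.
    val lo = leftOuter(left, right)
    assert(leftOuter(limitedLeft, right).forall(r => lo.contains(r)))

    // FULL OUTER: dropping left value 2 turns right value 2 into an unmatched
    // row (None, Some(2)), which the unlimited join never produces.
    val fo = fullOuter(left, right)
    val foLimited = fullOuter(limitedLeft, right)
    assert(!foLimited.forall(r => fo.contains(r)))
    println(foLimited.filterNot(r => fo.contains(r)))
  }
}
```

The sketch only covers the naive one-sided push; it says nothing about whether a more careful rewrite, such as the one in the linked PR, can keep the `FullOuter` case correct.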
---