GitHub user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48388339
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
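    
    For context, the kind of rewrite such a rule performs might look roughly
    like this. This is only a sketch against the 1.6-era `Limit`/`Join` node
    shapes, with a guard so the rule does not re-fire on its own output; it is
    not the actual code in this PR:
    
    ```scala
    import org.apache.spark.sql.catalyst.plans.{LeftOuter, RightOuter}
    import org.apache.spark.sql.catalyst.plans.logical._
    import org.apache.spark.sql.catalyst.rules.Rule

    // Hypothetical sketch, not the PR's implementation.
    object PushLimitThroughOuterJoinSketch extends Rule[LogicalPlan] {
      def apply(plan: LogicalPlan): LogicalPlan = plan transform {
        // A left outer join preserves every row of the left child, so an
        // extra Limit on the left side cannot drop needed result rows.
        case Limit(exp, Join(left, right, LeftOuter, cond))
            if !left.isInstanceOf[Limit] =>
          Limit(exp, Join(Limit(exp, left), right, LeftOuter, cond))
        // Symmetric case for the right-preserving side.
        case Limit(exp, Join(left, right, RightOuter, cond))
            if !right.isInstanceOf[Limit] =>
          Limit(exp, Join(left, Limit(exp, right), RightOuter, cond))
      }
    }
    ```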
    
    I am not sure of the best way to prove this is correct. If we can add an
    extra `limit` below `union all`, `left outer`, and `right outer`, can we
    also add an extra `limit` below `full outer`? In a traditional RDBMS,
    ```full outer = union all(left outer, right outer)```. I am not sure
    whether Spark SQL has the same semantics.
    
    `(A full outer join B) limit 5`
    = `((A left outer join B) limit 5) union all ((A right outer join B) limit 5)`
    = `((A limit 5) left outer join (B limit 5)) union all ((A limit 5) right outer join (B limit 5))`
    = `((A limit 5) full outer join (B limit 5))`
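    
    As a toy illustration of why the preserved side is safe to limit (plain
    Scala with made-up data, not from this PR): every row of the preserved
    side yields at least one output row, so limiting that side to n rows
    still leaves at least n output rows to choose from.
    
    ```scala
    // Minimal left outer join over Seq[Int], matching on equality.
    def leftOuter(xs: Seq[Int], ys: Seq[Int]): Seq[(Int, Option[Int])] =
      xs.flatMap { x =>
        val m = ys.filter(_ == x)
        if (m.isEmpty) Seq((x, None)) else m.map(y => (x, Some(y)))
      }

    val a = Seq(1, 2, 3, 4, 5)   // preserved (left) side
    val b = Seq(5)

    // (A left outer join B) limit 2
    println(leftOuter(a, b).take(2))         // List((1,None), (2,None))
    // ((A limit 2) left outer join B) limit 2 -- the same two rows survive
    println(leftOuter(a.take(2), b).take(2)) // List((1,None), (2,None))
    ```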
    
    
    However, an inner join has a big issue if we try to add an extra `limit`:
    neither side is row-preserving, so rows kept by a pushed-down `limit` can
    be filtered out entirely by the join condition. I am not sure my answer
    is clear.
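    
    To make the inner-join problem concrete (again plain Scala with made-up
    data): the join condition can discard exactly the rows a pushed-down
    `limit` kept, so the rewrite changes the answer.
    
    ```scala
    // Minimal inner join over Seq[Int], matching on equality.
    def inner(xs: Seq[Int], ys: Seq[Int]): Seq[(Int, Int)] =
      for (x <- xs; y <- ys; if x == y) yield (x, y)

    val a = Seq(1, 2, 3, 4, 5)
    val b = Seq(5)

    // (A inner join B) limit 1 -- one row, (5,5)
    println(inner(a, b).take(1))         // List((5,5))
    // ((A limit 1) inner join B) limit 1 -- the kept row, 1, matches nothing
    println(inner(a.take(1), b).take(1)) // List()  -- a row was lost
    ```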

