GitHub user dilipbiswal opened a pull request:

    https://github.com/apache/spark/pull/21941

    [SPARK-24966][SQL] Implement precedence rules for set operations.

    ## What changes were proposed in this pull request?
    
    Currently the set operations INTERSECT, UNION and EXCEPT are assigned the 
same precedence, so a chain of them is evaluated strictly left to right. This 
PR gives INTERSECT higher precedence than UNION and EXCEPT. UNION and EXCEPT 
keep equal precedence and are evaluated left to right, in the order in which 
they appear in the query.
    
    This changes the result of queries that mix INTERSECT with UNION or 
EXCEPT, because the set operators are now evaluated in a different order. The 
old behavior is still available under a newly added config parameter.
    
    Query:
    ```
    SELECT * FROM t1
    UNION 
    SELECT * FROM t2
    EXCEPT
    SELECT * FROM t3
    INTERSECT
    SELECT * FROM t4
    ```
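
To make the difference in results concrete, here is a minimal sketch that models the four tables as Python sets. The table contents are hypothetical and chosen only for illustration; Python's `|`, `-` and `&` stand in for UNION, EXCEPT and INTERSECT with distinct semantics. It is not Spark code.

```python
# Hypothetical contents for the four tables; each row is a single integer.
t1 = {1, 2, 3}
t2 = {3, 4, 5}
t3 = {1, 4, 6}
t4 = {4, 6, 7}

# Old behavior: all three operators share one precedence and are applied
# left to right, so INTERSECT is applied last to everything before it.
old_result = ((t1 | t2) - t3) & t4

# New behavior: INTERSECT binds tighter, so "t3 INTERSECT t4" is computed
# first and its result is the right-hand side of EXCEPT.
new_result = (t1 | t2) - (t3 & t4)

print(sorted(old_result))  # []
print(sorted(new_result))  # [1, 2, 3, 5]
```

With this data the same query text returns no rows under the old grouping and four rows under the new one, which is why the old behavior is kept behind a config parameter.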
    Parsed plan before the change:
    ```
    == Parsed Logical Plan ==
    'Intersect false
    :- 'Except false
    :  :- 'Distinct
    :  :  +- 'Union
    :  :     :- 'Project [*]
    :  :     :  +- 'UnresolvedRelation `t1`
    :  :     +- 'Project [*]
    :  :        +- 'UnresolvedRelation `t2`
    :  +- 'Project [*]
    :     +- 'UnresolvedRelation `t3`
    +- 'Project [*]
       +- 'UnresolvedRelation `t4`
    ```
    Parsed plan after the change:
    ```
    == Parsed Logical Plan ==
    'Except false
    :- 'Distinct
    :  +- 'Union
    :     :- 'Project [*]
    :     :  +- 'UnresolvedRelation `t1`
    :     +- 'Project [*]
    :        +- 'UnresolvedRelation `t2`
    +- 'Intersect false
       :- 'Project [*]
       :  +- 'UnresolvedRelation `t3`
       +- 'Project [*]
          +- 'UnresolvedRelation `t4`
    ```
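
The shape of the two parsed plans can be reproduced with a small precedence-climbing parser. This is only an illustrative sketch: the function `parse_set_ops`, the two precedence tables and the token list are hypothetical names introduced here, and the code does not reflect how Spark's own SQL parser implements the rule.

```python
# Binding strength per operator; a higher number binds tighter.
NEW_PRECEDENCE = {"UNION": 1, "EXCEPT": 1, "INTERSECT": 2}
# Old behavior: every set operator has the same precedence.
LEGACY_PRECEDENCE = {"UNION": 1, "EXCEPT": 1, "INTERSECT": 1}


def parse_set_ops(tokens, precedence):
    """Parse [operand, op, operand, ...] into a nested (op, left, right) tree."""
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while pos < len(tokens) and precedence[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            # Left-associative: the right side may only absorb operators
            # that bind strictly tighter than the current one.
            right = parse(precedence[op] + 1)
            left = (op, left, right)
        return left

    return parse(1)


# Operands stand in for the four SELECT statements in the example query.
tokens = ["t1", "UNION", "t2", "EXCEPT", "t3", "INTERSECT", "t4"]
new_tree = parse_set_ops(tokens, NEW_PRECEDENCE)
legacy_tree = parse_set_ops(tokens, LEGACY_PRECEDENCE)

print(new_tree)
print(legacy_tree)
```

Under `NEW_PRECEDENCE` the root of the tree is EXCEPT with an INTERSECT subtree on its right, matching the plan after the change. Under `LEGACY_PRECEDENCE` the root is INTERSECT over a left-nested chain, matching the plan before the change.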
    ## How was this patch tested?
    Added tests in PlanParserSuite and SQLQueryTestSuite.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dilipbiswal/spark SPARK-24966

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21941.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21941
    
----
commit c0821b6dd8e713edf2bd1ddd9a27f1999970d8f8
Author: Dilip Biswal <dbiswal@...>
Date:   2018-07-30T05:10:29Z

    [SPARK-24966] Implement precedence rules for set operations.

----

