GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/11840

    [SPARK-14019] [SQL] Remove noop SortOrder in Sort

    #### What changes were proposed in this pull request?
    
    This PR adds a new Optimizer rule that prunes no-op sort orders from `Sort`.
    During the **Optimizer** phase, a `SortOrder` whose expression does not
    reference any attribute has no effect on the sorting result, so it is
    removed. If that leaves `Sort` with no sort orders, the whole `Sort` is
    removed.
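    
    The pruning described above can be sketched on a toy plan model. This is
    only an illustration, not the code in this PR: `Expr`, `Attr`, `Lit`, `Add`,
    `Relation` and the simplified `SortOrder`/`Sort` are hypothetical stand-ins
    for the Catalyst classes, and the rule name is borrowed from the commit
    message. It takes the description at face value (no references means
    no-op) and does not consider other cases such as non-deterministic
    expressions.
    
    ```scala
    // Toy model only: these types mimic, but are not, Spark's Catalyst classes.
    sealed trait Expr { def references: Set[String] }
    case class Attr(name: String) extends Expr {
      def references: Set[String] = Set(name)
    }
    case class Lit(value: Any) extends Expr {
      def references: Set[String] = Set.empty
    }
    case class Add(left: Expr, right: Expr) extends Expr {
      def references: Set[String] = left.references ++ right.references
    }

    case class SortOrder(child: Expr, ascending: Boolean = true)

    sealed trait Plan
    case class Relation(output: Seq[String]) extends Plan
    case class Sort(order: Seq[SortOrder], global: Boolean, child: Plan) extends Plan

    object PruneSorts {
      def apply(plan: Plan): Plan = plan match {
        case Sort(order, global, child) =>
          // Keep only the sort orders that reference at least one attribute.
          val kept = order.filter(_.child.references.nonEmpty)
          val newChild = apply(child)
          // If nothing is left to sort on, drop the Sort node entirely.
          if (kept.isEmpty) newChild else Sort(kept, global, newChild)
        case other => other
      }
    }

    object PruneSortsDemo {
      def main(args: Array[String]): Unit = {
        val t = Relation(Seq("a", "b"))
        // Toy counterpart of: SELECT * FROM t ORDER BY NULL + 5
        val noop = Sort(Seq(SortOrder(Add(Lit(null), Lit(5)))), global = true, t)
        assert(PruneSorts(noop) == t)
        // A Sort that mixes a no-op order with a real one keeps only the real one.
        val mixed = Sort(Seq(SortOrder(Lit(1)), SortOrder(Attr("a"))), global = true, t)
        assert(PruneSorts(mixed) == Sort(Seq(SortOrder(Attr("a"))), global = true, t))
      }
    }
    ```
    
    In this sketch the first plan collapses to the bare relation, which mirrors
    the optimized plan shown after the fix below.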
    
    For example, consider the following SQL query:
    ```SQL
    SELECT * FROM t ORDER BY NULL + 5
    ```
    
    Before the fix, the plans look like this:
    ```
    == Analyzed Logical Plan ==
    a: int, b: int
    Sort [(cast(null as int) + 5) ASC], true
    +- Project [a#92,b#93]
       +- SubqueryAlias t
          +- Project [_1#89 AS a#92,_2#90 AS b#93]
             +- LocalRelation [_1#89,_2#90], [[1,2],[1,2]]
    
    == Optimized Logical Plan ==
    Sort [null ASC], true
    +- LocalRelation [a#92,b#93], [[1,2],[1,2]]
    
    == Physical Plan ==
    WholeStageCodegen
    :  +- Sort [null ASC], true, 0
    :     +- INPUT
    +- Exchange rangepartitioning(null ASC, 5), None
       +- LocalTableScan [a#92,b#93], [[1,2],[1,2]]
    ```
    
    After the fix, the plans look like this:
    ```
    == Analyzed Logical Plan ==
    a: int, b: int
    Sort [(cast(null as int) + 5) ASC], true
    +- Project [a#92,b#93]
       +- SubqueryAlias t
          +- Project [_1#89 AS a#92,_2#90 AS b#93]
             +- LocalRelation [_1#89,_2#90], [[1,2],[1,2]]
    
    == Optimized Logical Plan ==
    LocalRelation [a#92,b#93], [[1,2],[1,2]]
    
    == Physical Plan ==
    LocalTableScan [a#92,b#93], [[1,2],[1,2]]
    ```
    
    cc @rxin @cloud-fan @marmbrus Thanks!
    
    #### How was this patch tested?
    Added a test suite covering this rule.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark sortElimination

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11840.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11840
    
----
commit 96cd8ce0e7b3729483a21b73b9c54c480d627ab7
Author: gatorsmile <[email protected]>
Date:   2016-03-19T05:19:16Z

    PruneSorts

----

