GitHub user heary-cao opened a pull request:

    https://github.com/apache/spark/pull/20541

    [SPARK-23356][SQL] Pushes Project to both sides of Union when expression is non-deterministic

    ## What changes were proposed in this pull request?
    
    Currently, the PushProjectionThroughUnion optimizer rule only pushes a
    Project operator down to both sides of a Union operator when the
    expressions are deterministic. As with filter pushdown, we can also
    support pushing a Project operator down to both sides of a Union operator
    when an expression is non-deterministic. This PR fixes this problem. The
    explain output now looks like:
    
    ```
    === Applying Rule org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion ===
    Input LogicalPlan:
    Project [a#0, rand(10) AS rnd#9]
    +- Union
       :- LocalRelation <empty>, [a#0, b#1, c#2]
       :- LocalRelation <empty>, [d#3, e#4, f#5]
       +- LocalRelation <empty>, [g#6, h#7, i#8]
    
    Output LogicalPlan:
    Project [a#0, rand(10) AS rnd#9]
    +- Union
       :- Project [a#0]
       :  +- LocalRelation <empty>, [a#0, b#1, c#2]
       :- Project [d#3]
       :  +- LocalRelation <empty>, [d#3, e#4, f#5]
       +- Project [g#6]
          +- LocalRelation <empty>, [g#6, h#7, i#8]
    ```
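
The plan rewrite shown above can be sketched with a small toy model. This is only an illustration of the input and output plans in the trace, not Catalyst's implementation: the classes `Attr`, `Rand`, `Relation`, `Union`, `Project` and the function `prune_union_children` are hypothetical names introduced here, and the sketch assumes every Union child is a plain relation whose columns line up by position. It keeps the top-level Project (including the non-deterministic `rand(10)`) above the Union and adds a Project to each side containing only the referenced column.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Attr:
    """A plain column reference (hypothetical stand-in for an attribute)."""
    name: str


@dataclass(frozen=True)
class Rand:
    """A non-deterministic expression such as rand(10) AS rnd."""
    seed: int
    alias: str


@dataclass(frozen=True)
class Relation:
    output: Tuple[str, ...]


@dataclass(frozen=True)
class Union:
    children: tuple


@dataclass(frozen=True)
class Project:
    exprs: tuple
    child: object


def prune_union_children(plan):
    """Leave the top Project untouched; project each Union child down to
    the columns that the top Project actually reads."""
    if not (isinstance(plan, Project) and isinstance(plan.child, Union)):
        return plan
    children = plan.child.children
    first_output = children[0].output
    # Union columns are matched by position, so record the positions of the
    # referenced attributes in the first child's output.
    positions = [first_output.index(e.name) for e in plan.exprs if isinstance(e, Attr)]
    pruned = tuple(
        Project(tuple(Attr(c.output[i]) for i in positions), c) for c in children
    )
    return Project(plan.exprs, Union(pruned))


# Same shape as the plans in the trace above.
plan = Project(
    (Attr("a"), Rand(10, "rnd")),
    Union((
        Relation(("a", "b", "c")),
        Relation(("d", "e", "f")),
        Relation(("g", "h", "i")),
    )),
)
optimized = prune_union_children(plan)
```

Here `optimized` keeps `rand(10)` in the top-level project list, while the three Union children become projections of `a`, `d` and `g` respectively, matching the Output LogicalPlan in the trace.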
    
    ## How was this patch tested?
    
    Added new test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/heary-cao/spark PushProjectionThroughUnion

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20541.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20541
    
----
commit 36dbc9c543f36dc5952a89c354bd70067ddd6883
Author: caoxuewen <cao.xuewen@...>
Date:   2018-02-08T08:02:17Z

    Pushes Project to both sides of Union when expression is non-deterministic

----


