GitHub user wangyum opened a pull request:

    https://github.com/apache/spark/pull/19727

    [WIP][SPARK-22497][SQL] Project reuse

    ## What changes were proposed in this pull request?
    
    The SQL below scans `table1` twice. This PR reuses `p1` so that `table1` is
    scanned only once.
    ```sql
    WITH p1 AS (SELECT * FROM table1 WHERE key < 100),
    s1 AS (SELECT key, count(*) FROM p1 GROUP BY key),
    s2 AS (SELECT key, count(*) FROM p1 WHERE key > -100 GROUP BY key)
    SELECT s1.* FROM s1 JOIN s2 ON s1.key = s2.key
    ```
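
As a rough illustration of the idea only (this is not the code in this patch, and it does not use any Spark API), the toy Python model below counts how often a base table is read when the shared subquery `p1` is recomputed for each consumer versus computed once and reused. Every name in it (`Table`, `run_without_reuse`, `run_with_reuse`) is hypothetical, and rows are simplified to bare `key` values.

```python
# Toy model: NOT Spark's planner and NOT the implementation in this PR.
# It only shows why reusing a shared subquery halves the number of scans.
from collections import Counter


class Table:
    """A fake base table that counts how many times it is scanned."""

    def __init__(self, rows):
        self.rows = rows
        self.scans = 0

    def scan(self):
        self.scans += 1
        return list(self.rows)


def p1(table):
    # SELECT * FROM table1 WHERE key < 100
    return [key for key in table.scan() if key < 100]


def count_by_key(keys):
    # SELECT key, count(*) ... GROUP BY key
    return Counter(keys)


def join_s1_s2(s1, s2):
    # SELECT s1.* FROM s1 JOIN s2 ON s1.key = s2.key
    return {key: n for key, n in s1.items() if key in s2}


def run_without_reuse(table):
    # Each reference to p1 re-evaluates it, so the table is scanned twice.
    s1 = count_by_key(p1(table))
    s2 = count_by_key([k for k in p1(table) if k > -100])
    return join_s1_s2(s1, s2)


def run_with_reuse(table):
    # p1 is evaluated once and shared by both aggregations.
    shared = p1(table)
    s1 = count_by_key(shared)
    s2 = count_by_key([k for k in shared if k > -100])
    return join_s1_s2(s1, s2)
```

Both functions return the same result in this model; only the `scans` counter on the table differs.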
    
    ## How was this patch tested?
    
    unit tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wangyum/spark SPARK-22497

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19727.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19727
    
----
commit 1c458b8860b3b17f137db18eff9f97df81b47a76
Author: Yuming Wang <[email protected]>
Date:   2017-11-12T16:14:38Z

    Reuse project

----


