GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/19727
[WIP][SPARK-22497][SQL] Project reuse
## What changes were proposed in this pull request?
The SQL below scans `table1` twice, once for each reference to `p1`. This PR
reuses the `p1` projection so that `table1` is scanned only once.
```sql
WITH p1 AS (SELECT * FROM table1 WHERE key < 100),
s1 AS (SELECT key, count(*) FROM p1 GROUP BY key),
s2 AS (SELECT key, count(*) FROM p1 WHERE key > -100 GROUP BY key)
SELECT s1.* FROM s1 JOIN s2 ON s1.key = s2.key
```
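To make the intended effect concrete, here is a toy model in plain Python. It is not Spark code and says nothing about how the patch is implemented; `TABLE1`, `SCANS`, and all function names are hypothetical stand-ins. It only illustrates that evaluating `p1` once and sharing it between `s1` and `s2` halves the number of scans while leaving the join result unchanged.

```python
# Toy model (NOT Spark code): count how often the base table is scanned.
# All names and data below are hypothetical and exist only for illustration.
SCANS = {"table1": 0}

# Stand-in for table1 with a single column, `key`.
TABLE1 = [-150, -5, -5, 3, 3, 3, 42, 99, 100, 250]


def scan_table1():
    """Simulate a full scan of table1 and record that it happened."""
    SCANS["table1"] += 1
    return list(TABLE1)


def p1(rows):
    """Mirrors: select * from table1 where key < 100"""
    return [k for k in rows if k < 100]


def count_by_key(rows):
    """Mirrors: SELECT key, count(*) ... group by key"""
    counts = {}
    for k in rows:
        counts[k] = counts.get(k, 0) + 1
    return counts


def join_on_key(s1, s2):
    """Mirrors: select s1.* from s1 join s2 on s1.key = s2.key"""
    return sorted((k, c) for k, c in s1.items() if k in s2)


def run_without_reuse():
    # Each reference to p1 re-evaluates it, so table1 is scanned twice.
    s1 = count_by_key(p1(scan_table1()))
    s2 = count_by_key([k for k in p1(scan_table1()) if k > -100])
    return join_on_key(s1, s2)


def run_with_reuse():
    # p1 is evaluated once and the result is shared by s1 and s2.
    shared = p1(scan_table1())
    s1 = count_by_key(shared)
    s2 = count_by_key([k for k in shared if k > -100])
    return join_on_key(s1, s2)
```

Calling `run_without_reuse()` increments `SCANS["table1"]` twice, while `run_with_reuse()` increments it once; both return the same joined rows.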
## How was this patch tested?
unit tests
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wangyum/spark SPARK-22497
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19727.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19727
----
commit 1c458b8860b3b17f137db18eff9f97df81b47a76
Author: Yuming Wang
Date: 2017-11-12T16:14:38Z
Reuse project
----