GitHub user nsyca opened a pull request:
https://github.com/apache/spark/pull/16572
[SPARK-18863][SQL] Output non-aggregate expressions without GROUP BY in a
subquery does not yield an error
## What changes were proposed in this pull request?
This PR reports proper error messages when a subquery expression
contains an invalid plan. The problem is fixed by calling CheckAnalysis on the
plan inside a subquery.
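The idea can be illustrated with a small, self-contained toy model. This is a sketch only, not Spark's Catalyst code: the `Plan` class, the `check_analysis` function, and the `descend_into_subqueries` flag are all hypothetical names invented for illustration. It shows a validity check that only catches the invalid plan from TC 01.01 once it also visits plans nested inside subquery expressions.

```python
from dataclasses import dataclass, field
from typing import List


class AnalysisError(Exception):
    """Raised when a toy plan violates an analysis rule."""


@dataclass
class Plan:
    """Hypothetical toy plan node; not a Catalyst LogicalPlan."""
    name: str
    plain_cols: List[str] = field(default_factory=list)   # non-aggregate SELECT columns
    has_aggregate: bool = False                            # SELECT contains an aggregate
    group_by: List[str] = field(default_factory=list)
    children: List["Plan"] = field(default_factory=list)   # FROM-clause inputs
    subqueries: List["Plan"] = field(default_factory=list) # plans inside subquery expressions


def check_analysis(plan: Plan, descend_into_subqueries: bool = True) -> None:
    # Rule: in an aggregating query, every plain column must be a GROUP BY column.
    if plan.has_aggregate or plan.group_by:
        for col in plan.plain_cols:
            if col not in plan.group_by:
                raise AnalysisError(
                    f"{plan.name}: column '{col}' is neither aggregated "
                    f"nor listed in GROUP BY"
                )
    # Ordinary children are always checked.
    for child in plan.children:
        check_analysis(child, descend_into_subqueries)
    # Plans nested in subquery expressions are checked only when asked to.
    if descend_into_subqueries:
        for sub in plan.subqueries:
            check_analysis(sub, descend_into_subqueries)


# Rough shape of TC 01.01: the derived table selects t2b next to avg(t2b)
# without a GROUP BY, and it sits inside a scalar subquery of the outer query.
derived = Plan("derived_table", plain_cols=["t2b"], has_aggregate=True)
scalar = Plan("scalar_subquery", has_aggregate=True, children=[derived])
outer = Plan("outer_query", plain_cols=["t1a", "t2b"], subqueries=[scalar])
```

In this toy model, `check_analysis(outer)` raises `AnalysisError` for `t2b`, whereas `check_analysis(outer, descend_into_subqueries=False)` returns without error because the invalid plan is never visited.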
## How was this patch tested?
Existing tests, plus two new test cases covering two forms of subquery:
scalar subqueries and IN/EXISTS subqueries.
````
-- TC 01.01
-- The column t2b in the SELECT of the subquery is invalid
-- because it is neither an aggregate function nor a GROUP BY column.
select t1a, t2b
from t1, t2
where t1b = t2c
and t2b = (select max(avg)
from (select t2b, avg(t2b) avg
from t2
where t2a = t1.t1b
)
)
;
-- TC 01.02
-- Invalid because the column t2b is not part of the output from table t2.
select *
from t1
where t1a in (select min(t2a)
from t2
group by t2c
having t2c in (select max(t3c)
from t3
group by t3b
having t3b > t2b ))
;
````
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/nsyca/spark 18863
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16572.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16572
----
commit b98865127a39bde885f9b1680cfe608629d59d51
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-07-29T21:43:56Z
[SPARK-16804][SQL] Correlated subqueries containing LIMIT return incorrect
results
## What changes were proposed in this pull request?
This patch fixes the incorrect results in the rule ResolveSubquery in
Catalyst's Analysis phase.
## How was this patch tested?
./dev/run-tests
a new unit test on the problematic pattern.
commit 069ed8f8e5f14dca7a15701945d42fc27fe82f3c
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-07-29T21:50:02Z
[SPARK-16804][SQL] Correlated subqueries containing LIMIT return incorrect
results
## What changes were proposed in this pull request?
This patch fixes the incorrect results in the rule ResolveSubquery in
Catalyst's Analysis phase.
## How was this patch tested?
./dev/run-tests
a new unit test on the problematic pattern.
commit edca333c081e6d4e53a91b496fba4a3ef4ee89ac
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-07-30T00:28:15Z
New positive test cases
commit 64184fdb77c1a305bb2932e82582da28bb4c0e53
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-08-01T13:20:09Z
Fix unit test case failure
commit 29f82b05c9e40e7934397257c674b260a8e8a996
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-08-05T17:42:01Z
blocking TABLESAMPLE
commit ac43ab47907a1ccd6d22f920415fbb4de93d4720
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-08-05T21:10:19Z
Fixing code styling
commit 631d396031e8bf627eb1f4872a4d3a17c144536c
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-08-07T18:39:44Z
Correcting Scala test style
commit 7eb9b2dbba3633a1958e38e0019e3ce816300514
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-08-08T02:31:09Z
One (last) attempt to correct the Scala style tests
commit 1387cf51541408ac20048064fa5e559836af932c
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-08-12T20:11:50Z
Merge remote-tracking branch 'upstream/master'
commit 3faa2d5edc030495f8b870d2c017cb714c17b6a7
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-12-14T16:35:52Z
Merge remote-tracking branch 'upstream/master'
commit a30863457ef49f99aff001b1987da75093c20f86
Author: Nattavut Sutyanyong <[email protected]>
Date: 2016-12-30T17:18:18Z
Merge remote-tracking branch 'upstream/master'
commit 2f463de8d4bf566e5fd59f39ddef6ceba5cfc894
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-01T16:18:54Z
first fix (incomplete)
commit 6e2f686f8e516e63235e1e6ccb13bdf8a9e6e314
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-03T20:06:12Z
first attempt
commit f1524b99aff70e688e4763db7898da53286a321e
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-03T22:08:03Z
Merge remote-tracking branch 'upstream/master'
commit 6dfa8e5e132fbc489e20e3181a2a2faaa339ec3a
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-04T01:58:47Z
Merge branch 'master' into 18863
commit e9bdde6e1268170ef89c2a3f402dbd766b5cad00
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-05T15:31:08Z
New test cases
commit deec874947a7028aa4a7bef0a1b5898609a6d79c
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-05T16:04:36Z
Masking exprIDs
commit 5c36dce4df0051cdf1957ac448b354db0ee22e2d
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-13T01:05:45Z
Merge remote-tracking branch 'upstream/master'
commit 98cbd606e215b9e1e57978e604755368e1bf6948
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-13T01:07:17Z
Merge branch 'master' into 18863
commit bcae3363db60cdf93d1bb9b741f96ec0e088cf0b
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-13T01:12:55Z
reverse back accidental change
commit 51f7fb92e47e92208f4e7b2d3cd6d9745177509e
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-13T01:16:05Z
port from SPARK-19017
commit 24397cf6c8728b4dfff22da14dc909dbb3b0a4e5
Author: Nattavut Sutyanyong <[email protected]>
Date: 2017-01-13T02:19:11Z
remove unrelated comment
----