GitHub user hvanhovell opened a pull request:

    https://github.com/apache/spark/pull/12954

    [SPARK-15122][SQL] Fix TPC-DS 41 - Normalize predicates before pulling them out

    ## What changes were proposed in this pull request?
    The official TPC-DS 41 query currently fails because it contains a scalar
    subquery with a disjunctive correlated predicate (the correlated
    predicates are nested in ORs). This makes the `Analyzer` pull out the
    entire disjunctive predicate, which is wrong, and triggers the following
    analysis exception (itself correct, given what was pulled out): `The
    correlated scalar subquery can only contain equality predicates`
    
    This PR fixes this by first simplifying (or normalizing) the correlated
    predicates before pulling them out of the subquery. I have also added a
    small optimizer rule that rewrites correlated scalar subqueries into
    predicate subqueries if they are used in a `Filter` and are wrapped by a
    predicate. This allows us to use semi joins instead of left outer joins.
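
    As a rough illustration of the normalization step, the sketch below
    factors conjuncts that are shared by every branch of an OR out of the
    disjunction, i.e. `(a AND b) OR (a AND c)` becomes `a AND (b OR c)`. It
    is a toy model and not the code in this PR: `Expr`, `Pred`, `And`, `Or`
    and `Normalize.factorCommon` are hypothetical names, not Catalyst
    classes, and the predicate strings in the usage note are made up.

```scala
// Toy expression tree; these types are illustrative, not Catalyst classes.
sealed trait Expr
final case class Pred(name: String) extends Expr
final case class And(left: Expr, right: Expr) extends Expr
final case class Or(left: Expr, right: Expr) extends Expr

object Normalize {
  // Flatten nested ANDs into a list of conjuncts.
  def conjuncts(e: Expr): List[Expr] = e match {
    case And(l, r) => conjuncts(l) ++ conjuncts(r)
    case other     => List(other)
  }

  // Flatten nested ORs into a list of disjuncts.
  def disjuncts(e: Expr): List[Expr] = e match {
    case Or(l, r) => disjuncts(l) ++ disjuncts(r)
    case other    => List(other)
  }

  private def andAll(es: List[Expr]): Expr =
    es.reduceLeft[Expr]((a, b) => And(a, b))

  private def orAll(es: List[Expr]): Expr =
    es.reduceLeft[Expr]((a, b) => Or(a, b))

  // Pull conjuncts that occur in every OR branch in front of the disjunction.
  def factorCommon(e: Expr): Expr = {
    val branches = disjuncts(e).map(conjuncts)
    val common = branches
      .reduceLeft[List[Expr]]((a, b) => a.filter(x => b.contains(x)))
      .distinct
    if (branches.size < 2 || common.isEmpty) {
      e // nothing to factor out
    } else {
      val rests = branches.map(_.filterNot(x => common.contains(x)))
      if (rests.exists(_.isEmpty)) {
        andAll(common) // absorption: a OR (a AND b) is equivalent to a
      } else {
        andAll(common :+ orAll(rests.map(andAll)))
      }
    }
  }
}
```

    With `a = Pred("inner.k = outer.k")` standing in for the correlated
    equality and `b`, `c` for uncorrelated filters,
    `Normalize.factorCommon(Or(And(a, b), And(a, c)))` yields
    `And(a, Or(b, c))`, so only the equality `a` would need to be pulled out
    of the subquery.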
    
    ## How was this patch tested?
    Manual testing on TPC-DS 41, and added a test to SubquerySuite.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hvanhovell/spark SPARK-15122

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/12954.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12954
    
----
commit f0871c921285a05602cf566c9f2c23901224d73e
Author: Herman van Hovell <[email protected]>
Date:   2016-05-06T13:39:43Z

    Fix TPC-DS 41 - normalize predicates before pulling them out.

----


