GitHub user hvanhovell opened a pull request:

    https://github.com/apache/spark/pull/13629

    [SPARK-15370][SQL] Fix count bug

    # What changes were proposed in this pull request?
    This pull request fixes the COUNT bug in the 
`RewriteCorrelatedScalarSubquery` rule.
    
    After this change, the rule tests the expression at the root of the 
correlated subquery to determine whether the expression returns `NULL` on empty 
input. If the expression does not return `NULL`, the rule generates additional 
logic in the `Project` operator above the rewritten subquery. This additional 
logic intercepts `NULL` values coming from the outer join and replaces them 
with the value that the subquery's expression would return on empty input.
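
The effect of the COUNT bug can be reproduced outside Spark. The sketch below is a minimal illustration using Python's built-in `sqlite3` module, not Catalyst; the tables `l` and `r`, their columns, and the use of `COALESCE` with a constant `0` are assumptions made for the example. It only approximates the idea of the rule, which determines the empty-input value from the subquery's root expression rather than hard-coding it.

```python
import sqlite3

# Hypothetical tables: l is the outer relation, r the subquery's relation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE l (a INTEGER);
    CREATE TABLE r (c INTEGER);
    INSERT INTO l VALUES (1), (2);
    INSERT INTO r VALUES (1), (1);
""")

# Reference semantics: the correlated scalar subquery is evaluated per outer
# row, and COUNT(*) over an empty input is 0, not NULL.
correlated = conn.execute("""
    SELECT a, (SELECT COUNT(*) FROM r WHERE r.c = l.a) AS cnt
    FROM l ORDER BY a
""").fetchall()

# Naive decorrelation: aggregate first, then left outer join. Outer rows
# without a match receive NULL from the join, which is the COUNT bug.
naive = conn.execute("""
    SELECT l.a, sub.cnt
    FROM l LEFT OUTER JOIN (SELECT c, COUNT(*) AS cnt FROM r GROUP BY c) sub
      ON sub.c = l.a
    ORDER BY l.a
""").fetchall()

# Patched rewrite (simplified): the projection above the join replaces the
# NULL with the value the aggregate would return on empty input.
fixed = conn.execute("""
    SELECT l.a, COALESCE(sub.cnt, 0) AS cnt
    FROM l LEFT OUTER JOIN (SELECT c, COUNT(*) AS cnt FROM r GROUP BY c) sub
      ON sub.c = l.a
    ORDER BY l.a
""").fetchall()

print(correlated)  # [(1, 2), (2, 0)]
print(naive)       # [(1, 2), (2, None)]
print(fixed)       # [(1, 2), (2, 0)]
```

The row with `a = 2` has no match in `r`, so the naive join yields `NULL` where the original query yields `0`; the extra projection restores the original result.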
    
    This PR takes over https://github.com/apache/spark/pull/13155; it only 
fixes an issue with `Literal` construction and some style issues. All credit 
should go to @frreiss.
    
    # How was this patch tested?
    Added regression tests to cover all branches of the updated rule (see 
changes to `SubquerySuite`).
    Ran all existing automated regression tests after merging with latest trunk.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hvanhovell/spark SPARK-15370-cleanup

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13629.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13629
    
----
commit 3b1649105869c72ccb16f86732e04829aaae0e93
Author: frreiss <[email protected]>
Date:   2016-05-16T17:58:00Z

    Commit before merge.

commit 58df60d5468e53c4b6fc41a1d7c896abfb01cdd1
Author: frreiss <[email protected]>
Date:   2016-05-16T17:58:21Z

    Merge branch 'master' of https://github.com/apache/spark

commit 910cbf54e2300a57640e017610c204da2d462964
Author: frreiss <[email protected]>
Date:   2016-05-16T20:46:55Z

    Merge branch 'master' of https://github.com/apache/spark

commit 76d9f4528b8536d1e5680279ab76b9e26dd3a873
Author: frreiss <[email protected]>
Date:   2016-05-17T14:52:46Z

    Merge branch 'master' of https://github.com/apache/spark

commit 1615d560310a59b08a4c03677dd53eb3b9b49e06
Author: frreiss <[email protected]>
Date:   2016-05-20T02:01:33Z

    Second version of the updated rewrite

commit 1b4ba5ed629d9b1e72d919d89b3592f7b29f3f3c
Author: frreiss <[email protected]>
Date:   2016-05-20T14:57:24Z

    Merge branch 'master' of https://github.com/apache/spark

commit fb7cb4304ba02815a79278d1d5d6d194fe8db25c
Author: frreiss <[email protected]>
Date:   2016-05-24T18:11:54Z

    Merge branch 'master' of https://github.com/apache/spark

commit 8cd2877179dded4557c8da92e5b16011637289b0
Author: frreiss <[email protected]>
Date:   2016-06-10T05:02:47Z

    Addressing additional corner cases and review comments.

commit e5c592032b5604a8f8f10326ecd10ade22b5dc43
Author: Herman van Hovell <[email protected]>
Date:   2016-06-12T23:43:30Z

    Style fixes

commit 39f7e043c0abbe27823499699877e986f6fa2eb7
Author: Herman van Hovell <[email protected]>
Date:   2016-06-12T23:43:32Z

    Merge remote-tracking branch 'apache-github/master' into SPARK-15370-cleanup

commit 30dd0bd7d560151085e53667fcc4f6a8895844ed
Author: Herman van Hovell <[email protected]>
Date:   2016-06-12T23:57:18Z

    Some simplification

----


