GitHub user jiangxb1987 opened a pull request:

    https://github.com/apache/spark/pull/14917

    [SPARK-17142][SQL] Complex query triggers binding error in HashAggregateExec

    ## What changes were proposed in this pull request?
    
    In the `ReorderAssociativeOperator` rule, we extract the foldable operands of 
nested Add/Multiply expressions and replace them with a single evaluated literal. 
For example, '(a + 1) + (b + 2)' is optimized to '(a + b + 3)' by this rule.
    For an Aggregate operator, output expressions must be derivable from 
groupingExpressions, but the current implementation of the 
`ReorderAssociativeOperator` rule may break this promise. For example:
    SELECT
      ((t1.a + 1) + (t2.a + 2)) AS out_col
    FROM
      testdata2 AS t1
    INNER JOIN
      testdata2 AS t2
    ON
      (t1.a = t2.a)
    GROUP BY (t1.a + 1), (t2.a + 2)
    Here ((t1.a + 1) + (t2.a + 2)) is optimized to (t1.a + t2.a + 3), which can 
no longer be derived from ExpressionSet((t1.a + 1), (t2.a + 2)), so binding 
fails in HashAggregateExec.
    A possible fix is to improve the `ReorderAssociativeOperator` rule by adding 
a GroupingExpressionSet that holds Aggregate.groupingExpressions, and to leave 
these expressions intact during the optimization stage.
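
    The idea above can be illustrated with a small, self-contained sketch. It 
is not the patch itself and does not use Catalyst classes: `Expr`, `Attr`, 
`Lit`, `Add`, `flatten` and `reorder` are hypothetical names for a toy model, 
and treating each grouping expression as an opaque leaf is only one assumed 
reading of "respect these expressions".

```scala
// Toy model of the reordering; these are NOT Catalyst classes.
object ReorderSketch {
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case class Lit(value: Int) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr

  // Flatten nested Adds into a list of operands. Any expression found in
  // `grouping` is kept whole (assumption: this is how grouping expressions
  // are "respected").
  def flatten(e: Expr, grouping: Set[Expr]): Seq[Expr] = e match {
    case g if grouping.contains(g) => Seq(g)
    case Add(l, r)                 => flatten(l, grouping) ++ flatten(r, grouping)
    case other                     => Seq(other)
  }

  // Fold the literal operands into one literal when there are at least two;
  // otherwise return the expression unchanged.
  def reorder(e: Expr, grouping: Set[Expr] = Set.empty): Expr = {
    val (lits, rest) = flatten(e, grouping).partition(_.isInstanceOf[Lit])
    if (lits.size > 1) {
      val folded = Lit(lits.collect { case Lit(v) => v }.sum)
      if (rest.isEmpty) folded else Add(rest.reduce[Expr](Add(_, _)), folded)
    } else {
      e
    }
  }
}
```

    With `g1 = Add(Attr("t1.a"), Lit(1))` and `g2 = Add(Attr("t2.a"), Lit(2))`, 
calling `reorder(Add(g1, g2))` folds the literals into `Lit(3)`, while 
`reorder(Add(g1, g2), Set(g1, g2))` returns the expression unchanged, so it 
can still be derived from the grouping expressions.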
    
    
    ## How was this patch tested?
    
    Added a new test case to `ReorderAssociativeOperatorSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark rao

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14917.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14917
    
----
commit dc3b1b288d7340183250acf2765da61497790c64
Author: jiangxingbo <[email protected]>
Date:   2016-09-01T10:54:20Z

    bugfix

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
