GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/9038

    [SPARK-11017] Support ImperativeAggregates in TungstenAggregate

    This patch extends TungstenAggregate to support ImperativeAggregate
    functions. Previously, the TungstenAggregate operator supported only
    DeclarativeAggregate functions, which are defined in terms of Catalyst
    expressions and can be evaluated via generated projections.
    ImperativeAggregate functions, on the other hand, are evaluated by calling
    their `initialize`, `update`, `merge`, and `eval` methods.
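
To make the distinction concrete, here is a minimal sketch of the two styles. It is written in Python purely for illustration (Spark itself is Scala), and every name in it (`DeclarativeSum`, `ImperativeMax`, `run_declarative`, `run_imperative`) is hypothetical rather than Spark's API. Plain lambdas stand in for Catalyst expressions.

```python
# Illustrative sketch only; these classes are NOT Spark's real interfaces.


class DeclarativeSum:
    """Declarative style: the aggregate is described as expressions.

    The lambdas stand in for Catalyst expressions, which an engine could
    compile into a generated projection instead of calling methods.
    """
    initial_values = [0]
    update_expressions = [lambda buf, row: buf[0] + row]
    merge_expressions = [lambda left, right: left[0] + right[0]]
    evaluate_expression = staticmethod(lambda buf: buf[0])


class ImperativeMax:
    """Imperative style: the engine calls methods that mutate the buffer."""

    def initialize(self, buf):
        buf[0] = None

    def update(self, buf, row):
        if buf[0] is None or row > buf[0]:
            buf[0] = row

    def merge(self, buf, other):
        # Fold another partial buffer into this one.
        if other[0] is not None:
            self.update(buf, other[0])

    def eval(self, buf):
        return buf[0]


def run_declarative(agg, rows):
    buf = list(agg.initial_values)
    for row in rows:
        # Evaluate every update expression against the old buffer values.
        buf = [expr(buf, row) for expr in agg.update_expressions]
    return agg.evaluate_expression(buf)


def run_imperative(agg, rows):
    buf = [None]
    agg.initialize(buf)
    for row in rows:
        agg.update(buf, row)
    return agg.eval(buf)
```

For example, `run_declarative(DeclarativeSum, [1, 2, 3])` sums the rows by repeatedly applying the update expression, while `run_imperative(ImperativeMax(), [3, 7, 5])` drives the method-based lifecycle.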
    
    The basic strategy here is similar to how SortBasedAggregate evaluates both
    types of aggregate functions: use a generated projection to evaluate the
    expression-based declarative aggregates, with dummy placeholder expressions
    inserted in place of the imperative aggregate functions' outputs, then
    invoke the imperative aggregate functions so that they operate directly on
    the aggregation buffer. The bulk of the diff consists of code that was
    copied and adapted from SortBasedAggregate, with some key changes to handle
    TungstenAggregate's sort fallback path.
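
The mixed evaluation strategy described above can be sketched roughly as follows. This is an illustrative Python model, not the actual Scala code in the patch: `NO_OP`, `make_update_projection`, `ImperativeMax`, `buffer_offset`, and `aggregate` are all hypothetical names, and the sort fallback path is not modeled. The projection updates only the declarative slots of a shared buffer and skips placeholder slots; the imperative function then writes to its own slot.

```python
# Illustrative sketch only; names and buffer layout are assumptions.

NO_OP = object()  # placeholder marking a slot the projection must not touch


def make_update_projection(expressions):
    """Return a function that updates the buffer slot by slot, skipping NO_OP."""
    def project(buf, row):
        new_values = [
            buf[i] if expr is NO_OP else expr(buf, row)
            for i, expr in enumerate(expressions)
        ]
        buf[:] = new_values  # mutate the shared buffer in place
    return project


class ImperativeMax:
    """Imperative aggregate that owns one slot of the shared buffer."""

    def __init__(self, buffer_offset):
        self.buffer_offset = buffer_offset

    def initialize(self, buf):
        buf[self.buffer_offset] = None

    def update(self, buf, row):
        current = buf[self.buffer_offset]
        if current is None or row > current:
            buf[self.buffer_offset] = row

    def eval(self, buf):
        return buf[self.buffer_offset]


def aggregate(rows):
    # Assumed layout: slot 0 = sum, slot 1 = count (both declarative),
    # slot 2 = max (imperative).
    buf = [0, 0, None]
    imperative = [ImperativeMax(buffer_offset=2)]
    project = make_update_projection([
        lambda b, row: b[0] + row,  # declarative sum
        lambda b, row: b[1] + 1,    # declarative count
        NO_OP,                      # imperative slot, left untouched
    ])
    for func in imperative:
        func.initialize(buf)
    for row in rows:
        project(buf, row)           # step 1: expression-based aggregates
        for func in imperative:     # step 2: method-based aggregates
            func.update(buf, row)
    return buf[0], buf[1], imperative[0].eval(buf)
```

Calling `aggregate([3, 7, 5])` runs both kinds of aggregates over the same buffer in a single pass over the input.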

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark support-interpreted-in-tungsten-agg-final

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9038.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9038
    
----
commit 9d141c44818d415b37965444bd001ea8aaa54877
Author: Josh Rosen <[email protected]>
Date:   2015-10-07T23:57:02Z

    Add initialInputBufferOffset to TungstenAggregate.

commit 3e92fd1176403512e6076acebc6241d33123f95d
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T00:36:08Z

    Refactor TungstenAggregationIterator constructor to accept imperative aggregate functions, too.

commit 78aaab2b5844b093705bac04cca6e6a55a8c7a8d
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T00:39:09Z

    Try enabling ImperativeAggregate for agg queries w/o distinct to see what breaks.

commit fc9c2a0a9866a768b029dc9588bd63a99d46dcff
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T01:41:20Z

    Fix result projection for interpreted aggs.

commit fdd6b91156300c7910e90f44d4a929e1bbc63640
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T01:51:43Z

    Use SpecificMutableRow in more places

commit cec6cef6669680c86c49100cca1283fcb608aa64
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T02:25:50Z

    Re-initialize aggregate functions when switching to sort.

commit 53b6462f677c7746589d7d3e8a65ee74c8135b59
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T02:26:23Z

    Remove stray println

commit 2d2ab17a7793c1f53939baab685f84e1bf468350
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T02:35:24Z

    Fix None.get issue.

commit c945bd610a52ad7732cb217fd5b5c077640fe221
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T02:45:56Z

    Work around lazy val initialization issues to fix attr. binding errors.

commit b5be45402612be83fad6196870e53d2fec7b87d5
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T02:52:06Z

    Use NoOp instead of a null literal.

commit e820d78e645bad43ff3641f368f4796d42686b1e
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T08:17:32Z

    Improvements to agg buffer initialization.

commit 7a34e03696bb8b96b23c7a5c6fa9ead169ce4602
Author: Josh Rosen <[email protected]>
Date:   2015-10-08T18:21:37Z

    Reset input buffer offset after spilling to sort

----

