GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/19501
[SPARK-22223][SQL] ObjectHashAggregate should not introduce unnecessary shuffle
## What changes were proposed in this pull request?
`ObjectHashAggregateExec` should override `outputPartitioning` to avoid
an unnecessary shuffle.
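
For context, here is a toy model of why the override matters. It is not Spark code and is written in Python only to keep it self-contained. The names `Partitioning`, `Plan`, `Shuffle`, `ensure_clustered`, `build`, and the two aggregate classes are hypothetical. The model assumes an operator that does not override its output partitioning reports it as unknown, and that the planner adds a shuffle whenever a child's reported partitioning does not satisfy what the parent requires. Matching on exact key equality is a simplification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Partitioning:
    """How an operator's output rows are spread across partitions."""
    keys: Optional[Tuple[str, ...]] = None  # None models an unknown partitioning

    def satisfies(self, required_keys: Tuple[str, ...]) -> bool:
        # Simplification: only an exact key match counts as "already clustered".
        return self.keys is not None and self.keys == required_keys


class Plan:
    """Toy physical operator with at most one child."""

    def __init__(self, name: str, child: Optional["Plan"] = None):
        self.name = name
        self.child = child

    def output_partitioning(self) -> Partitioning:
        # Default: the operator makes no promise about its output layout.
        return Partitioning()


class Shuffle(Plan):
    """Repartitions its child's rows by the given keys."""

    def __init__(self, keys: Tuple[str, ...], child: Plan):
        super().__init__("Shuffle", child)
        self.keys = keys

    def output_partitioning(self) -> Partitioning:
        return Partitioning(self.keys)


class AggregateWithoutOverride(Plan):
    """Aggregate that keeps the default (unknown) output partitioning."""

    def __init__(self, keys: Tuple[str, ...], child: Plan):
        super().__init__("Aggregate", child)
        self.keys = keys


class AggregateWithOverride(AggregateWithoutOverride):
    """Aggregate that reports its child's partitioning as its own."""

    def output_partitioning(self) -> Partitioning:
        return self.child.output_partitioning()


def ensure_clustered(plan: Plan, keys: Tuple[str, ...]) -> Plan:
    """Wrap `plan` in a Shuffle only if it is not already clustered by `keys`."""
    if plan.output_partitioning().satisfies(keys):
        return plan
    return Shuffle(keys, plan)


def count_shuffles(plan: Optional[Plan]) -> int:
    """Count Shuffle nodes along the single-child chain."""
    n = 0
    while plan is not None:
        if isinstance(plan, Shuffle):
            n += 1
        plan = plan.child
    return n


def build(aggregate_cls) -> Plan:
    """Scan, aggregate by `k`, then feed a parent that also needs clustering by `k`."""
    keys = ("k",)
    aggregate = aggregate_cls(keys, ensure_clustered(Plan("Scan"), keys))
    return ensure_clustered(aggregate, keys)


if __name__ == "__main__":
    print("without override:", count_shuffles(build(AggregateWithoutOverride)))
    print("with override:   ", count_shuffles(build(AggregateWithOverride)))
```

In this model the aggregate without the override hides the fact that its input is already clustered by `k`, so the parent's requirement triggers a second shuffle. With the override the existing partitioning is visible and only the first shuffle remains.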
## How was this patch tested?
Added Jenkins test.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/viirya/spark-1 SPARK-22223
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19501.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19501
----
commit c84562763034e3fc6a7ddba785131cb4a1c36eb4
Author: Liang-Chi Hsieh <[email protected]>
Date: 2017-10-15T06:02:59Z
ObjectHashAggregate should not introduce unnecessary shuffle.
----