GitHub user maropu commented on the issue:
https://github.com/apache/spark/pull/17164
A benchmark result:
https://github.com/apache/spark/pull/17164/files#diff-b7bf86a20a79d572f81093300568db6eR44
```
/*
range/limit/sum:                         Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)  Relative
----------------------------------------------------------------------------------------------
range/limit/sum wholestage off                   617 / 617        13.6          73.5      1.0X
range/limit/sum wholestage on                      70 / 92       120.2           8.3      8.8X
*/
/*
aggregate non-sorted data:               Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)  Relative
----------------------------------------------------------------------------------------------
non-sorted data wholestage off                 2540 / 2735         3.3         302.8      1.0X
non-sorted data wholestage on                  1226 / 1528         6.8         146.1      2.1X
*/
/*
aggregate cached and sorted data:        Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)  Relative
----------------------------------------------------------------------------------------------
cached and sorted data wholestage off          1455 / 1586         5.8         173.4      1.0X
cached and sorted data wholestage on             663 / 767        12.7          79.0      2.2X
*/
```
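For anyone reading the numbers above: the Relative column appears to be the "wholestage off" best time divided by the best time of each case. That definition is an assumption on my part, not something stated in the output, but it reproduces all three reported values. A minimal sketch of the check, where `BEST_MS` and `relative_speedup` are illustrative names and not part of Spark:

```python
# Best times in ms, copied from the benchmark output above.
BEST_MS = {
    "range/limit/sum": {"off": 617, "on": 70},
    "aggregate non-sorted data": {"off": 2540, "on": 1226},
    "aggregate cached and sorted data": {"off": 1455, "on": 663},
}


def relative_speedup(baseline_ms, case_ms):
    """Baseline time over case time, rounded to one decimal.

    Assumed definition of the Relative column; not taken from Spark's source.
    """
    return round(baseline_ms / case_ms, 1)


for name, best in BEST_MS.items():
    # Compare whole-stage codegen on against the off baseline.
    print(f"{name}: {relative_speedup(best['off'], best['on'])}X")
```

The ratios use the Best times rather than the Avg times; the Avg times would give smaller speedups for every case, since the "on" runs show more spread between best and average.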