[ https://issues.apache.org/jira/browse/SPARK-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Herman van Hovell updated SPARK-10100:
--------------------------------------
Attachment: SPARK-10100.perf.test.scala
[~yhuai] I did some benchmarking today (results are below). First off, I
couldn't find a major regression when switching from the old MIN/MAX
functions to the new functions, only a minor one (~1%); what did you find?
The modified functions perform about 2% better than the old interface.
{noformat}
MASTER Aggregate2
Benchmark
[0]: 15890 ms.
[1]: 15968 ms.
[2]: 15590 ms.
[3]: 15605 ms.
[4]: 15712 ms.
[5]: 15489 ms.
[6]: 15610 ms.
[7]: 15741 ms.
[8]: 15632 ms.
[9]: 15570 ms.
[10]: 15638 ms.
avg. 15676 ms.
MASTER Aggregate1
Benchmark
[0]: 15543 ms.
[1]: 15569 ms.
[2]: 15613 ms.
[3]: 15538 ms.
[4]: 15588 ms.
[5]: 15655 ms.
[6]: 15599 ms.
[7]: 15654 ms.
[8]: 15517 ms.
[9]: 15622 ms.
[10]: 15562 ms.
avg. 15587 ms.
MASTER Aggregate2 Modified
Benchmark
[0]: 15281 ms.
[1]: 15397 ms.
[2]: 15644 ms.
[3]: 15367 ms.
[4]: 15490 ms.
[5]: 15229 ms.
[6]: 15148 ms.
[7]: 15142 ms.
[8]: 15268 ms.
[9]: 15277 ms.
[10]: 15286 ms.
avg. 15320 ms.
{noformat}
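For reference, the percentages quoted above can be re-derived from the raw timings. The following is only a minimal sketch in Python, not part of the attached SPARK-10100.perf.test.scala; the dictionary keys and the `relative_change` helper are names made up for illustration, and the timings are copied verbatim from the output above.

```python
from statistics import mean, stdev

# Per-run timings in ms, copied from the benchmark output above.
# The keys are illustrative labels, not identifiers from the Spark code base.
runs = {
    "Aggregate2": [15890, 15968, 15590, 15605, 15712, 15489,
                   15610, 15741, 15632, 15570, 15638],
    "Aggregate1": [15543, 15569, 15613, 15538, 15588, 15655,
                   15599, 15654, 15517, 15622, 15562],
    "Aggregate2 Modified": [15281, 15397, 15644, 15367, 15490, 15229,
                            15148, 15142, 15268, 15277, 15286],
}


def relative_change(new_ms, base_ms):
    """Percent change of new_ms relative to base_ms (positive means slower)."""
    return 100.0 * (new_ms - base_ms) / base_ms


# Mean and sample standard deviation per configuration.
averages = {name: mean(ms) for name, ms in runs.items()}
spreads = {name: stdev(ms) for name, ms in runs.items()}

# Both comparisons use the old interface (Aggregate1) as the baseline.
regression_pct = relative_change(averages["Aggregate2"], averages["Aggregate1"])
improvement_pct = relative_change(averages["Aggregate2 Modified"],
                                  averages["Aggregate1"])

for name in runs:
    print(f"{name}: avg {averages[name]:.0f} ms, stdev {spreads[name]:.0f} ms")
print(f"Aggregate2 vs Aggregate1: {regression_pct:+.2f}%")
print(f"Aggregate2 Modified vs Aggregate1: {improvement_pct:+.2f}%")
```

With these numbers the new functions come out roughly 0.6% slower than the old interface and the modified functions roughly 1.7% faster, which matches the ~1% and ~2% figures above. The standard deviations are printed so the differences can be compared against run-to-run noise.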
> AggregateFunction2's Max is slower than AggregateExpression1's MaxFunction
> --------------------------------------------------------------------------
>
> Key: SPARK-10100
> URL: https://issues.apache.org/jira/browse/SPARK-10100
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 1.5.0
> Reporter: Yin Huai
> Assignee: Herman van Hovell
> Attachments: SPARK-10100.perf.test.scala
>
>
> Looks like Max (and probably Min) implemented based on AggregateFunction2 is
> slower than the old MaxFunction.