[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18258 It seems to me that adding the hash map metrics to the join operator can be done in a later PR, so this change can stay small and easy to review. --- If your project is set up for it, you can reply to this email and have your

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18258 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77876/ Test PASSed.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18258 Merged build finished. Test PASSed.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18258 **[Test build #77876 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77876/testReport)** for PR 18258 at commit [`ee3d88f`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18258 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77872/ Test PASSed.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18258 Merged build finished. Test PASSed.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18258 **[Test build #77872 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77872/testReport)** for PR 18258 at commit [`55cd6ad`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18258 **[Test build #77876 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77876/testReport)** for PR 18258 at commit [`ee3d88f`](https://github.com/apache/spark/commit/ee

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18258 **[Test build #77872 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77872/testReport)** for PR 18258 at commit [`55cd6ad`](https://github.com/apache/spark/commit/55

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18258 Ok. I'll remove the flag. Thanks.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18258 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77866/ Test PASSed.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18258 Merged build finished. Test PASSed.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18258 **[Test build #77866 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77866/testReport)** for PR 18258 at commit [`e4cfe1c`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18258 Thanks!

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18258 If there is no regression, I'd remove the flag.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18258 Sure. Three times for each. Track = F: Aggregate w keys: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative -
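
The column headers quoted above (Best/Avg Time(ms), Rate(M/s), Per Row(ns), Relative) can be illustrated with a small sketch. This is not Spark's benchmark code: `summarize`, `relative`, and `run_case` are hypothetical helpers, and it is an assumption here that the rate and per-row figures are derived from the best (fastest) run rather than the average.

```python
import time
from typing import Callable, Dict, List


def summarize(times_ns: List[int], num_rows: int) -> Dict[str, float]:
    """Turn raw per-iteration timings into benchmark-style columns.

    Assumption: rate and per-row cost are based on the best run.
    """
    best_ns = min(times_ns)
    avg_ns = sum(times_ns) / len(times_ns)
    return {
        "best_ms": best_ns / 1e6,
        "avg_ms": avg_ns / 1e6,
        # rows per microsecond is the same as millions of rows per second
        "rate_m_per_s": num_rows / (best_ns / 1e3),
        "per_row_ns": best_ns / num_rows,
    }


def relative(baseline_best_ms: float, best_ms: float) -> float:
    """Speed relative to a baseline case (values above 1.0 mean faster)."""
    return baseline_best_ms / best_ms


def run_case(case: Callable[[], None], num_rows: int, iterations: int = 3) -> Dict[str, float]:
    """Time a callable several times (three by default) and summarize."""
    times_ns = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        case()
        times_ns.append(time.perf_counter_ns() - start)
    return summarize(times_ns, num_rows)


if __name__ == "__main__":
    # Hypothetical workload: sum one million integers.
    print(run_case(lambda: sum(range(1_000_000)), num_rows=1_000_000))
```

Running each configuration several times and comparing best and average times side by side helps separate a real regression from run-to-run noise.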

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18258 Can you run it a few more times to tell? Right now it's a difference of almost 7%.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18258 Is it significant? It seems to me that it's within the variance between different runs.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18258 16.8 vs 15.8?
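
For reference, the gap between the two quoted figures can be computed directly. The sketch below is illustrative only: the excerpt does not say which benchmark column 16.8 and 15.8 come from, and `relative_difference` is a hypothetical helper that measures the gap against the smaller value.

```python
def relative_difference(a: float, b: float) -> float:
    """Return |a - b| as a fraction of the smaller of the two values."""
    return abs(a - b) / min(a, b)


if __name__ == "__main__":
    diff = relative_difference(16.8, 15.8)
    print(f"{diff:.1%}")  # prints 6.3%
```

Measured against the larger value instead, the same gap is a little under 6%, so the choice of baseline should be stated whenever a percentage is reported.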

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18258 I just ran the existing `AggregateBenchmark` with the new tracking config: Java HotSpot(TM) 64-Bit Server VM 1.8.0_102-b14 on Linux 4.9.27-moby Intel(R) Core(TM) i7-5557U CPU @ 3.1

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18258 Sure. Will update later.

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18258 Can you test the perf degradation?

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18258 The `enablePerfMetrics` parameter of `UnsafeFixedWidthAggregationMap` has this comment: * @param enablePerfMetrics if true, performance metrics will be recorded (has minor perf impact)
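
To make the idea of flag-gated metrics concrete, here is a minimal sketch of a hash map that counts lookups and probes only when a flag is set. It is a toy illustration, not the implementation of `UnsafeFixedWidthAggregationMap`: the class `ProbeCountingMap`, its fields, and `average_probes_per_lookup` are hypothetical names, and the map uses fixed-capacity linear probing with no resizing.

```python
from typing import List, Optional, Tuple


class ProbeCountingMap:
    """Toy open-addressing map with optional lookup metrics (hypothetical)."""

    def __init__(self, capacity: int = 16, enable_perf_metrics: bool = False) -> None:
        self._slots: List[Optional[Tuple[int, str]]] = [None] * capacity
        self._enable_perf_metrics = enable_perf_metrics
        self.num_lookups = 0
        self.num_probes = 0

    def _find_slot(self, key: int) -> int:
        """Return the slot holding `key`, or the first empty slot for it."""
        if self._enable_perf_metrics:
            self.num_lookups += 1
        capacity = len(self._slots)
        pos = hash(key) % capacity
        for _ in range(capacity):
            if self._enable_perf_metrics:
                # The extra cost of tracking is one branch and one increment per probe.
                self.num_probes += 1
            entry = self._slots[pos]
            if entry is None or entry[0] == key:
                return pos
            pos = (pos + 1) % capacity  # linear probing
        raise RuntimeError("map is full")

    def put(self, key: int, value: str) -> None:
        self._slots[self._find_slot(key)] = (key, value)

    def get(self, key: int) -> Optional[str]:
        entry = self._slots[self._find_slot(key)]
        return None if entry is None else entry[1]

    def average_probes_per_lookup(self) -> float:
        return self.num_probes / self.num_lookups if self.num_lookups else 0.0


if __name__ == "__main__":
    m = ProbeCountingMap(capacity=8, enable_perf_metrics=True)
    m.put(0, "a")
    m.put(8, "b")  # collides with key 0 in a table of 8 slots
    print(m.get(8), m.average_probes_per_lookup())
```

In a sketch like this the tracking cost is small but sits on the hottest path of the map, which is why measuring it with a benchmark is more reliable than reasoning about it.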

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18258 Why would the tracking have a perf impact? It's just a simple counter increment, isn't it?

[GitHub] spark issue #18258: [SPARK-20953][SQL][WIP] Add hash map metrics to aggregat...

2017-06-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18258 **[Test build #77866 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77866/testReport)** for PR 18258 at commit [`e4cfe1c`](https://github.com/apache/spark/commit/e4