[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-30 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/9067 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-30 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-152638318 @viirya I did not realize that you had done similar things. I created https://github.com/davies/spark/commit/5707f5b3421fdab0b01ee2a66acf50b59752152b, could you review

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-28 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-151793559 @rxin I ran a simple performance measurement as follows. Record count: 1333635318 Records after group by: 259200 The SQL query looks like: `SELECT SUM(a) as

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-151179909 Merged build started.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-151179890 Merged build triggered.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-151182583 **[Test build #44359 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44359/consoleFull)** for PR 9067 at commit

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-151228414 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-151228419 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-15 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-148296408 It's best to have a feature flag for this, in case it yields worse performance. Eventually we should find a way to make this the default. Can you do some

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-15 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-148294059 @JoshRosen Thanks for explaining this patch. It does exactly what you said. @rxin Do you think it is ok to add a configuration for turning on/off this feature?

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-14 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-148182209 Can you explain what you mean by "mixing"?

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-14 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/9067#discussion_r42060953 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TungstenAggregationIterator.scala --- @@ -470,12 +484,27 @@ class

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-14 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-148218738 @rxin, my understanding of this patch is that it lets us continue to perform hash-based pre-aggregation on the remainder of the iterator after we've decided to spill
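
The idea described above (keep doing hash-based pre-aggregation on the rest of the input after a spill, then combine the results with a sort-based merge) can be illustrated with a small standalone sketch. This is not the code in the pull request or in `TungstenAggregationIterator`; the function name `mixed_aggregate` and the `max_groups` cap (a stand-in for a memory limit) are hypothetical, and the aggregate is assumed to be a simple sum over keys that can be ordered.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def mixed_aggregate(rows, max_groups):
    """Sum values per key using a bounded hash map plus sorted spill runs.

    `rows` is an iterable of (key, value) pairs. `max_groups` is a
    hypothetical cap on the number of in-memory groups, standing in
    for a real memory limit.
    """
    runs = []   # each spilled run is a key-sorted list of (key, partial_sum)
    table = {}  # current in-memory hash map of partial sums

    for key, value in rows:
        if key not in table and len(table) >= max_groups:
            # The map is full: spill it as a sorted run, then keep
            # hashing the remaining input into a fresh map.
            runs.append(sorted(table.items()))
            table = {}
        table[key] = table.get(key, 0) + value

    # The last map becomes the final sorted run.
    runs.append(sorted(table.items()))

    # Sort-based step: merge the sorted runs and combine equal keys.
    merged = heapq.merge(*runs, key=itemgetter(0))
    return [
        (key, sum(partial for _, partial in group))
        for key, group in groupby(merged, key=itemgetter(0))
    ]
```

In this sketch every spilled run already holds partial sums, so the merge step sees at most one entry per key per run rather than one entry per input row.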

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-14 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-148219612 Ideally we should be able to turn partial aggregation off when we don't see reduction. We had that in Shark, and a lot of query engines do this.
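
One way to picture "turning partial aggregation off when we don't see reduction" is to sample the start of the input and measure how much the row count shrinks. The sketch below is only an illustration of that general idea, not how Shark or Spark implement it; `partial_aggregate`, `sample_size`, and `min_reduction` are hypothetical names, and the aggregate is assumed to be a sum.

```python
def partial_aggregate(rows, sample_size=1000, min_reduction=0.5):
    """Yield (key, value) pairs, pre-aggregating only when it pays off.

    The first `sample_size` rows are aggregated into a hash map. If the
    fraction of rows eliminated is below `min_reduction`, the rest of
    the input is passed through unchanged. Both parameters are
    hypothetical tuning knobs.
    """
    rows = iter(rows)
    partials = {}
    seen = 0

    # Sampling phase: aggregate a prefix of the input.
    for key, value in rows:
        partials[key] = partials.get(key, 0) + value
        seen += 1
        if seen >= sample_size:
            break

    # Reduction = fraction of sampled rows removed by grouping.
    if seen and 1 - len(partials) / seen < min_reduction:
        # Little or no reduction: emit what we have and stop aggregating.
        yield from partials.items()
        yield from rows
        return

    # Good reduction: keep aggregating the remaining rows.
    for key, value in rows:
        partials[key] = partials.get(key, 0) + value
    yield from partials.items()
```

The downstream (final) aggregation still has to combine rows with equal keys either way, so passing rows through only skips work that was not reducing the data.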

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147298428 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147298737 Merged build triggered.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147298690 retest this please.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147298751 Merged build started.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147298785 [Test build #43559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43559/consoleFull) for PR 9067 at commit

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147298429 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147298405 [Test build #43557 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43557/console) for PR 9067 at commit

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147324890 [Test build #43559 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43559/console) for PR 9067 at commit

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147325034 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147325035 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147231673 Merged build triggered.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147249270 Merged build started.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147249443 [Test build #43546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43546/consoleFull) for PR 9067 at commit

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147255496 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147255494 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147255481 [Test build #43546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43546/console) for PR 9067 at commit

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147292908 Merged build triggered.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147292912 Merged build started.

[GitHub] spark pull request: [SPARK-11055][SQL] Use mixing hash-based and s...

2015-10-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9067#issuecomment-147294294 [Test build #43557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43557/consoleFull) for PR 9067 at commit