[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16371 Thank you @cloud-fan @hvanhovell @rxin @marmbrus --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16371 thanks, merging to master!

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70810/ Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70810 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70810/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/16371 LGTM

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70810 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70810/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16371 retest this please.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70801/ Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-02 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16371 LGTM

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2017-01-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70801 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70801/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70756/ Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70756 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70756/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70756/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70733/ Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70733 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70733/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70733 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70733/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70713/ Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70713 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70713/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70713 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70713/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16371 retest this please.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70707/ Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70709/ Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70709 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70709/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70707 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70707/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-28 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16371 @cloud-fan @marmbrus ok. I will first address @hvanhovell's comments.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-28 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/16371 +1 I think we can move forward.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16371 I think we should support partial aggregation for collect. In the future we can define an interface to declare "should partial aggregate", so that a single collect function won't cause a 2-phase

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-28 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16371 @cloud-fan I already did this in the current PR. I am waiting for some consensus so I can push ahead and address the review comments from @hvanhovell.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-28 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16371 I think we should implement collect with `TypedImperativeAggregate` and remove the `supportsPartial` flag. @viirya do you have time to work on it?
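
For readers following the thread, here is a minimal sketch of what "partial aggregation" means for a collect-style function. It is plain Python, not Spark code, and it does not use `TypedImperativeAggregate`; the function names (`update`, `merge`, `collect_list_two_phase`, `collect_list_single_phase`) are hypothetical and chosen only for illustration. The assumption is that each inner list stands in for the rows of one partition.

```python
from typing import List, Sequence


def update(buffer: List[int], value: int) -> List[int]:
    """Phase 1: fold one input row into a partition-local buffer."""
    buffer.append(value)
    return buffer


def merge(left: List[int], right: List[int]) -> List[int]:
    """Phase 2: combine two partial buffers into one."""
    left.extend(right)
    return left


def collect_list_two_phase(partitions: Sequence[Sequence[int]]) -> List[int]:
    """Build one partial buffer per partition, then merge the buffers."""
    partials = []
    for rows in partitions:
        buf: List[int] = []
        for row in rows:
            update(buf, row)
        partials.append(buf)

    final: List[int] = []
    for buf in partials:
        merge(final, buf)
    return final


def collect_list_single_phase(partitions: Sequence[Sequence[int]]) -> List[int]:
    """Send every row to one place and aggregate there (no partial step)."""
    return [row for rows in partitions for row in rows]


if __name__ == "__main__":
    data = [[1, 2], [3], []]
    print(collect_list_two_phase(data))
    print(collect_list_single_phase(data))
```

In this sketch both paths return the same list because the partial buffers are merged in partition order; a distributed engine may merge them in any order. The partial buffers for a collect are as large as the input rows themselves, which is one way to read the question raised elsewhere in the thread about whether two phases are worthwhile for `collect_list`.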

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16371 @hvanhovell Got it. Thanks for review.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16371 sounds good.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-22 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/16371 @viirya I left a few comments on the PR. Let's wait for some consensus before pushing ahead.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70504/ Test PASSed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70504 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70504/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70504/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16371 Just for reference: Hive's collect_list and collect_set seem to support partial aggregation. Of course, we don't necessarily have to follow Hive.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70467/ Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70467 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70467/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16371 @marmbrus I see that you filed the ticket. We can make this work with partial aggregation, but we shouldn't run them this way. Maybe it's time to define something as partial aggregateable vs should be

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16371 Why would we even want to support partial aggregation for collect_list?

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70467 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70467/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Merged build finished. Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70464/ Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16371 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70459/ Test FAILed.

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70464/testReport)** for PR 16371 at commit

[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16371 **[Test build #70459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70459/testReport)** for PR 16371 at commit