[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9380 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread JihongMA
Github user JihongMA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156249080 @mengxr sure, will take care mean via seperate PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156234586 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156247920 LGTM. Merged into master and branch-1.6. Thanks! Btw, there is a minor style issue I marked inline. @JihongMA Could you submit another PR to change the output of

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156234403 **[Test build #45755 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45755/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156234588 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156264807 No, `min` and `max` can be used on string and other types so should not return `NaN`. We should follow the SQL standard here. --- If your project is set up for it,

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156181656 **[Test build #45749 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45749/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156189548 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread yu-iskw
Github user yu-iskw commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156181157 @JihongMA thanks for the update! Could you revert `Skewness.scala` and `Kurtosis.scala`. Since I don't think the change relates to the issue. I know this is a minor

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156189442 **[Test build #45749 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45749/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156189544 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread JihongMA
Github user JihongMA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156171034 @AmplabJenkins please retest the change. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread yu-iskw
Github user yu-iskw commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156177206 Jenkins, test this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156178175 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156178213 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156198135 test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156198432 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156201232 **[Test build #45755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45755/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156198404 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155987720 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155987721 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155970380 These SparkR support has just been added so this change breaks tests ``` 1. Failure (at test_sparkSQL.R#1010): group by, agg functions --

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155956026 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155986721 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-156024999 @JihongMA yap that should fix them --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155956027 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155955969 **[Test build #45678 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45678/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155988221 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread JihongMA
Github user JihongMA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155988170 @felixcheung Thank you! this is the change I have made to make it pass for R. I am not familiar with R . df3 <- agg(gd, age = "stddev") expect_is(df3,

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155988223 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155927889 **[Test build #45678 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45678/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155927109 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155927086 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-10 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155630541 @JihongMA Could you merge the current master? There are some merge conflicts. For `NaN` vs. `null`, we had some discussion in

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-09 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155224857 Which part is boxing the result? I tested the following on master with changes from #9480: ~~~scala val df = sqlContext.range(1)

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-09 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155226869 The eval call is boxing which you aren't going to see without a groupby. --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-09 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-155228360 But eval only happens once per group. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154579413 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-06 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154593369 Even though its simple, I think this implementation is boxing the result, which could result in slower performance on real workloads (but is harder to see in micro

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-06 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154579243 **[Test build #45256 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45256/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-06 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154509839 @rxin We still need to run some benchmark. The formulation is simple and fixed, and hence codegen won't bring much performance gain. @JihongMA Could you address

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-06 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154546741 **[Test build #45256 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45256/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154544406 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154544379 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154163473 add to whitelist --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154166042 +1 on @yu-iskw 's suggestion. Let's keep the changes in this PR minimal. Just replace `StdDev` implementation by imperative agg. --- If your project is set up for it,

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread yu-iskw
Github user yu-iskw commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154164156 @JihongMA I don't know if there are any strong reasons in terms of catalyst. However, personally I think we should separate changing the return type and `null` from the

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154168859 **[Test build #45135 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45135/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154168774 BTW with https://github.com/apache/spark/pull/9480, we might not need to replace it with imperative aggregate anymore. --- If your project is set up for it, you can

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154165567 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154165662 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154163504 ok to test --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154209280 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154209027 **[Test build #45135 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45135/consoleFull)** for PR 9380 at commit

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-154209278 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-04 Thread JihongMA
Github user JihongMA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-153847977 @mengxr rebased with the changes @rxin [SPARK-11490], stddev / variance mapped to the corresponding sample stddev / variance. I checked Hive doesn't support this

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-04 Thread JihongMA
Github user JihongMA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-153775399 @mengxr Please take another look. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-03 Thread JihongMA
Github user JihongMA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-153453307 I propose to return null for all cases which currently Double.NaN is returned. and change getStatistics() to return Any instead of Double. --- If your project is set

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-03 Thread JihongMA
Github user JihongMA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-153451606 so for skewness and kurtosis in case of count =1, we want to return null instead of 0. I can address it, but instead of returning Double.NaN, should we return null

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-03 Thread yu-iskw
Github user yu-iskw commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-153462689 @JihongMA I'm not sure about that. I don't think we should return `null`, instead of `Double.NaN`. Why do we need to change the return type? --- If your project is set

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-03 Thread JihongMA
Github user JihongMA commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-153475180 getStatistics() will continue to return Double value for normal cases, changing it to return null only for edge cases. is there a strong reason to return Double.NaN?

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-03 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/9380#discussion_r43772715 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala --- @@ -1135,7 +992,76 @@ abstract class

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-03 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-153412815 @JihongMA Could you address @sethah 's comment and rebase master? There are some merge conflicts. --- If your project is set up for it, you can reply to this email and

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-11-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/9380#discussion_r43662570 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala --- @@ -1135,7 +992,76 @@ abstract class

[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

2015-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9380#issuecomment-152594052 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your