GitHub user rxin opened a pull request:
https://github.com/apache/spark/pull/9446
[SPARK-11489][SQL] Only include common first order statistics in GroupedData
We added a number of higher-order statistics, such as skewness and kurtosis,
to GroupedData. I don't think they are common enough to justify dedicated
methods there, since users can always call the regular statistical aggregate
functions through agg.
That is to say, after this change, we won't support
df.groupBy("key").kurtosis("colA", "colB")
However, we will still support
df.groupBy("key").agg(kurtosis("colA"), kurtosis("colB"))
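For illustration, here is a minimal sketch of the form that remains supported, written as a small standalone program. The object name, helper name, sample data, and column aliases are hypothetical, and it assumes a Spark version that provides SparkSession (2.0 or later), which is newer than the code base this pull request targets.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.kurtosis

// Hypothetical example object; not part of the pull request.
object GroupedKurtosisExample {

  // Computes per-key kurtosis with the generic agg() entry point rather
  // than a dedicated GroupedData method. The aliases are illustrative.
  def kurtosisByKey(df: DataFrame): DataFrame =
    df.groupBy("key").agg(
      kurtosis("colA").as("kurtosis_colA"),
      kurtosis("colB").as("kurtosis_colB"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local SparkSession, available in Spark 2.0+.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("grouped-kurtosis")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with the column names used in the description.
    val df = Seq(
      ("a", 1.0, 2.0), ("a", 2.0, 4.0), ("a", 4.0, 3.0),
      ("b", 1.0, 1.0), ("b", 3.0, 5.0), ("b", 9.0, 2.0)
    ).toDF("key", "colA", "colB")

    kurtosisByKey(df).show()
    spark.stop()
  }
}
```

The same pattern should apply to the other aggregate functions in org.apache.spark.sql.functions, for example skewness.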
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rxin/spark SPARK-11489
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9446.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9446
----
commit 856942f9d0677d381dfe18d8777fa6f0a2e858c8
Author: Reynold Xin <[email protected]>
Date: 2015-11-03T21:53:18Z
[SPARK-11489][SQL] Only include common first order statistics in GroupedData
----