[GitHub] spark pull request: [SPARK-11489][SQL] Only include common first o...

rxin Tue, 03 Nov 2015 13:54:19 -0800

GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/9446


    [SPARK-11489][SQL] Only include common first order statistics in GroupedData

    We added a number of higher-order statistics, such as skewness and kurtosis,
    to GroupedData. I don't think they are common enough to justify being listed
    there, since users can always use the normal statistics aggregate functions.
    
    That is, after this change we will no longer support
    
    df.groupBy("key").kurtosis("colA", "colB")
    
    However, we will still support
    
    df.groupBy("key").agg(kurtosis("colA"), kurtosis("colB"))


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark SPARK-11489

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9446.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9446
    
----
commit 856942f9d0677d381dfe18d8777fa6f0a2e858c8
Author: Reynold Xin <[email protected]>
Date:   2015-11-03T21:53:18Z

    [SPARK-11489][SQL] Only include common first order statistics in GroupedData

----

