[GitHub] spark pull request: [SPARK-8992] [SQL] Add pivot to dataframe api

aray Fri, 31 Jul 2015 13:58:02 -0700

GitHub user aray opened a pull request:

    https://github.com/apache/spark/pull/7841


    [SPARK-8992] [SQL] Add pivot to dataframe api

    This adds a pivot method to the DataFrame API.
    
    Following the lead of cube and rollup, this adds a Pivot operator that is translated into an Aggregate by the analyzer.
    
    The current syntax is:
    
        courseSales.pivot(Seq($"year"), $"course", Seq("dotNET", "Java"), sum($"earnings"))
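
To make the intended semantics concrete, here is a minimal sketch in plain Scala collections (no Spark) of what the call above would compute. It assumes the pivot is rewritten into a group-by on year with one conditional sum per pivot value. The `Sale` case class, the `pivotSum` helper, and the sample rows are hypothetical and are not part of this patch.

```scala
// Hypothetical row type standing in for the courseSales columns.
case class Sale(year: Int, course: String, earnings: Double)

object PivotSketch {
  // Group rows by year, then compute one conditional sum per pivot value.
  // A pivot value with no matching rows in a group yields None,
  // which plays the role of SQL NULL here.
  def pivotSum(rows: Seq[Sale], values: Seq[String]): Map[Int, Seq[Option[Double]]] =
    rows.groupBy(_.year).map { case (year, group) =>
      year -> values.map { v =>
        val matching = group.filter(_.course == v).map(_.earnings)
        if (matching.isEmpty) None else Some(matching.sum)
      }
    }

  def main(args: Array[String]): Unit = {
    // Illustrative data only.
    val courseSales = Seq(
      Sale(2012, "dotNET", 10000.0),
      Sale(2012, "dotNET", 5000.0),
      Sale(2012, "Java", 20000.0),
      Sale(2013, "dotNET", 48000.0)
    )
    val result = pivotSum(courseSales, Seq("dotNET", "Java"))
    // One output row per year, one column per pivot value.
    result.toSeq.sortBy(_._1).foreach { case (year, cols) =>
      println(s"$year: ${cols.mkString(", ")}")
    }
  }
}
```

Each year becomes one output row and each listed course becomes one column. Courses not named in the value list are ignored.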
    
    Would we be interested in the following syntax as well, or instead?
    
        courseSales.groupBy($"year").pivot($"course", "dotNET", "Java").agg(sum($"earnings"))
    
    Later we can add it to `SQLParser`, but as Hive doesn't support it we can't add it there, right?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aray/spark sql-pivot

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/7841.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #7841
    
----
commit 599e9e0b9bd46be798da1274e9ba9839151b2aaf
Author: Andrew Ray <[email protected]>
Date:   2015-07-29T21:05:21Z

    Add pivot to dataframe api

----


