[ 
https://issues.apache.org/jira/browse/SPARK-43295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reassigned SPARK-43295:
-------------------------------------

    Assignee: Haejoon Lee

> Make DataFrameGroupBy.sum support for string type columns
> ---------------------------------------------------------
>
>                 Key: SPARK-43295
>                 URL: https://issues.apache.org/jira/browse/SPARK-43295
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Pandas API on Spark
>    Affects Versions: 4.0.0
>            Reporter: Haejoon Lee
>            Assignee: Haejoon Lee
>            Priority: Major
>              Labels: pull-request-available
>
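> Since pandas 2.0.0, DataFrameGroupBy.sum also works on string type columns, concatenating the strings within each group. Pandas API on Spark still drops such columns from the result, as the comparison below shows: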
> {code:python}
> >>> psdf
>    A    B  C      D
> 0  1  3.1  a   True
> 1  2  4.1  b  False
> 2  1  4.1  b  False
> 3  2  3.1  a   True
> >>> psdf.groupby("A").sum().sort_index()
>      B  D
> A
> 1  7.2  1
> 2  7.2  1
> >>> psdf.to_pandas().groupby("A").sum().sort_index()
>      B   C  D
> A
> 1  7.2  ab  1
> 2  7.2  ba  1
> {code}
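
For reference, the pandas side of the comparison can be reproduced without Spark. The following is a minimal sketch, assuming pandas 2.0.0 or later is installed; the variable names `pdf`, `concatenated`, and `summed` are illustrative and do not come from the issue.

```python
import pandas as pd

# Same data as the psdf shown in the issue description.
pdf = pd.DataFrame(
    {
        "A": [1, 2, 1, 2],
        "B": [3.1, 4.1, 4.1, 3.1],
        "C": ["a", "b", "b", "a"],
        "D": [True, False, False, True],
    }
)

# Summing a string column per group concatenates the strings in row order.
concatenated = pdf.groupby("A")["C"].sum().sort_index()
print(concatenated)

# With pandas 2.0.0 or later, the default call is expected to keep
# column C alongside B and D, as in the issue's pandas output.
summed = pdf.groupby("A").sum().sort_index()
print(summed)
```

The behaviour requested in this issue is for `psdf.groupby("A").sum()` in pandas API on Spark to return the same columns as `summed` above.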



