(spark) branch master updated: [SPARK-55490][PS][FOLLOW-UP] Fix `groupby(as_index=False).agg` with dict

dongjoon Wed, 18 Feb 2026 11:47:38 -0800

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 6adabbd76bad [SPARK-55490][PS][FOLLOW-UP] Fix `groupby(as_index=False).agg` with dict
6adabbd76bad is described below

commit 6adabbd76bad7a3155398ff4de6085d34dc9e693
Author: Takuya Ueshin <[email protected]>
AuthorDate: Wed Feb 18 11:46:21 2026 -0800

    [SPARK-55490][PS][FOLLOW-UP] Fix `groupby(as_index=False).agg` with dict
    
    ### What changes were proposed in this pull request?
    
    This is a follow-up of apache/spark#54276.
    
    Fixes `groupby(as_index=False).agg` with dict.
    
    ### Why are the changes needed?
    
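    The case of `groupby(as_index=False).agg` with a dict was not handled in apache/spark#54276. When the group key is a derived Series such as `psdf.A + 1`, pandas-on-Spark drops the key column from the result: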
    
    ```py
    >>> psdf = ps.DataFrame(
    ...     {"A": [1, 1, 2, 2], "B": [1, 2, 3, 4], "C": [0.362, 0.227, 1.267, -0.562]}
    ... )
    >>> psdf.groupby(psdf.A, as_index=False).agg({"B": "min", "C": "sum"})
       A  B      C
    0  1  1  0.589
    1  2  3  0.705
    >>>
    >>> psdf.groupby(psdf.A + 1, as_index=False).agg({"B": "min", "C": "sum"})
       B      C
    0  1  0.589
    1  3  0.705
    ```
    
    whereas pandas 3 keeps the group key column:
    
    ```py
    >>> pdf = pd.DataFrame(
    ...     {"A": [1, 1, 2, 2], "B": [1, 2, 3, 4], "C": [0.362, 0.227, 1.267, -0.562]}
    ... )
    >>> pdf.groupby(pdf.A, as_index=False).agg({"B": "min", "C": "sum"})
       A  B      C
    0  1  1  0.589
    1  2  3  0.705
    >>>
    >>> pdf.groupby(pdf.A + 1, as_index=False).agg({"B": "min", "C": "sum"})
       A  B      C
    0  2  1  0.589
    1  3  3  0.705
    ```
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, it will behave more like pandas 3.
    
    ### How was this patch tested?
    
    The existing tests should pass.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Codex (GPT-5.3-Codex)
    
    Closes #54352 from ueshin/issuse/SPARK-55490/as_index_dict.
    
    Authored-by: Takuya Ueshin <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 python/pyspark/pandas/groupby.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/pyspark/pandas/groupby.py b/python/pyspark/pandas/groupby.py
index 37993e5f2499..f9e8123555ad 100644
--- a/python/pyspark/pandas/groupby.py
+++ b/python/pyspark/pandas/groupby.py
@@ -329,7 +329,7 @@ class GroupBy(Generic[FrameLike], metaclass=ABCMeta):
                     i for i, gkey in enumerate(self._groupkeys) if gkey._psdf is not self._psdf
                 )
             else:
-                column_names = [column.name for column in self._agg_columns]
+                column_names = set(func_or_funcs)
                 should_drop_index = set(
                     i for i, gkey in enumerate(self._groupkeys) if gkey.name in column_names
                 )
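
The one-line change above narrows the set of names that a group key is compared against to the keys of the dict passed to `agg`. The following is a minimal sketch of that rule in plain Python, not the actual pandas-on-Spark code: `keys_to_drop` is a hypothetical helper, and the old name list `["A", "B", "C"]` is an assumption about what the aggregation columns contain when the key is a derived Series such as `psdf.A + 1`.

```py
# Hypothetical stand-in for the logic touched by the diff; this is not the
# real pyspark.pandas implementation.
def keys_to_drop(groupkey_names, column_names):
    """Return the positions of group keys whose name appears in column_names."""
    return {i for i, name in enumerate(groupkey_names) if name in column_names}


func_or_funcs = {"B": "min", "C": "sum"}

# Assumption: with a derived key, the original column "A" is still among the
# aggregation columns, so the old name list contains it.
old_column_names = ["A", "B", "C"]

# After the change, only the dict keys are considered.
new_column_names = set(func_or_funcs)

dropped_before = keys_to_drop(["A"], old_column_names)  # key "A" collides and is dropped
dropped_after = keys_to_drop(["A"], new_column_names)   # key "A" no longer collides
```

Under these assumptions, the key named `A` is dropped only when `A` itself is one of the columns being aggregated by the dict, which matches the pandas 3 output shown in the commit message.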


