[spark] branch master updated: [SPARK-41747][SPARK-41744][SPARK-41748][SPARK-41749][CONNECT][TESTS] Reenable tests for multiple arguments in max, min, sum and avg in groupby

gurwls223 Wed, 28 Dec 2022 18:03:43 -0800

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 93fc677d282 [SPARK-41747][SPARK-41744][SPARK-41748][SPARK-41749][CONNECT][TESTS] Reenable tests for multiple arguments in max, min, sum and avg in groupby
93fc677d282 is described below

commit 93fc677d28247df22e93238c5c841d6f7007ecf3
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Thu Dec 29 11:03:18 2022 +0900

    [SPARK-41747][SPARK-41744][SPARK-41748][SPARK-41749][CONNECT][TESTS] Reenable tests for multiple arguments in max, min, sum and avg in groupby
    
    ### What changes were proposed in this pull request?
    
    This PR is technically a followup of https://github.com/apache/spark/pull/39254 that reenables the related doctests.
    
    ### Why are the changes needed?
    
    To reenable skipped tests.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No, test-only.
    
    ### How was this patch tested?
    
    Enabling skipped tests. Manually ran this and checked it locally.
    
    Closes #39271 from HyukjinKwon/SPARK-41747.
    
    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/pyspark/sql/group.py | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/python/pyspark/sql/group.py b/python/pyspark/sql/group.py
index 0aa8edcba7b..ac661e39741 100644
--- a/python/pyspark/sql/group.py
+++ b/python/pyspark/sql/group.py
@@ -230,7 +230,6 @@ class GroupedData(PandasGroupedOpsMixin):
         """
 
     # TODO(SPARK-41743): groupBy(...).agg(...).sort does not actually sort the output
-    # TODO(SPARK-41747): Support multiple arguments in groupBy.avg(...)
     @df_varargs_api
     def avg(self, *cols: str) -> DataFrame:
         """Computes average values for each numeric columns for each group.
@@ -274,7 +273,7 @@ class GroupedData(PandasGroupedOpsMixin):
 
         Calculate the mean of the age and height in all data.
 
-        >>> df.groupBy().avg('age', 'height').show()  # doctest: +SKIP
+        >>> df.groupBy().avg('age', 'height').show()
         +--------+-----------+
         |avg(age)|avg(height)|
         +--------+-----------+
@@ -283,7 +282,6 @@ class GroupedData(PandasGroupedOpsMixin):
         """
 
     # TODO(SPARK-41743): groupBy(...).agg(...).sort does not actually sort the output
-    # TODO(SPARK-41744): Support multiple arguments in groupBy.max(...)
     @df_varargs_api
     def max(self, *cols: str) -> DataFrame:
         """Computes the max value for each numeric columns for each group.
@@ -320,7 +318,7 @@ class GroupedData(PandasGroupedOpsMixin):
 
         Calculate the max of the age and height in all data.
 
-        >>> df.groupBy().max("age", "height").show()  # doctest: +SKIP
+        >>> df.groupBy().max("age", "height").show()
         +--------+-----------+
         |max(age)|max(height)|
         +--------+-----------+
@@ -329,7 +327,6 @@ class GroupedData(PandasGroupedOpsMixin):
         """
 
     # TODO(SPARK-41743): groupBy(...).agg(...).sort does not actually sort the output
-    # TODO(SPARK-41748): Support multiple arguments in groupBy.min(...)
     @df_varargs_api
     def min(self, *cols: str) -> DataFrame:
         """Computes the min value for each numeric column for each group.
@@ -371,7 +368,7 @@ class GroupedData(PandasGroupedOpsMixin):
 
         Calculate the min of the age and height in all data.
 
-        >>> df.groupBy().min("age", "height").show()  # doctest: +SKIP
+        >>> df.groupBy().min("age", "height").show()
         +--------+-----------+
         |min(age)|min(height)|
         +--------+-----------+
@@ -380,7 +377,6 @@ class GroupedData(PandasGroupedOpsMixin):
         """
 
     # TODO(SPARK-41743): groupBy(...).agg(...).sort does not actually sort the output
-    # TODO(SPARK-41749): Support multiple arguments in groupBy.sum(...)
     @df_varargs_api
     def sum(self, *cols: str) -> DataFrame:
         """Computes the sum for each numeric columns for each group.
@@ -422,7 +418,7 @@ class GroupedData(PandasGroupedOpsMixin):
 
         Calculate the sum of the age and height in all data.
 
-        >>> df.groupBy().sum("age", "height").show()  # doctest: +SKIP
+        >>> df.groupBy().sum("age", "height").show()
         +--------+-----------+
         |sum(age)|sum(height)|
         +--------+-----------+
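
The diff above only removes `# doctest: +SKIP` directives, so the effect is that the doctest runner starts executing those examples again. A minimal sketch of that mechanism, using only the standard-library `doctest` module and no Spark at all; the functions `skipped_example`, `enabled_example`, and `run_doctests` are hypothetical names made up for this illustration and are not part of PySpark:

```python
import doctest


def skipped_example():
    """Example carrying the SKIP directive.

    The expected output below is deliberately wrong. It does not fail,
    because the runner never executes a skipped example.

    >>> 1 + 1  # doctest: +SKIP
    3
    """


def enabled_example():
    """Same kind of example with the directive removed, so it is executed.

    >>> 1 + 1
    2
    """


def run_doctests(func):
    """Run the doctests in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Arguments: docstring, globals, test name, filename, line number.
    test = parser.get_doctest(func.__doc__, {}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)
```

Calling `run_doctests(skipped_example)` reports zero attempted examples, while `run_doctests(enabled_example)` reports one attempted example and no failures. This is why a skipped doctest gives no signal about whether the documented call actually works.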


