This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 0d0918b1e96 [SPARK-41383][CONNECT][PYTHON][TESTS][FOLLOWUP] Add tests for `grouping` and `grouping_id`
0d0918b1e96 is described below

commit 0d0918b1e968789657d9ae6e49d57fdb7c529434
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Sat Dec 31 14:06:39 2022 +0800

    [SPARK-41383][CONNECT][PYTHON][TESTS][FOLLOWUP] Add tests for `grouping` and `grouping_id`
    
    ### What changes were proposed in this pull request?
    Add tests for `grouping` and `grouping_id`
    
    ### Why are the changes needed?
    For test coverage
    
    ### Does this PR introduce _any_ user-facing change?
    no, test-only
    
    ### How was this patch tested?
    added tests
    
    Closes #39321 from zhengruifeng/connect_function_grouping_test.
    
    Authored-by: Ruifeng Zheng <[email protected]>
    Signed-off-by: Ruifeng Zheng <[email protected]>
---
 python/pyspark/sql/tests/connect/test_connect_function.py | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/python/pyspark/sql/tests/connect/test_connect_function.py b/python/pyspark/sql/tests/connect/test_connect_function.py
index 38f7492e019..96ecf8208b3 100644
--- a/python/pyspark/sql/tests/connect/test_connect_function.py
+++ b/python/pyspark/sql/tests/connect/test_connect_function.py
@@ -521,7 +521,6 @@ class SparkConnectFunctionTests(SparkConnectFuncTestCase):
         cdf = self.connect.sql(query)
         sdf = self.spark.sql(query)
 
-        # TODO(SPARK-41383): add tests for grouping, grouping_id after DataFrame.cube is supported.
         for cfunc, sfunc in [
             (CF.approx_count_distinct, SF.approx_count_distinct),
             (CF.approxCountDistinct, SF.approxCountDistinct),
@@ -574,6 +573,18 @@ class SparkConnectFunctionTests(SparkConnectFuncTestCase):
                 sdf.groupBy("a").agg(sfunc(sdf.b, "c")).toPandas(),
             )
 
+        # test grouping
+        self.assert_eq(
+            cdf.cube("a").agg(CF.grouping("a"), CF.sum("c")).orderBy("a").toPandas(),
+            sdf.cube("a").agg(SF.grouping("a"), SF.sum("c")).orderBy("a").toPandas(),
+        )
+
+        # test grouping_id
+        self.assert_eq(
+            cdf.cube("a").agg(CF.grouping_id(), CF.sum("c")).orderBy("a").toPandas(),
+            sdf.cube("a").agg(SF.grouping_id(), SF.sum("c")).orderBy("a").toPandas(),
+        )
+
         # test percentile_approx
         self.assert_eq(
             cdf.select(CF.percentile_approx(cdf.b, 0.5, 1000)).toPandas(),
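
The new tests compare `grouping` and `grouping_id` over `cube("a")` between Spark Connect and classic PySpark. As a rough illustration of what those two functions report, here is a plain-Python sketch that does not use Spark at all. The helper name `cube_sum` and the sample data are hypothetical, and the sketch assumes the usual SQL semantics: `grouping(col)` is 1 when `col` has been aggregated away in a row and 0 otherwise, and `grouping_id()` packs those bits with the first grouping column as the most significant bit.

```python
from collections import defaultdict
from itertools import combinations


def cube_sum(rows, keys, value):
    """Emulate GROUP BY CUBE(keys) with SUM(value) over a list of dicts.

    Hypothetical helper for illustration only; not part of PySpark.
    Returns tuples of (group_values, grouping_bits, grouping_id, total).
    """
    results = []
    # CUBE aggregates over every subset of the grouping columns.
    for size in range(len(keys), -1, -1):
        for kept in combinations(keys, size):
            # grouping(col) is 0 when col is part of the grouping set, else 1.
            bits = tuple(0 if k in kept else 1 for k in keys)
            # grouping_id() packs the bits, first column most significant.
            gid = 0
            for bit in bits:
                gid = (gid << 1) | bit
            totals = defaultdict(int)
            for row in rows:
                # Columns that are aggregated away show up as None (NULL).
                group = tuple(row[k] if k in kept else None for k in keys)
                totals[group] += row[value]
            for group, total in totals.items():
                results.append((group, bits, gid, total))
    return results


if __name__ == "__main__":
    data = [{"a": 1, "c": 10}, {"a": 1, "c": 5}, {"a": 2, "c": 7}]
    for line in cube_sum(data, ["a"], "c"):
        print(line)
```

With a single grouping column, as in the tests above, the grand-total row is the only one whose key is `None` with a grouping bit of 1, which is how a subtotal row can be told apart from a row whose key is genuinely NULL in the data.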


