This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 0d0918b1e96 [SPARK-41383][CONNECT][PYTHON][TESTS][FOLLOWUP] Add tests for `grouping` and `grouping_id`
0d0918b1e96 is described below
commit 0d0918b1e968789657d9ae6e49d57fdb7c529434
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Sat Dec 31 14:06:39 2022 +0800
[SPARK-41383][CONNECT][PYTHON][TESTS][FOLLOWUP] Add tests for `grouping` and `grouping_id`
### What changes were proposed in this pull request?
Add tests for `grouping` and `grouping_id`
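For readers unfamiliar with the two functions under test, here is a minimal pure-Python sketch of what `grouping` and `grouping_id` report for a cube. It is an illustrative emulation, not Spark's implementation: the helper `cube` and the sample rows are hypothetical, and only the semantics are taken from the documented behavior (`grouping` is 1 when a column is rolled up, and `grouping_id` packs those flags into bits with the first column most significant).

```python
from itertools import combinations


def cube(rows, keys, value):
    """Hypothetical helper: emulate cube(*keys) with grouping flags and a sum."""
    n = len(keys)
    totals = {}
    # A cube aggregates over every subset of the grouping columns.
    for r in range(n + 1):
        for kept in combinations(keys, r):
            for row in rows:
                # Columns outside the subset are rolled up and reported as None.
                group = tuple(row[k] if k in kept else None for k in keys)
                # grouping(k) is 1 when k is rolled up, 0 otherwise.
                flags = tuple(0 if k in kept else 1 for k in keys)
                # grouping_id() packs the flags into bits, first column highest.
                gid = sum(flag << (n - 1 - i) for i, flag in enumerate(flags))
                key = (group, flags, gid)
                totals[key] = totals.get(key, 0) + row[value]
    return [
        {"group": g, "grouping": f, "grouping_id": gid, "sum": total}
        for (g, f, gid), total in totals.items()
    ]


# Sample data (made up for illustration), cubed on "a" and summed over "c".
rows = [{"a": 1, "c": 10}, {"a": 1, "c": 5}, {"a": 2, "c": 7}]
result = cube(rows, ["a"], "c")
```

With a single cube column, the grand-total row is the only one where the flag is set, which is why the tests in this commit only need `cube("a")` to exercise both functions.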
### Why are the changes needed?
To improve test coverage of the Spark Connect function API.
### Does this PR introduce _any_ user-facing change?
no, test-only
### How was this patch tested?
Added tests to `test_connect_function.py`.
Closes #39321 from zhengruifeng/connect_function_grouping_test.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
python/pyspark/sql/tests/connect/test_connect_function.py | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/python/pyspark/sql/tests/connect/test_connect_function.py b/python/pyspark/sql/tests/connect/test_connect_function.py
index 38f7492e019..96ecf8208b3 100644
--- a/python/pyspark/sql/tests/connect/test_connect_function.py
+++ b/python/pyspark/sql/tests/connect/test_connect_function.py
@@ -521,7 +521,6 @@ class SparkConnectFunctionTests(SparkConnectFuncTestCase):
cdf = self.connect.sql(query)
sdf = self.spark.sql(query)
- # TODO(SPARK-41383): add tests for grouping, grouping_id after DataFrame.cube is supported.
for cfunc, sfunc in [
(CF.approx_count_distinct, SF.approx_count_distinct),
(CF.approxCountDistinct, SF.approxCountDistinct),
@@ -574,6 +573,18 @@ class SparkConnectFunctionTests(SparkConnectFuncTestCase):
sdf.groupBy("a").agg(sfunc(sdf.b, "c")).toPandas(),
)
+ # test grouping
+ self.assert_eq(
+ cdf.cube("a").agg(CF.grouping("a"), CF.sum("c")).orderBy("a").toPandas(),
+ sdf.cube("a").agg(SF.grouping("a"), SF.sum("c")).orderBy("a").toPandas(),
+ )
+
+ # test grouping_id
+ self.assert_eq(
+ cdf.cube("a").agg(CF.grouping_id(), CF.sum("c")).orderBy("a").toPandas(),
+ sdf.cube("a").agg(SF.grouping_id(), SF.sum("c")).orderBy("a").toPandas(),
+ )
+
# test percentile_approx
self.assert_eq(
cdf.select(CF.percentile_approx(cdf.b, 0.5, 1000)).toPandas(),