This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new f6e4a466705 [SPARK-46063][PYTHON][CONNECT] Improve error messages related to argument types in cube, rollup, groupBy, and pivot
f6e4a466705 is described below
commit f6e4a4667057e226a06b4d1b063a62b698ffb25f
Author: Hyukjin Kwon <[email protected]>
AuthorDate: Thu Nov 23 15:33:15 2023 +0800
[SPARK-46063][PYTHON][CONNECT] Improve error messages related to argument types in cube, rollup, groupBy, and pivot
### What changes were proposed in this pull request?
This PR improves error messages related to argument types in `cube`,
`rollup`, `groupBy`, and `pivot`.
```bash
./bin/pyspark --remote local
```
```python
>>> help(spark.range(1).cube)
Help on method cube in module pyspark.sql.connect.dataframe:
cube(*cols: 'ColumnOrName') -> 'GroupedData' method of
pyspark.sql.connect.dataframe.DataFrame instance
Create a multi-dimensional cube for the current :class:`DataFrame` using
the specified columns, allowing aggregations to be performed on them.
...
```
**Before:**
```python
>>> spark.range(1).cube(1.2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/.../python/pyspark/sql/connect/dataframe.py", line 544, in cube
raise PySparkTypeError(
pyspark.errors.exceptions.base.PySparkTypeError: [NOT_COLUMN_OR_STR]
Argument `cube` should be a Column or str, got float.
```
**After:**
```python
>>> spark.range(1).cube(1.2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/.../python/pyspark/sql/connect/dataframe.py", line 544, in cube
raise PySparkTypeError(
pyspark.errors.exceptions.base.PySparkTypeError: [NOT_COLUMN_OR_STR]
Argument `cols` should be a Column or str, got float.
```
### Why are the changes needed?
For better error messages to end users.
### Does this PR introduce _any_ user-facing change?
Yes, it fixes the user-facing error message.
### How was this patch tested?
Manually tested.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #43968 from HyukjinKwon/SPARK-46063.
Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
python/pyspark/sql/connect/dataframe.py | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/python/pyspark/sql/connect/dataframe.py b/python/pyspark/sql/connect/dataframe.py
index c713bb85c1e..c7b51205363 100644
--- a/python/pyspark/sql/connect/dataframe.py
+++ b/python/pyspark/sql/connect/dataframe.py
@@ -495,7 +495,7 @@ class DataFrame:
else:
raise PySparkTypeError(
error_class="NOT_COLUMN_OR_STR",
- message_parameters={"arg_name": "groupBy", "arg_type": type(c).__name__},
+ message_parameters={"arg_name": "cols", "arg_type": type(c).__name__},
)
return GroupedData(df=self, group_type="groupby", grouping_cols=_cols)
@@ -520,7 +520,7 @@ class DataFrame:
else:
raise PySparkTypeError(
error_class="NOT_COLUMN_OR_STR",
- message_parameters={"arg_name": "rollup", "arg_type": type(c).__name__},
+ message_parameters={"arg_name": "cols", "arg_type": type(c).__name__},
)
return GroupedData(df=self, group_type="rollup", grouping_cols=_cols)
@@ -543,7 +543,7 @@ class DataFrame:
else:
raise PySparkTypeError(
error_class="NOT_COLUMN_OR_STR",
- message_parameters={"arg_name": "cube", "arg_type": type(c).__name__},
+ message_parameters={"arg_name": "cols", "arg_type": type(c).__name__},
)
return GroupedData(df=self, group_type="cube", grouping_cols=_cols)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]