This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new e014248434ac [SPARK-46880][PYTHON][CONNECT][TESTS] Improve and test warning for Arrow-optimized Python UDF
e014248434ac is described below
commit e014248434ac241b9681aceff79f900f0c41dd28
Author: Xinrong Meng
AuthorDate: Fri Jan 26 15:43:32 2024 -0800
[SPARK-46880][PYTHON][CONNECT][TESTS] Improve and test warning for Arrow-optimized Python UDF
### What changes were proposed in this pull request?
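Make the warning that is raised when Arrow optimization cannot be enabled for a Python UDF state the reason (the function takes no arguments), and add a test for it.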
### Why are the changes needed?
To improve usability and test coverage.
### Does this PR introduce _any_ user-facing change?
Yes, but only the text of a user warning changes.
FROM
```
>>> udf(lambda: print("do"), useArrow=True)
UserWarning: Arrow optimization for Python UDFs cannot be enabled.
warnings.warn(
<function <lambda> at ..>
```
TO
```
>>> udf(lambda: print("do"), useArrow=True)
UserWarning: Arrow optimization for Python UDFs cannot be enabled for functions without arguments.
warnings.warn(
<function <lambda> at ..>
```
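The fallback behind this warning can be sketched without a Spark installation. The snippet below is a minimal stand-in, not the PySpark implementation: `choose_eval_type` is a hypothetical helper, the argument check via `inspect.getfullargspec` is an assumption about how a zero-argument function is detected, and the returned strings are placeholders for the corresponding `PythonEvalType` constants.

```python
import inspect
import warnings

# Message text taken from this commit.
NO_ARGS_MESSAGE = (
    "Arrow optimization for Python UDFs cannot be enabled for functions"
    " without arguments."
)


def choose_eval_type(func, use_arrow):
    """Hypothetical helper: pick an evaluation mode for a Python UDF.

    Returns placeholder strings rather than real PythonEvalType values.
    """
    if not use_arrow:
        return "SQL_BATCHED_UDF"
    try:
        # Assumption: a UDF counts as "without arguments" when it declares
        # no positional parameters.
        has_args = len(inspect.getfullargspec(func).args) > 0
    except TypeError:
        has_args = False
    if has_args:
        return "SQL_ARROW_BATCHED_UDF"
    # Fall back to the non-Arrow path and tell the user why.
    warnings.warn(NO_ARGS_MESSAGE, UserWarning)
    return "SQL_BATCHED_UDF"


# Record warnings so the example can inspect them instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fallback = choose_eval_type(lambda: print("do"), use_arrow=True)
    arrow = choose_eval_type(lambda x: x, use_arrow=True)
```

Only the zero-argument lambda triggers the warning; the one-argument lambda stays on the Arrow path and produces none.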
### How was this patch tested?
A new unit test, `test_warn_no_args`, in `python/pyspark/sql/tests/test_arrow_python_udf.py`.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #44905 from xinrong-meng/arr_udf_warn.
Authored-by: Xinrong Meng
Signed-off-by: Dongjoon Hyun
---
python/pyspark/sql/connect/udf.py | 3 ++-
python/pyspark/sql/tests/test_arrow_python_udf.py | 9 +++++++++
python/pyspark/sql/udf.py | 3 ++-
3 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/python/pyspark/sql/connect/udf.py b/python/pyspark/sql/connect/udf.py
index 5386398bdca8..1c42f4d74b7a 100644
--- a/python/pyspark/sql/connect/udf.py
+++ b/python/pyspark/sql/connect/udf.py
@@ -85,7 +85,8 @@ def _create_py_udf(
eval_type = PythonEvalType.SQL_ARROW_BATCHED_UDF
else:
warnings.warn(
- "Arrow optimization for Python UDFs cannot be enabled.",
+ "Arrow optimization for Python UDFs cannot be enabled for functions"
+ " without arguments.",
UserWarning,
)
diff --git a/python/pyspark/sql/tests/test_arrow_python_udf.py b/python/pyspark/sql/tests/test_arrow_python_udf.py
index c59326edc31a..114fdf602223 100644
--- a/python/pyspark/sql/tests/test_arrow_python_udf.py
+++ b/python/pyspark/sql/tests/test_arrow_python_udf.py
@@ -188,6 +188,15 @@ class PythonUDFArrowTestsMixin(BaseUDFTestsMixin):
},
)
+ def test_warn_no_args(self):
+ with self.assertWarns(UserWarning) as w:
+ udf(lambda: print("do"), useArrow=True)
+ self.assertEqual(
+ str(w.warning),
+ "Arrow optimization for Python UDFs cannot be enabled for functions"
+ " without arguments.",
+ )
+
class PythonUDFArrowTests(PythonUDFArrowTestsMixin, ReusedSQLTestCase):
@classmethod
diff --git a/python/pyspark/sql/udf.py b/python/pyspark/sql/udf.py
index ca38556431ad..0324bc678667 100644
--- a/python/pyspark/sql/udf.py
+++ b/python/pyspark/sql/udf.py
@@ -142,7 +142,8 @@ def _create_py_udf(
eval_type = PythonEvalType.SQL_ARROW_BATCHED_UDF
else:
warnings.warn(
- "Arrow optimization for Python UDFs cannot be enabled.",
+ "Arrow optimization for Python UDFs cannot be enabled for functions"
+ " without arguments.",
UserWarning,
)