This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.5 by this push:
new 07d3e8e5287 [SPARK-44380][PYTHON][FOLLOWUP] Set __doc__ for analyze static method when Arrow is enabled
07d3e8e5287 is described below
commit 07d3e8e52878ea9631d4757b67119a18fbdf0230
Author: Takuya UESHIN <[email protected]>
AuthorDate: Sat Jul 22 16:49:14 2023 +0900
[SPARK-44380][PYTHON][FOLLOWUP] Set __doc__ for analyze static method when Arrow is enabled
### What changes were proposed in this pull request?
This is a follow-up of apache/spark#41948.
Set `__doc__` for the `analyze` static method when Arrow is enabled.
### Why are the changes needed?
When Arrow is enabled, the `analyze` static method of the vectorized UDTF class has no
`__doc__`; it should carry the same docstring as the original method.
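The idea can be illustrated without Spark. The following is only a minimal sketch, not the code in `udtf.py`: `wrap_keeping_docs`, `Wrapper`, and `ExampleUDTF` are hypothetical names, and the wrapper simply forwards calls instead of vectorizing anything. It shows that a static method looked up on a class yields a plain function whose `__doc__` can be assigned.

```python
from typing import Type


def wrap_keeping_docs(cls: Type) -> Type:
    """Hypothetical stand-in for a wrapper factory such as `_vectorize_udtf`."""

    class Wrapper:
        def eval(self, *args):
            return cls().eval(*args)

    if hasattr(cls, "analyze"):
        # Forward `analyze` only when the original class defines it.
        def _analyze(*args, **kwargs):
            return cls.analyze(*args, **kwargs)

        Wrapper.analyze = staticmethod(_analyze)

    # Copy docstrings so the wrapper documents itself like the original.
    Wrapper.__doc__ = cls.__doc__
    for name in ("eval", "analyze"):
        if hasattr(Wrapper, name):
            # Looking up a static method on the class returns the plain
            # function, whose `__doc__` is writable.
            getattr(Wrapper, name).__doc__ = getattr(cls, name).__doc__
    return Wrapper


class ExampleUDTF:
    """A UDTF-like class for illustration."""

    @staticmethod
    def analyze(x):
        """Analyze the argument."""
        return x

    def eval(self, x):
        """Evaluate the input row."""
        return x + 1


wrapped = wrap_keeping_docs(ExampleUDTF)
```

With this sketch, `wrapped.analyze.__doc__` matches the docstring of `ExampleUDTF.analyze`, which mirrors the assertion this change adds to the tests.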
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Updated the related tests.
Closes #42111 from ueshin/issues/SPARK-44380/analyze_doc.
Authored-by: Takuya UESHIN <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit 784b942196bb08a7959222f549722c6db3a3588e)
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/pyspark/sql/tests/test_udtf.py | 8 +++++++-
python/pyspark/sql/udtf.py | 3 +++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/python/pyspark/sql/tests/test_udtf.py b/python/pyspark/sql/tests/test_udtf.py
index 2c76d2f7e15..0fdb1c9b8a1 100644
--- a/python/pyspark/sql/tests/test_udtf.py
+++ b/python/pyspark/sql/tests/test_udtf.py
@@ -728,6 +728,11 @@ class BaseUDTFTestsMixin:
"""Initialize the UDTF"""
...
+ @staticmethod
+ def analyze(x: AnalyzeArgument) -> AnalyzeResult:
+ """Analyze the argument."""
+ ...
+
def eval(self, x: int):
"""Evaluate the input row."""
yield x + 1,
@@ -736,9 +741,10 @@ class BaseUDTFTestsMixin:
"""Terminate the UDTF."""
...
- cls = udtf(TestUDTF, returnType="y: int").func
+ cls = udtf(TestUDTF).func
self.assertIn("A UDTF for test", cls.__doc__)
self.assertIn("Initialize the UDTF", cls.__init__.__doc__)
+ self.assertIn("Analyze the argument", cls.analyze.__doc__)
self.assertIn("Evaluate the input row", cls.eval.__doc__)
self.assertIn("Terminate the UDTF", cls.terminate.__doc__)
diff --git a/python/pyspark/sql/udtf.py b/python/pyspark/sql/udtf.py
index 3ab74193093..e930daa9f51 100644
--- a/python/pyspark/sql/udtf.py
+++ b/python/pyspark/sql/udtf.py
@@ -134,6 +134,9 @@ def _vectorize_udtf(cls: Type) -> Type:
if hasattr(cls, "terminate"):
getattr(vectorized_udtf, "terminate").__doc__ = getattr(cls, "terminate").__doc__
+ if hasattr(vectorized_udtf, "analyze"):
+ getattr(vectorized_udtf, "analyze").__doc__ = getattr(cls, "analyze").__doc__
+
return vectorized_udtf