(spark) branch master updated: [MINOR][DOCS] Fix an Arrow UDF example

ruifengz Wed, 13 Aug 2025 20:08:09 -0700

This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new eba5381e4ce5 [MINOR][DOCS] Fix an Arrow UDF example
eba5381e4ce5 is described below

commit eba5381e4ce50622c80beb6bd47b7208b17061d6
Author: Ruifeng Zheng <ruife...@apache.org>
AuthorDate: Thu Aug 14 11:07:43 2025 +0800

    [MINOR][DOCS] Fix an Arrow UDF example
    
    ### What changes were proposed in this pull request?
    Fix an Arrow UDF example
    
    ### Why are the changes needed?
    The example was not rendered properly.
    
    <img width="1120" height="386" alt="image" src="https://github.com/user-attachments/assets/f3c98a29-8f91-439b-accc-50b3d3198ee0" />
    
    ### Does this PR introduce _any_ user-facing change?
    No, this is a doc-only change.
    
    ### How was this patch tested?
    Manually checked.
    
    <img width="663" height="422" alt="image" src="https://github.com/user-attachments/assets/1fab8726-6a96-46b9-9af6-eb5c9e99d805" />
    
    ### Was this patch authored or co-authored using generative AI tooling?
    no
    
    Closes #52023 from zhengruifeng/fix_a_arrow_example.
    
    Authored-by: Ruifeng Zheng <ruife...@apache.org>
    Signed-off-by: Ruifeng Zheng <ruife...@apache.org>
---
 python/pyspark/sql/pandas/functions.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/python/pyspark/sql/pandas/functions.py b/python/pyspark/sql/pandas/functions.py
index 91a2df8c3524..1079e160386e 100644
--- a/python/pyspark/sql/pandas/functions.py
+++ b/python/pyspark/sql/pandas/functions.py
@@ -221,7 +221,7 @@ def arrow_udf(f=None, returnType=None, functionType=None):
 
         The function takes `pyarrow.Array` and returns a scalar value. The returned scalar
         can be a python primitive type, (e.g., int or float), a numpy data type (e.g.,
-        numpy.int64 or numpy.float64), or a pyarrow.Scalar instance which supports complex
+        numpy.int64 or numpy.float64), or a `pyarrow.Scalar` instance which supports complex
         return types.
         `Any` should ideally be a specific scalar type accordingly.
 
@@ -240,6 +240,7 @@ def arrow_udf(f=None, returnType=None, functionType=None):
         +---+-----------+
 
         The retun type can also be a complex type such as struct, list, or map.
+
         >>> @arrow_udf("struct<m1: double, m2: double>")
         ... def min_max_udf(v: pa.Array) -> pa.Scalar:
         ...     m1 = pa.compute.min(v)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
