[ https://issues.apache.org/jira/browse/SPARK-52821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-52821.
----------------------------------
    Fix Version/s: 4.1.0
       Resolution: Fixed

Issue resolved by pull request 51538
[https://github.com/apache/spark/pull/51538]

> Support int to DecimalType return type coercion in Pandas UDFs (useArrow=True)
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-52821
>                 URL: https://issues.apache.org/jira/browse/SPARK-52821
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 4.1.0
>            Reporter: Ben Hurdelhey
>            Assignee: Ben Hurdelhey
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>         Attachments: Screenshot 2025-07-16 at 11.49.31.png
>
>
> Problem: PySpark UDFs with useArrow=True do not support type coercion from
> int to DecimalType when the target precision of the DecimalType is too low.
> Example:
> {code:python}
> @udf(returnType=DecimalType(2, 1), useArrow=True)
> def test(x):
>     return 1
>
> spark.range(1, 2, 1, 1).select(test(col('id'))).display()  # expected: (Decimal) 1.0
> {code}
> throws
> {code:java}
> pyarrow.lib.ArrowInvalid: Precision is not great enough for the result. It should be at least 20
> {code}
>
> For a better overview of the current behavior, see this publicly available
> [notebook|https://www.databricks.com/wp-content/uploads/notebooks/python-udf-type-coercion.html];
> the proposed change is highlighted in the attached screenshot.
>
> Proposed solution: Add integer-to-decimal conversion for PySpark UDF return
> types. This is a net-new use case that previously threw an error, so the
> change is not a breaking one.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
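The coercion the issue asks for amounts to: take an integer result, scale it to the declared DecimalType scale, and verify the result still fits within the declared precision. The sketch below illustrates that behavior with Python's standard `decimal` module only; `coerce_int_to_decimal` is a hypothetical helper for illustration, not Spark's or PyArrow's actual implementation.

```python
from decimal import Decimal

def coerce_int_to_decimal(value: int, precision: int, scale: int) -> Decimal:
    """Hypothetical sketch of int -> DecimalType(precision, scale) coercion.

    Quantizes the integer to the target scale, then checks that the total
    number of significant digits fits within the declared precision.
    """
    # e.g. scale=1 -> quantize to Decimal('0.1'), so 1 becomes 1.0
    quantized = Decimal(value).quantize(Decimal(1).scaleb(-scale))
    # as_tuple().digits holds every digit of the coefficient, so its
    # length is the total digit count that must fit in `precision`
    if len(quantized.as_tuple().digits) > precision:
        raise ValueError(
            f"{value} does not fit DecimalType({precision}, {scale})"
        )
    return quantized

# The example from the issue: int 1 coerced to DecimalType(2, 1)
print(coerce_int_to_decimal(1, 2, 1))    # prints 1.0
```

A value that genuinely overflows the declared precision (e.g. `coerce_int_to_decimal(100, 2, 1)`) still raises, which matches the issue's framing: the fix enables coercion when the value fits, rather than silently truncating when it does not.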