[ 
https://issues.apache.org/jira/browse/SPARK-32018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172688#comment-17172688
 ] 

Sunitha Kambhampati commented on SPARK-32018:
---------------------------------------------

The important issue is that we should not return incorrect results. In general, it is not good practice to backport a change to a stable branch that causes more queries to return incorrect results.

Just to reiterate:
 # The current PR, which backports the UnsafeRow fix to the v2.4.x and v3.0.x lines, causes queries to return incorrect results. The change by itself has unsafe side effects and results in incorrect results being returned.
 # It does not matter whether whole-stage codegen is on or off, or whether ANSI mode is on or off: more queries will return incorrect results.
 # Incorrect results are a very serious problem, and it is not good for Spark users to run into them for common operations like sum.

{code:java}
scala> val decStr = "1" + "0" * 19
decStr: String = 10000000000000000000

scala> val d3 = spark.range(0, 1, 1, 1).union(spark.range(0, 11, 1, 1))
d3: org.apache.spark.sql.Dataset[Long] = [id: bigint]

scala> val d5 = d3.select(expr(s"cast('$decStr' as decimal(38, 18)) as d"), lit(1).as("key")).groupBy("key").agg(sum($"d").alias("sumd")).select($"sumd")
d5: org.apache.spark.sql.DataFrame = [sumd: decimal(38,18)]

scala> d5.show(false)   // <----- INCORRECT RESULTS: d3 has 12 rows, so the true sum is 12 * 10^19 = 1.2E20, which overflows decimal(38,18)
+---------------------------------------+
|sumd                                   |
+---------------------------------------+
|20000000000000000000.000000000000000000|
+---------------------------------------+
{code}
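To make the arithmetic concrete: decimal(38, 18) allows at most 38 - 18 = 20 integer digits, and d3 has 1 + 11 = 12 rows, so the true sum is 12 * 10^19 = 1.2E20, which needs 21 integer digits and therefore overflows. A minimal sketch (plain Scala, no Spark required) verifying that the displayed 2E19 cannot be the correct sum:

```scala
// Sketch: confirm that the true sum of the query above overflows
// decimal(38, 18), so the value shown by d5.show is incorrect rather
// than a legitimate result.
object DecimalOverflowCheck {
  def main(args: Array[String]): Unit = {
    val decStr = "1" + "0" * 19              // 10^19, as in the repro
    val rows = 1 + 11                        // spark.range(0, 1) union spark.range(0, 11)
    val trueSum = BigDecimal(decStr) * rows  // 1.2E20

    // Largest value representable by decimal(38, 18): 20 nines before the point.
    val maxDec38x18 = BigDecimal("9" * 20 + "." + "9" * 18)

    println(s"true sum:   $trueSum")
    println(s"overflows:  ${trueSum > maxDec38x18}")
  }
}
```

Since the true sum overflows, the correct behavior is to return null (ANSI mode off) or raise an error (ANSI mode on), never a silently wrong value like 2E19.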
 

> Fix UnsafeRow set overflowed decimal
> ------------------------------------
>
>                 Key: SPARK-32018
>                 URL: https://issues.apache.org/jira/browse/SPARK-32018
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.6, 3.0.0
>            Reporter: Allison Wang
>            Assignee: Wenchen Fan
>            Priority: Major
>             Fix For: 2.4.7, 3.0.1, 3.1.0
>
>
> There is a bug where writing an overflowed decimal into UnsafeRow succeeds, 
> but reading it back throws ArithmeticException. The exception is thrown when 
> {{getDecimal}} is called on an UnsafeRow whose stored decimal has a precision 
> greater than the requested precision. Setting the overflowed decimal to null 
> when writing into UnsafeRow should fix this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
