[ 
https://issues.apache.org/jira/browse/SPARK-35955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Feng updated SPARK-35955:
-------------------------------
    Description: 
Fix decimal overflow issues for decimal average in ANSI mode. Linked to 
SPARK-32018 and SPARK-28067, which address decimal sum.

Repro:

 
{code:java}
import org.apache.spark.sql.functions._
// Outside spark-shell, also add: import spark.implicits._ (needed for toDF)

// With ANSI mode enabled, arithmetic overflow is expected to raise an error.
spark.conf.set("spark.sql.ansi.enabled", true)

// 12 rows of 10^19: two with intNum = 1 and ten with intNum = 2.
val df = Seq(
 (BigDecimal("10000000000000000000"), 1),
 (BigDecimal("10000000000000000000"), 1),
 (BigDecimal("10000000000000000000"), 2),
 (BigDecimal("10000000000000000000"), 2),
 (BigDecimal("10000000000000000000"), 2),
 (BigDecimal("10000000000000000000"), 2),
 (BigDecimal("10000000000000000000"), 2),
 (BigDecimal("10000000000000000000"), 2),
 (BigDecimal("10000000000000000000"), 2),
 (BigDecimal("10000000000000000000"), 2),
 (BigDecimal("10000000000000000000"), 2),
 (BigDecimal("10000000000000000000"), 2)).toDF("decNum", "intNum")

// The self-join on intNum multiplies the row count before averaging.
val df2 = df.withColumnRenamed("decNum", "decNum2").join(df, "intNum").agg(mean("decNum"))
df2.show(40, false)
{code}
 

With ANSI mode enabled, this should throw an exception because the intermediate sum overflows, but instead it returns:

 
{code:java}
+-----------+
|avg(decNum)|
+-----------+
|null       |
+-----------+{code}
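
The overflow in the repro can be checked by hand. The following is a minimal sketch, not Spark code: it assumes that Scala BigDecimal columns are inferred as DecimalType(38, 18) and that the intermediate sum for the average is held at the same capped precision and scale. The helper integerDigits and the constant names are illustrative and are not Spark APIs.

```scala
import java.math.{BigDecimal => JBigDecimal}

// Assumption: the column type and the avg sum buffer are both DecimalType(38, 18).
val precision = 38
val scale = 18
val maxIntegerDigits = precision - scale // digits allowed left of the decimal point

// Rows after the self-join on intNum: 2 * 2 (intNum = 1) plus 10 * 10 (intNum = 2).
val joinedRows = 2 * 2 + 10 * 10

// Every decNum value is 10^19, so the sum is joinedRows * 10^19.
val value = new JBigDecimal("10000000000000000000")
val sum = value.multiply(JBigDecimal.valueOf(joinedRows.toLong))

// Number of digits to the left of the decimal point (illustrative helper).
def integerDigits(d: JBigDecimal): Int = d.precision() - d.scale()

println(s"rows = $joinedRows, sum = ${sum.toPlainString}")
println(s"integer digits: value = ${integerDigits(value)}, sum = ${integerDigits(sum)}, max = $maxIntegerDigits")
```

Under these assumptions each input value fits exactly (20 integer digits), while the sum over the 104 joined rows needs 22, which is more than the 20 available.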
 

  was:Return null on overflow for decimal average. Linked to SPARK-32018 and 
SPARK-28067, which address decimal sum.


> Fix decimal overflow issues for Average
> ---------------------------------------
>
>                 Key: SPARK-35955
>                 URL: https://issues.apache.org/jira/browse/SPARK-35955
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Karen Feng
>            Priority: Major
>


