[ https://issues.apache.org/jira/browse/SPARK-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884398#comment-16884398 ]
Marco Gaido commented on SPARK-28316:
-------------------------------------
Well, IIUC, this is just the result of Postgres placing practically no limit on
decimal precision, while Spark's Decimal has a maximum precision of 38. Our
decimal implementation draws from SQL Server's (and Hive's, which itself
follows SQL Server).
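
For context, the SQL Server/Hive-style type derivation works roughly as below. This is a minimal Scala sketch of the rules, not Spark's actual DecimalPrecision code; the object and method names are illustrative, though the constants 38 and 6 match Spark's maximum precision and minimum adjusted scale when precision loss is allowed:
{code:scala}
object DecimalRules {
  val MaxPrecision = 38
  val MinAdjustedScale = 6

  // If the exact result type needs more than 38 digits, keep precision at 38
  // and give up fractional digits, but never drop below min(scale, 6).
  def adjust(precision: Int, scale: Int): (Int, Int) =
    if (precision <= MaxPrecision) (precision, scale)
    else {
      val intDigits = precision - scale
      (MaxPrecision, math.max(MaxPrecision - intDigits, math.min(scale, MinAdjustedScale)))
    }

  // SQL Server / Hive result-type formulas:
  def multiply(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) =
    adjust(p1 + p2 + 1, s1 + s2)

  def divide(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
    val scale = math.max(MinAdjustedScale, s1 + p2 + 1)
    adjust(p1 - s1 + s2 + scale, scale)
  }
}
{code}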
> Decimal precision issue
> -----------------------
>
> Key: SPARK-28316
> URL: https://issues.apache.org/jira/browse/SPARK-28316
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Yuming Wang
> Priority: Major
>
> Multiply check:
> {code:sql}
> -- Spark SQL
> spark-sql> select cast(-34338492.215397047 as decimal(38, 10)) * cast(-34338492.215397047 as decimal(38, 10));
> 1179132047626883.596862
> -- PostgreSQL
> postgres=# select cast(-34338492.215397047 as numeric(38, 10)) * cast(-34338492.215397047 as numeric(38, 10));
> ?column?
> ---------------------------------------
> 1179132047626883.59686213585632020900
> (1 row)
> {code}
> Division check:
> {code:sql}
> -- Spark SQL
> spark-sql> select cast(93901.57763026 as decimal(38, 10)) / cast(4.31 as decimal(38, 10));
> 21786.908963
> -- PostgreSQL
> postgres=# select cast(93901.57763026 as numeric(38, 10)) / cast(4.31 as numeric(38, 10));
> ?column?
> --------------------
> 21786.908962937355
> (1 row)
> {code}
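> The truncation in both checks follows directly from those rules (worked numbers using the sketch above, assuming Spark's default precision-loss behavior):
> {code:scala}
> DecimalRules.multiply(38, 10, 38, 10)
> // exact result needs (77, 20); 77 > 38, so the scale is cut from 20 to
> // max(38 - 57, 6) = 6  => decimal(38, 6), hence 1179132047626883.596862
>
> DecimalRules.divide(38, 10, 38, 10)
> // scale = max(6, 10 + 38 + 1) = 49, precision = 38 - 10 + 10 + 49 = 87;
> // 87 > 38, so the scale is cut to max(38 - 38, 6) = 6
> // => decimal(38, 6), hence 21786.908963
> {code}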
> POWER(10, LN(value)) check:
> {code:sql}
> -- Spark SQL
> spark-sql> SELECT CAST(POWER(cast('10' as decimal(38, 18)), LN(ABS(round(cast(-24926804.04504742 as decimal(38, 10)), 200)))) AS decimal(38, 10));
> 107511333880051856
> -- PostgreSQL
> postgres=# SELECT CAST(POWER(cast('10' as numeric(38, 18)), LN(ABS(round(cast(-24926804.04504742 as numeric(38, 10)), 200)))) AS numeric(38, 10));
> power
> -------------------------------
> 107511333880052007.0414112467
> (1 row)
> {code}
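> The POWER/LN gap is different in kind: Spark evaluates both functions over Double (java.lang.Math), so only about 15-17 significant digits survive regardless of the declared decimal type, while Postgres's power()/ln() on numeric keep computing with guard digits. A plain-Scala sketch of the Double path (not Spark's actual expression code):
> {code:scala}
> val x = math.abs(-24926804.04504742)  // input already rounded to a Double
> val r = math.pow(10.0, math.log(x))   // ~1.07511333880051856E17
> // The low-order digits are Double rounding noise, which is why Spark's
> // 107511333880051856 and Postgres's 107511333880052007.0414112467 diverge.
> {code}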
> AVG, STDDEV and VARIANCE return double type:
> {code:sql}
> -- Spark SQL
> spark-sql> create temporary view t1 as select * from values
> > (cast(-24926804.04504742 as decimal(38, 10))),
> > (cast(16397.038491 as decimal(38, 10))),
> > (cast(7799461.4119 as decimal(38, 10)))
> > as t1(t);
> spark-sql> SELECT AVG(t), STDDEV(t), VARIANCE(t) FROM t1;
> -5703648.53155214 1.7096528995154984E7 2.922913036821751E14
> -- PostgreSQL
> postgres=# SELECT AVG(t), STDDEV(t), VARIANCE(t) from (values
>   (cast(-24926804.04504742 as decimal(38, 10))),
>   (cast(16397.038491 as decimal(38, 10))),
>   (cast(7799461.4119 as decimal(38, 10)))) t1(t);
>           avg          |            stddev             |               variance
> -----------------------+-------------------------------+--------------------------------------
>  -5703648.531552140000 | 17096528.99515498420743029415 | 292291303682175.09401722225695880000
> (1 row)
> {code}
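> Here, too, the digits are capped at Double precision because Spark computes these aggregates over Double internally. The same arithmetic in plain Scala reproduces the values above (a sketch, not Spark's aggregate code):
> {code:scala}
> val xs = Seq(-24926804.04504742, 16397.038491, 7799461.4119)  // Doubles
> val mean = xs.sum / xs.size                                   // -5703648.53155214
> val variance = xs.map(x => (x - mean) * (x - mean)).sum / (xs.size - 1)  // sample variance
> val stddev = math.sqrt(variance)
> // stddev   ~ 1.7096528995154984E7
> // variance ~ 2.922913036821751E14 — limited to Double's ~15-17 digits
> {code}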
> EXP returns double type:
> {code:sql}
> -- Spark SQL
> spark-sql> select exp(cast(1.0 as decimal(31,30)));
> 2.718281828459045
> -- PostgreSQL
> postgres=# select exp(cast(1.0 as decimal(31,30)));
> exp
> ----------------------------------
> 2.718281828459045235360287471353
> (1 row)
> {code}
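> EXP follows the same Double path (Spark's exp maps to java.lang.Math.exp):
> {code:scala}
> math.exp(1.0)  // 2.718281828459045 — e rounded to the nearest Double
> // Postgres's exp() on numeric computes with arbitrary precision instead:
> // 2.718281828459045235360287471353
> {code}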