Github user mgaido91 commented on the issue:
https://github.com/apache/spark/pull/20023
@gatorsmile that is scenario 3. I will explain you why and after I will do
and errata corrige of the summary I did in my last e-mail, because I made a
mistake about how DB2 computes the result precision and scale, sorry for that.
Anyway, what you showed is an example of point 3 because DB2 computes the
result type as `DECIMAL( MIN(31, p1 + p2), MIN(31, s1 + s2) )`. Therefore, in
your case the result type was `DECIMAL(31, 31)`. Since your result had more
than 0 significant digits, it was out out of the range of the representable
values and an overflow exception was thrown.
You can reproduce case 2 as follows:
```
db2 => create table decimals_test (id int, a decimal(31,31), b
decimal(31,31))
DB20000I The SQL command completed successfully.
db2 => insert into decimals_test values(1, 0.12345678912345678912345689,
0.12345678912345678912345)
DB20000I The SQL command completed successfully.
db2 => select a*b from dd
1
---------------------------------
.0152415787806736785461049526020
```
As you can see a truncation occurred.
Now, let me amend my table to summarize the behavior of the many DBs:
1. **Rules to determine precision and scale**
- *Hive, SQLServer (and Spark after the PR)*: I won't include the
exact formulas, anyway the relevant part is that in case of precision
higher that the maximum value, we use the maximum available value (38) as
precision and the maximum between the needed scale (computing according the
relevant formula) and a minimum value guaranteed for the scale which is 6.
- *DB2*: computes the result type as `DECIMAL( MIN(31, p1 + p2),
MIN(31, s1 + s2) )`.
- *Postgres and Oracle*: NA
- *SQL ANSI 2011*: no indication
- *Spark now*: if the precision needed is more than 38, use 38 as
precision; use the needed scale without any adjustment.
2. **Behavior in case of precision loss but result in the range of the
representable values**
- *Oracle, Hive, SQLServer (and Spark after the PR)*: round the result.
- *DB2*: truncates the result (and sets a warning flag).
- *Postgres*: NA, it has infinite precision...
- *SQL ANSI 2011*: either truncate or round the value.
- *Spark now*: returns NULL.
3. **Behavior in case of result out of the range of the representable
values (i.e overflow)**
- *Oracle, DB2, SQLServer*: throw an exception.
- *Postgres*: NA, they have nearly infinite precision...
- *SQL ANSI 2011*: an exception should be raised
- *Spark now, Hive*: return NULL (for Hive, there is a open ticket to
make it compliant to the SQL standard).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]