[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837292#comment-17837292 ]

Nicholas Chammas commented on SPARK-28024:
------------------------------------------

[~cloud_fan] - Given the updated descriptions for Cases 2, 3, and 4, do you still consider there to be a problem here? Or shall we just consider this an acceptable difference between how Spark and Postgres handle these cases?

> Incorrect numeric values when out of range
> ------------------------------------------
>
>                 Key: SPARK-28024
>                 URL: https://issues.apache.org/jira/browse/SPARK-28024
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.4, 3.0.0
>            Reporter: Yuming Wang
>            Priority: Major
>              Labels: correctness
>         Attachments: SPARK-28024.png
>
>
> Spark on {{master}} at commit {{de00ac8a05aedb3a150c8c10f76d1fe5496b1df3}}
> with {{set spark.sql.ansi.enabled=true;}} as compared to the default behavior
> on PostgreSQL 16.
> Case 1:
> {code:sql}
> select tinyint(128) * tinyint(2); -- 0
> select smallint(2147483647) * smallint(2); -- -2
> select int(2147483647) * int(2); -- -2
> SELECT smallint((-32768)) * smallint(-1); -- -32768
> {code}
> With ANSI mode enabled, this case is no longer an issue. All 4 of the above
> statements now yield {{CAST_OVERFLOW}} or {{ARITHMETIC_OVERFLOW}} errors.
> Case 2:
> {code:sql}
> spark-sql> select cast('10e-70' as float), cast('-10e-70' as float);
> 0.0 -0.0
> postgres=# select cast('10e-70' as float), cast('-10e-70' as float);
>  float8 | float8
> --------+--------
>   1e-69 | -1e-69 {code}
> Case 3:
> {code:sql}
> spark-sql> select cast('10e-400' as double), cast('-10e-400' as double);
> 0.0 -0.0
> postgres=# select cast('10e-400' as double precision), cast('-10e-400' as
> double precision);
> ERROR: "10e-400" is out of range for type double precision
> LINE 1: select cast('10e-400' as double precision), cast('-10e-400' ...
>                     ^ {code}
> Case 4:
> {code:sql}
> spark-sql (default)> select exp(1.2345678901234E200);
> Infinity
> postgres=# select exp(1.2345678901234E200);
> ERROR: value overflows numeric format {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836706#comment-17836706 ]

Nicholas Chammas commented on SPARK-28024:
------------------------------------------

I've just retried cases 2-4 on master with ANSI mode enabled, and Spark's behavior appears to be the same as when I last checked it in February.

I also ran those same cases against PostgreSQL 16. I couldn't replicate the output for Case 4, and I believe there was a mistake in the original description of that case where the sign was flipped. So I've adjusted the sign accordingly and shown Spark and Postgres's behavior side-by-side.

Here is the original Case 4 with the negative sign:
{code:sql}
spark-sql (default)> select exp(-1.2345678901234E200);
0.0
postgres=# select exp(-1.2345678901234E200);
0. {code}
So I don't think there is a problem there. With a positive sign, the behavior is different as shown in the ticket description above.

> Incorrect numeric values when out of range
> ------------------------------------------
>
>                 Key: SPARK-28024
>                 URL: https://issues.apache.org/jira/browse/SPARK-28024
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.4, 3.0.0
>            Reporter: Yuming Wang
>            Priority: Major
>              Labels: correctness
>         Attachments: SPARK-28024.png
>
>
> As compared to PostgreSQL 16.
> Case 1:
> {code:sql}
> select tinyint(128) * tinyint(2); -- 0
> select smallint(2147483647) * smallint(2); -- -2
> select int(2147483647) * int(2); -- -2
> SELECT smallint((-32768)) * smallint(-1); -- -32768
> {code}
> Case 2:
> {code:sql}
> spark-sql> select cast('10e-70' as float), cast('-10e-70' as float);
> 0.0 -0.0
> postgres=# select cast('10e-70' as float), cast('-10e-70' as float);
>  float8 | float8
> --------+--------
>   1e-69 | -1e-69 {code}
> Case 3:
> {code:sql}
> spark-sql> select cast('10e-400' as double), cast('-10e-400' as double);
> 0.0 -0.0
> postgres=# select cast('10e-400' as double precision), cast('-10e-400' as
> double precision);
> ERROR: "10e-400" is out of range for type double precision
> LINE 1: select cast('10e-400' as double precision), cast('-10e-400' ...
>                     ^ {code}
> Case 4:
> {code:sql}
> spark-sql (default)> select exp(1.2345678901234E200);
> Infinity
> postgres=# select exp(1.2345678901234E200);
> ERROR: value overflows numeric format {code}
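Note: the sign flip in Case 4 matters because an IEEE 754 double saturates differently at the two ends. The sketch below is only an illustration, assuming (as Wenchen Fan's comment in this thread states) that Spark's double arithmetic follows Java. The class name `ExpSignDemo` and its `exp` helper are hypothetical and are not Spark code.

```java
public class ExpSignDemo {
    // Hypothetical helper: plain java.lang.Math.exp on a double.
    // This is an assumption about what a double-based exp reduces to,
    // not a copy of Spark's implementation.
    static double exp(double x) {
        return Math.exp(x);
    }

    public static void main(String[] args) {
        double big = 1.2345678901234E200;
        // A huge negative exponent underflows silently to zero.
        System.out.println(exp(-big)); // prints 0.0
        // A huge positive exponent overflows silently to infinity.
        System.out.println(exp(big));  // prints Infinity
    }
}
```

Neither call raises an error, which matches the `0.0` and `Infinity` results that spark-sql reports above.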
[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044374#comment-17044374 ]

Wenchen Fan commented on SPARK-28024:
-------------------------------------

I've lowered the priority from blocker to major.

> Incorrect numeric values when out of range
> ------------------------------------------
>
>                 Key: SPARK-28024
>                 URL: https://issues.apache.org/jira/browse/SPARK-28024
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.4, 3.0.0
>            Reporter: Yuming Wang
>            Priority: Major
>              Labels: correctness
>         Attachments: SPARK-28024.png
>
>
> For example
> Case 1:
> {code:sql}
> select tinyint(128) * tinyint(2); -- 0
> select smallint(2147483647) * smallint(2); -- -2
> select int(2147483647) * int(2); -- -2
> SELECT smallint((-32768)) * smallint(-1); -- -32768
> {code}
> Case 2:
> {code:sql}
> spark-sql> select cast('10e-70' as float), cast('-10e-70' as float);
> 0.0 -0.0
> {code}
> Case 3:
> {code:sql}
> spark-sql> select cast('10e-400' as double), cast('-10e-400' as double);
> 0.0 -0.0
> {code}
> Case 4:
> {code:sql}
> spark-sql> select exp(-1.2345678901234E200);
> 0.0
> postgres=# select exp(-1.2345678901234E200);
> ERROR: value overflows numeric format
> {code}
[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040848#comment-17040848 ]

Wenchen Fan commented on SPARK-28024:
-------------------------------------

These are the behaviors of Java:
{code}
scala> java.lang.Float.valueOf("10e-70")
res0: Float = 0.0

scala> java.lang.StrictMath.exp(-1.2345678901234E200)
res1: Double = 0.0
{code}
Although it's not officially documented, Spark arithmetic has followed Java from the very beginning. I wouldn't treat these as correctness bugs simply because they are not ANSI-compliant. You wouldn't report this as a correctness bug to the JDK, right?

I'd suggest we close this ticket. These behaviors are well defined (they follow Java). We need to improve our documentation, though.

BTW, when we complete the ANSI mode and turn it on by default, these problems would go away.

> Incorrect numeric values when out of range
> ------------------------------------------
>
>                 Key: SPARK-28024
>                 URL: https://issues.apache.org/jira/browse/SPARK-28024
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.4, 3.0.0
>            Reporter: Yuming Wang
>            Priority: Blocker
>              Labels: correctness
>         Attachments: SPARK-28024.png
>
>
> For example
> Case 1:
> {code:sql}
> select tinyint(128) * tinyint(2); -- 0
> select smallint(2147483647) * smallint(2); -- -2
> select int(2147483647) * int(2); -- -2
> SELECT smallint((-32768)) * smallint(-1); -- -32768
> {code}
> Case 2:
> {code:sql}
> spark-sql> select cast('10e-70' as float), cast('-10e-70' as float);
> 0.0 -0.0
> {code}
> Case 3:
> {code:sql}
> spark-sql> select cast('10e-400' as double), cast('-10e-400' as double);
> 0.0 -0.0
> {code}
> Case 4:
> {code:sql}
> spark-sql> select exp(-1.2345678901234E200);
> 0.0
> postgres=# select exp(-1.2345678901234E200);
> ERROR: value overflows numeric format
> {code}
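Note: the Java string-parsing behavior behind Cases 2 and 3 can be reproduced without Spark. The sketch below is a hedged illustration only: the class name `ParseUnderflowDemo` and its helpers are hypothetical, and it assumes that Spark's string-to-float cast ends up with the same result as `Float.parseFloat` / `Double.parseDouble`, which is what the comment above suggests. It also shows why `10e-70` is representable once the target is a double, which appears to be the case on the Postgres side, where the result columns are labelled `float8`.

```java
public class ParseUnderflowDemo {
    // Hypothetical wrappers around the JDK parsers (not Spark code).
    static float parseFloat(String s) {
        return Float.parseFloat(s);
    }

    static double parseDouble(String s) {
        return Double.parseDouble(s);
    }

    public static void main(String[] args) {
        // 1e-69 is below the smallest float subnormal, so it underflows to zero.
        System.out.println(parseFloat("10e-70"));    // prints 0.0
        // The sign survives the underflow, giving negative zero.
        System.out.println(parseFloat("-10e-70"));   // prints -0.0
        // The same literal fits comfortably in a double.
        System.out.println(parseDouble("10e-70"));   // prints 1.0E-69
        // 1e-399 is below the smallest double subnormal, so it underflows too.
        System.out.println(parseDouble("10e-400"));  // prints 0.0
        System.out.println(parseDouble("-10e-400")); // prints -0.0
    }
}
```

In all of these cases the JDK returns a signed zero rather than throwing, whereas PostgreSQL reports an out-of-range error for Case 3.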
[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039775#comment-17039775 ]

Hyukjin Kwon commented on SPARK-28024:
--------------------------------------

Case 1 is fixed. Cases 2, 3, and 4 do not seem to be fixed.
[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869879#comment-16869879 ]

Marco Gaido commented on SPARK-28024:
-------------------------------------

[~joshrosen] thanks for linking them. Yes, I did try to get them in, because this is very confusing and unexpected behavior for many users, especially when migrating workloads from other SQL systems. Moreover, having them as configs lets users choose the behavior they prefer. But I received negative feedback on them, as you can see. I hope that, since several other people have now maintained that this is a problem, those PRs may be reconsidered. Moreover, we're now approaching 3.0, so it may be a good moment for them. I am updating them and resolving conflicts. This issue may be closed as a duplicate IMHO. Thanks.

> Incorrect numeric values when out of range
> ------------------------------------------
>
>                 Key: SPARK-28024
>                 URL: https://issues.apache.org/jira/browse/SPARK-28024
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Yuming Wang
>            Priority: Critical
>              Labels: correctness
>         Attachments: SPARK-28024.png
>
>
> For example
> Case 1:
> {code:sql}
> select tinyint(128) * tinyint(2); -- 0
> select smallint(2147483647) * smallint(2); -- -2
> select int(2147483647) * int(2); -- -2
> SELECT smallint((-32768)) * smallint(-1); -- -32768
> {code}
> Case 2:
> {code:sql}
> spark-sql> select cast('10e-70' as float), cast('-10e-70' as float);
> 0.0 -0.0
> {code}
> Case 3:
> {code:sql}
> spark-sql> select cast('10e-400' as double), cast('-10e-400' as double);
> 0.0 -0.0
> {code}
> Case 4:
> {code:sql}
> spark-sql> select exp(-1.2345678901234E200);
> 0.0
> postgres=# select exp(-1.2345678901234E200);
> ERROR: value overflows numeric format
> {code}
[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869552#comment-16869552 ]

Josh Rosen commented on SPARK-28024:
------------------------------------

I've linked two existing, related tickets:
* SPARK-26218 (fail on integer overflow, no feature flag)
* SPARK-23179 (fail on decimal overflow, guarded by feature flag)

Both have WIP patches from [~mgaido].
[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862747#comment-16862747 ]

Wenchen Fan commented on SPARK-28024:
-------------------------------------

AFAIK other databases would throw an overflow exception. We may need a config to change this behavior.

> Incorrect numeric values when out of range
> ------------------------------------------
>
>                 Key: SPARK-28024
>                 URL: https://issues.apache.org/jira/browse/SPARK-28024
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Yuming Wang
>            Priority: Major
>         Attachments: SPARK-28024.png
>
>
> For example:
> {code:sql}
> select tinyint(128) * tinyint(2); -- 0
> select smallint(2147483647) * smallint(2); -- -2
> select int(2147483647) * int(2); -- -2
> SELECT smallint((-32768)) * smallint(-1); -- -32768
> {code}
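Note: the two behaviors being weighed here (silent wraparound versus an overflow exception) can be put side by side in plain Java. This is a minimal sketch and not Spark's implementation. The class `OverflowDemo` and its method names are hypothetical, and the wrapping helpers simply assume Java narrowing-cast semantics, which reproduce the four results quoted in the ticket. The checked variant uses the JDK's `Math.multiplyExact`.

```java
public class OverflowDemo {
    // Wrapping multiply: operands are promoted to int, multiplied,
    // then narrowed back, silently discarding the high bits.
    static byte mulByte(byte a, byte b) {
        return (byte) (a * b);
    }

    static short mulShort(short a, short b) {
        return (short) (a * b);
    }

    static int mulInt(int a, int b) {
        return a * b;
    }

    // Checked multiply: throws ArithmeticException instead of wrapping.
    static int mulIntChecked(int a, int b) {
        return Math.multiplyExact(a, b);
    }

    public static void main(String[] args) {
        // The casts mirror tinyint(128), smallint(2147483647), etc.
        System.out.println(mulByte((byte) 128, (byte) 2));           // prints 0
        System.out.println(mulShort((short) 2147483647, (short) 2)); // prints -2
        System.out.println(mulInt(2147483647, 2));                   // prints -2
        System.out.println(mulShort((short) -32768, (short) -1));    // prints -32768

        try {
            mulIntChecked(2147483647, 2);
        } catch (ArithmeticException e) {
            // This is the "throw on overflow" behavior a config could select.
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

Part of each wrapped result comes from the narrowing cast of the literal itself (for example, `(byte) 128` is already `-128`), so both the cast and the multiplication would need checking to reject these inputs.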
[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862712#comment-16862712 ]

Yuming Wang commented on SPARK-28024:
-------------------------------------

JDK also has this issue:

!SPARK-28024.png!