[ https://issues.apache.org/jira/browse/SPARK-40903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gengliang Wang resolved SPARK-40903.
------------------------------------
Fix Version/s: 3.4.0
Resolution: Fixed
Issue resolved by pull request 38379
[https://github.com/apache/spark/pull/38379]
> Avoid reordering decimal Add for canonicalization
> -------------------------------------------------
>
> Key: SPARK-40903
> URL: https://issues.apache.org/jira/browse/SPARK-40903
> Project: Spark
> Issue Type: Test
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Gengliang Wang
> Assignee: Gengliang Wang
> Priority: Major
> Fix For: 3.4.0
>
>
> Avoid reordering the operands of Add during canonicalization when the
> expression is of decimal type.
> Expressions are canonicalized for comparisons and explanations. For a
> non-decimal Add expression, the operands can be sorted by hashcode, and the
> result is supposed to be the same.
> However, for an Add expression of decimal type, the behavior is different.
> Given decimal (p1, s1) and another decimal (p2, s2), the integral part of the
> result has `max(p1-s1, p2-s2) + 1` digits and the fractional part has
> `max(s1, s2)` digits, so the result data type is
> `(max(p1-s1, p2-s2) + 1 + max(s1, s2), max(s1, s2))`.
> Because each intermediate result type depends on both of its inputs, the
> order of the operands matters:
> * For `(decimal(12,5) + decimal(12,6)) + decimal(3, 2)`, the first add
> `decimal(12,5) + decimal(12,6)` results in `decimal(14, 6)`, and then
> `decimal(14, 6) + decimal(3, 2)` results in `decimal(15, 6)`
> * For `(decimal(12, 6) + decimal(3,2)) + decimal(12, 5)`, the first add
> `decimal(12, 6) + decimal(3,2)` results in `decimal(13, 6)`, and then
> `decimal(13, 6) + decimal(12, 5)` results in `decimal(14, 6)`
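
The two orderings above can be checked with a small sketch of the quoted rule. This is not Spark's implementation: `add_result_type` and `chained_add_type` are hypothetical helper names, a decimal type is modelled as a plain `(precision, scale)` tuple, and any upper bound Spark places on precision is ignored.

```python
from functools import reduce


def add_result_type(left, right):
    """Result (precision, scale) of adding two decimals, per the quoted rule."""
    p1, s1 = left
    p2, s2 = right
    scale = max(s1, s2)                    # digits after the decimal point
    integral = max(p1 - s1, p2 - s2) + 1   # digits before it, plus one for carry
    return (integral + scale, scale)


def chained_add_type(operands):
    """Type of a left-associative chain: ((t0 + t1) + t2) + ..."""
    return reduce(add_result_type, operands)


# (decimal(12,5) + decimal(12,6)) + decimal(3,2)
original = chained_add_type([(12, 5), (12, 6), (3, 2)])
# (decimal(12,6) + decimal(3,2)) + decimal(12,5)
reordered = chained_add_type([(12, 6), (3, 2), (12, 5)])

print(original)
print(reordered)
```

The same three operands yield different result types depending only on the order in which they are folded, which is the property the description relies on.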
> In the following query:
> ```
> create table foo(a decimal(12, 5), b decimal(12, 6)) using orc;
> select sum(coalesce(a+b+ 1.75, a)) from foo;
> ```
> At first, `coalesce(a+b+ 1.75, a)` is resolved as `coalesce(a+b+ 1.75, cast(a
> as decimal(15, 6)))`. In the canonicalized version, the expression becomes
> `coalesce(1.75+b+a, cast(a as decimal(15, 6)))`. As explained above,
> `1.75+b+a` is of type decimal(14, 6), which differs from `cast(a as
> decimal(15, 6))`. Thus the following error occurs:
> {code:java}
> java.lang.IllegalArgumentException: requirement failed: All input types must
> be the same except nullable, containsNull, valueContainsNull flags. The input
> types found are
> DecimalType(14,6)
> DecimalType(15,6)
> at scala.Predef$.require(Predef.scala:281)
> at org.apache.spark.sql.catalyst.expressions.ComplexTypeMergingExpression.dataTypeCheck(Expression.scala:1149)
> at org.apache.spark.sql.catalyst.expressions.ComplexTypeMergingExpression.dataTypeCheck$(Expression.scala:1143)
> {code}
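
The mismatch in the error above can be reproduced with a toy model, assuming the literal `1.75` is typed as decimal(3, 2) as in the earlier bullets. Nothing here is Spark code: `require_same_types` is a hypothetical stand-in for the failing requirement, and decimal types are again plain `(precision, scale)` tuples.

```python
from functools import reduce


def add_result_type(left, right):
    """Result (precision, scale) of adding two decimals, per the quoted rule."""
    p1, s1 = left
    p2, s2 = right
    scale = max(s1, s2)
    return (max(p1 - s1, p2 - s2) + 1 + scale, scale)


def require_same_types(types):
    """Toy stand-in for the check that fails in the stack trace."""
    distinct = sorted(set(types))
    if len(distinct) != 1:
        raise ValueError("All input types must be the same, found: %s" % distinct)


A, B, LITERAL = (12, 5), (12, 6), (3, 2)  # a, b, and the literal 1.75

# Resolved plan: the second coalesce branch is cast to the type of a + b + 1.75.
cast_target = reduce(add_result_type, [A, B, LITERAL])
# Canonicalized plan: the operands are reordered to 1.75 + b + a.
canonical_sum = reduce(add_result_type, [LITERAL, B, A])

require_same_types([cast_target, cast_target])  # branches agree before reordering

try:
    require_same_types([canonical_sum, cast_target])
    mismatch = None
except ValueError as err:
    mismatch = str(err)

print(mismatch)
```

In this model the cast target keeps the type computed before reordering while the reordered sum gets a narrower one, so the two branches no longer agree. That is why the issue proposes leaving the operand order of decimal Add untouched during canonicalization.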