Gengliang Wang created SPARK-40903:
--------------------------------------

             Summary: Avoid reordering decimal Add for canonicalization
                 Key: SPARK-40903
                 URL: https://issues.apache.org/jira/browse/SPARK-40903
             Project: Spark
          Issue Type: Test
          Components: SQL
    Affects Versions: 3.4.0
            Reporter: Gengliang Wang
            Assignee: Gengliang Wang


Avoid reordering the operands of Add during canonicalization when the 
expression is of decimal type.
Expressions are canonicalized for comparisons and explanations. For a 
non-decimal Add expression, the operands can be sorted by hash code, and the 
result is supposed to be the same.
However, for an Add expression of decimal type, the behavior is different: given 
decimal(p1, s1) and another decimal(p2, s2), the integral part of the result has 
`max(p1-s1, p2-s2) + 1` digits and the fractional part has `max(s1, s2)` digits. 
Thus the result data type is 
`decimal(max(p1-s1, p2-s2) + 1 + max(s1, s2), max(s1, s2))`.
As a consequence, the order of the operands matters:
* For `(decimal(12, 5) + decimal(12, 6)) + decimal(3, 2)`, the first add 
`decimal(12, 5) + decimal(12, 6)` results in `decimal(14, 6)`, and then 
`decimal(14, 6) + decimal(3, 2)` results in `decimal(15, 6)`.
* For `(decimal(12, 6) + decimal(3, 2)) + decimal(12, 5)`, the first add 
`decimal(12, 6) + decimal(3, 2)` results in `decimal(13, 6)`, and then 
`decimal(13, 6) + decimal(12, 5)` results in `decimal(14, 6)`.
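
The two bullet points can be checked mechanically. The following is a minimal Python sketch of the result-type rule stated above, not Spark's code: the names `add_result_type` and `chained_add_type` are hypothetical, decimal types are modeled as plain `(precision, scale)` tuples, and the upper bound Spark places on decimal precision is deliberately ignored.

```python
from functools import reduce

def add_result_type(left, right):
    """Result (precision, scale) of adding two decimals, per the rule above.

    Assumption: no upper bound on precision is modeled in this sketch.
    """
    p1, s1 = left
    p2, s2 = right
    scale = max(s1, s2)                   # digits after the decimal point
    integral = max(p1 - s1, p2 - s2) + 1  # digits before it, plus one carry digit
    return (integral + scale, scale)

def chained_add_type(operands):
    """Type of a left-associative chain ((x1 + x2) + x3) + ..."""
    return reduce(add_result_type, operands)

# (decimal(12, 5) + decimal(12, 6)) + decimal(3, 2)
original = chained_add_type([(12, 5), (12, 6), (3, 2)])
# (decimal(12, 6) + decimal(3, 2)) + decimal(12, 5)
reordered = chained_add_type([(12, 6), (3, 2), (12, 5)])

print(original)   # (15, 6)
print(reordered)  # (14, 6)
```

The same three operands produce two different result types, which is why sorting the operands of a decimal Add is not type-preserving.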

In the following query:
```
create table foo(a decimal(12, 5), b decimal(12, 6)) using orc;
select sum(coalesce(a+b+ 1.75, a)) from foo;
```
At first `coalesce(a+b+ 1.75, a)` is resolved as 
`coalesce(a+b+ 1.75, cast(a as decimal(15, 6)))`. In the canonicalized version, 
the expression becomes `coalesce(1.75+b+a, cast(a as decimal(15, 6)))`. As 
explained above, `1.75+b+a` is of type decimal(14, 6), which differs from the 
decimal(15, 6) type of `cast(a as decimal(15, 6))`. Thus the following error 
occurs:
{code:java}
java.lang.IllegalArgumentException: requirement failed: All input types must be the same except nullable, containsNull, valueContainsNull flags. The input types found are
        DecimalType(14,6)
        DecimalType(15,6)
        at scala.Predef$.require(Predef.scala:281)
        at org.apache.spark.sql.catalyst.expressions.ComplexTypeMergingExpression.dataTypeCheck(Expression.scala:1149)
        at org.apache.spark.sql.catalyst.expressions.ComplexTypeMergingExpression.dataTypeCheck$(Expression.scala:1143)
 {code}
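
To make the failure concrete, here is a small self-contained Python sketch that reproduces the type mismatch for the query above and shows one possible guard: accept the reordered operands only when the result type is unchanged. This is an illustration of the idea in the summary, not Spark's actual implementation. All names (`add_type`, `chain_type`, `check_same_types`, `guarded_reorder`) are hypothetical, types are modeled as `(precision, scale)` tuples, and any upper bound on precision is ignored.

```python
def add_type(left, right):
    """Result (precision, scale) of a decimal Add, per the rule in this issue."""
    p1, s1 = left
    p2, s2 = right
    scale = max(s1, s2)
    return (max(p1 - s1, p2 - s2) + 1 + scale, scale)

def chain_type(operands):
    """Type of a left-associative chain of Adds over the given operand types."""
    result = operands[0]
    for operand in operands[1:]:
        result = add_type(result, operand)
    return result

def check_same_types(types):
    """Stand-in for the requirement in the stack trace: all input types must match."""
    distinct = sorted(set(types))
    if len(distinct) != 1:
        raise ValueError(f"All input types must be the same, found {distinct}")

def guarded_reorder(original, candidate):
    """Hypothetical guard: keep the reordering only if it preserves the result type."""
    if chain_type(candidate) == chain_type(original):
        return candidate
    return original

a, b, literal = (12, 5), (12, 6), (3, 2)  # 1.75 is a decimal(3, 2) literal
resolved = [a, b, literal]                # a + b + 1.75
canonical = [literal, b, a]               # 1.75 + b + a, as in the issue
cast_type = chain_type(resolved)          # type that `a` is cast to: (15, 6)

# Unguarded reordering: the two coalesce branches no longer agree.
try:
    check_same_types([chain_type(canonical), cast_type])
    failed = False
except ValueError as err:
    failed = True
    print(err)

# Guarded reordering: the original order is kept, so the check passes.
kept = guarded_reorder(resolved, canonical)
check_same_types([chain_type(kept), cast_type])
```

With the guard, non-decimal or type-preserving reorderings could still be canonicalized as before, while the decimal case in this query keeps its resolved order.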


