[ https://issues.apache.org/jira/browse/KYLIN-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928352#comment-17928352 ]
Guoliang Sun commented on KYLIN-6045:
-------------------------------------
h3. Root Cause
h4. 1. SQL -> Rel
In `KylinRelDataTypeSystem`, the `deriveSumType` logic does not expand the
`Decimal` precision enough:
- If the precision is less than 19, it is expanded to 19.
- If the precision is greater than or equal to 19, it remains unchanged.
This results in the following behavior:
{code:java}
SUM(Decimal(5, 2)) -> Decimal(19, 2)
SUM(Decimal(19, 2)) -> Decimal(19, 2)
SUM(Decimal(20, 2)) -> Decimal(20, 2){code}
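For reference, the derivation amounts to the following (a minimal sketch of the
behavior described above, not the exact Kylin source; the method shape follows
Calcite's `RelDataTypeSystem#deriveSumType`):
{code:java}
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.rel.type.RelDataTypeFactory;
import org.apache.calcite.rel.type.RelDataTypeSystemImpl;
import org.apache.calcite.sql.type.SqlTypeName;

public class SketchRelDataTypeSystem extends RelDataTypeSystemImpl {
    @Override
    public RelDataType deriveSumType(RelDataTypeFactory typeFactory, RelDataType argumentType) {
        if (argumentType.getSqlTypeName() == SqlTypeName.DECIMAL) {
            // Floor the precision at 19, but add no headroom beyond that:
            // Decimal(5,2) -> Decimal(19,2), while Decimal(20,2) stays Decimal(20,2).
            int precision = Math.max(19, argumentType.getPrecision());
            return typeFactory.createSqlType(SqlTypeName.DECIMAL, precision, argumentType.getScale());
        }
        return super.deriveSumType(typeFactory, argumentType);
    }
}
{code}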
When the aggregation column is of type `Decimal` with precision greater than or
equal to 19, the `SUM` result can therefore overflow the integer range of the
unchanged result type: a `Decimal(precision, scale)` can hold at most
`precision - scale` integer digits.
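A self-contained illustration of the overflow (plain `BigDecimal`, not Kylin
code):
{code:java}
import java.math.BigDecimal;

public class SumOverflowDemo {
    public static void main(String[] args) {
        // Decimal(19,2) can hold at most 19 - 2 = 17 integer digits.
        BigDecimal nearMax = new BigDecimal("99999999999999999.99"); // fits Decimal(19,2)
        BigDecimal sum = nearMax.add(nearMax);                       // 199999999999999999.98
        // 18 integer digits: the sum is no longer representable as Decimal(19,2).
        System.out.println(sum + " has " + (sum.precision() - sum.scale()) + " integer digits");
    }
}
{code}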
h4. 2. Calcite Plan -> Spark Plan
In `AggregatePlan`, the conversion logic for `SUM` retains the `Decimal`
precision and scale defined in the `RelNode`. When an index is hit, the measure
column is first cast to that type. If the actual value of the measure column
exceeds the integer range of the cast's `inputType`, the cast yields `null`,
and the outer `SUM` aggregation consequently returns `null` as well. This
explains the phenomenon reported in this issue.
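The effect can be reproduced outside Kylin (a sketch assuming a local
`SparkSession` and the default non-ANSI mode, in which an overflowing decimal
cast yields `null` instead of raising an error; the class name is illustrative):
{code:java}
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CastNullDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().master("local[1]").getOrCreate();
        // DECIMAL(19,2) can hold only 17 integer digits, so the first cast overflows.
        Dataset<Row> df = spark.sql(
                "SELECT CAST(12345678901234567890.12 AS DECIMAL(19,2)) AS too_narrow, "
                + "CAST(12345678901234567890.12 AS DECIMAL(29,2)) AS wide_enough");
        df.show(); // too_narrow is null, wide_enough keeps the value
        spark.stop();
    }
}
{code}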
Additionally, two semantic inconsistencies contribute to the error and the
confusion:
1. Inconsistency between the model's `SUM` measure return type and the Calcite
logical plan's `SUM` return type:
- The model's `SUM` measure increases the `Decimal` precision by +10 based
on the original column.
- The Calcite logical plan's `SUM` return type is controlled by
`KylinRelDataTypeSystem`, with the current behavior described above.
2. Inconsistency between the Spark logical plan's `SUM` return type and the
Calcite logical plan's `SUM` return type:
- In Calcite, the `SUM` aggregation result type represents the type of the
aggregated value.
- After converting to the Spark plan, this type becomes the type of the
column inside the `SUM`; after further aggregation the `Decimal` precision
changes again, since in Spark each nested `SUM` widens the `Decimal` precision
by +10, capped at the maximum of 38 (see the sketch after this list).
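The widening in point 2 can be observed directly (a sketch assuming a local
`SparkSession`; `sum` is Spark's built-in aggregate, whose decimal result type
is widened by +10 and bounded at 38):
{code:java}
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.sum;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SumWideningDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().master("local[1]").getOrCreate();
        Dataset<Row> df = spark.sql("SELECT CAST(1.5 AS DECIMAL(19,6)) AS d, 'k' AS g");
        // First SUM: decimal(19,6) -> decimal(29,6)
        Dataset<Row> once = df.groupBy(col("g")).agg(sum(col("d")).as("s"));
        once.printSchema();
        // Nested SUM: decimal(29,6) -> decimal(38,6) (capped at 38, not 39)
        once.agg(sum(col("s"))).printSchema();
        spark.stop();
    }
}
{code}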
> SUM Query Decimal Precision Anomaly
> -----------------------------------
>
> Key: KYLIN-6045
> URL: https://issues.apache.org/jira/browse/KYLIN-6045
> Project: Kylin
> Issue Type: Bug
> Affects Versions: 5.0.0
> Reporter: Guoliang Sun
> Priority: Major
>
> When generating the Spark plan for a query, a `cast` conversion is added for
> the `sum` aggregation in `AggregatePlan.buildAgg`. At this point, the input
> type is the column type, causing the precision of the `cast` to be reduced.
> This results in the query returning `null`.
> h3. Example
> - The column precision in the Hive table is `decimal(19,6)`.
> - The model measure precision is `decimal(29,6)`.
> - When querying, the result will be `null`.
> In the Spark event log for the query, the `cast` precision is
> `decimal(19,6)`. Directly retrieving data from the Parquet file yields the
> following:
> - When the `cast` precision is `DECIMAL(19,6)`, the result is `null`.
> - When the `cast` precision is `DECIMAL(29,6)`, the result is correct.