[
https://issues.apache.org/jira/browse/IMPALA-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Taras Bobrovytsky resolved IMPALA-5019.
---------------------------------------
Resolution: Fixed
Fix Version/s: Impala 2.11.0
{code}
commit bc12a9eb35ff60d7a7e0f6732e9ab6a1d4538f2a
Author: Taras Bobrovytsky <[email protected]>
Date: Tue Sep 19 16:23:24 2017 -0700
IMPALA-5019: Decimal V2 addition
In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.
DECIMAL V1:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,38)                                                 |
+----------------------------------------------------------------+
DECIMAL V2:
+----------------------------------------------------------------+
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
+----------------------------------------------------------------+
| DECIMAL(38,6)                                                  |
+----------------------------------------------------------------+
This patch required backend changes. We implement an algorithm that
handles the whole and fractional parts separately and then combines
them to get the final result. This path is more complex and slower, so
we first check whether the result would fit into int128 and try to
avoid the complex path when it does.
Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
decimal numbers and checks that Impala adds them correctly.
Performance:
For the common case, performance remains the same.
select cast(2.2 as decimal(18, 1)) + cast(2.2 as decimal(18, 1))
BEFORE: 4.74s
AFTER: 4.73s
In the following case, we check whether the complex addition is
necessary, and it turns out not to be. We still see a slowdown because
the result needs to be scaled down by dividing.
select cast(2.2 as decimal(38, 19)) + cast(2.2 as decimal(38, 19))
BEFORE: 1.63s
AFTER: 13.57s
In the following case, we take the most complex path and see the most
significant performance hit.
select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))
BEFORE: 1.63s
AFTER: 20.57s
{code}
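The whole/fractional split described in the commit message can be illustrated with a small Python sketch. This is not the Impala backend code: the function names, the restriction to non-negative operands that share one scale, and the round-half-up behavior are all assumptions made for illustration. Python integers never overflow, so the int128 check here only mimics the decision the backend has to make before adding.

```python
INT128_MAX = 2**127 - 1  # largest value a signed 128-bit integer can hold


def scale_down_round(value, digits):
    """Divide a non-negative integer by 10**digits, rounding half up."""
    if digits == 0:
        return value
    divisor = 10**digits
    return (value + divisor // 2) // divisor


def add_fast(x, y, scale, result_scale):
    """Simple path: add the unscaled values, then scale the sum down."""
    return scale_down_round(x + y, scale - result_scale)


def add_split(x, y, scale, result_scale):
    """Complex path: add whole and fractional parts separately, then combine."""
    unit = 10**scale
    whole = x // unit + y // unit
    frac = x % unit + y % unit
    if frac >= unit:  # carry from the fractional part into the whole part
        whole += 1
        frac -= unit
    frac = scale_down_round(frac, scale - result_scale)
    # If rounding pushed frac up to 10**result_scale, the sum below still
    # carries it into the whole part correctly.
    return whole * 10**result_scale + frac


def add_decimals(x, y, scale, result_scale):
    """Hypothetical dispatcher: use the simple path when the sum fits in int128."""
    if x + y <= INT128_MAX:
        return add_fast(x, y, scale, result_scale)
    return add_split(x, y, scale, result_scale)
```

Here `x` and `y` are unscaled integers, so 7.5 at scale 37 is `75 * 10**36`. Both paths return the same unscaled result at `result_scale`; the split path only exists so that no intermediate value has to exceed 128 bits.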
> DECIMAL V2 add/sub result type
> ------------------------------
>
> Key: IMPALA-5019
> URL: https://issues.apache.org/jira/browse/IMPALA-5019
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 2.0
> Reporter: Dan Hecht
> Assignee: Taras Bobrovytsky
> Fix For: Impala 2.11.0
>
>
> For decimal_v2=true, we should revisit the add/sub result type. Currently, we
> set result scale to max(S1, S2) (potentially losing precision). Other
> systems (e.g. SQL Server) seem to choose either S1 or S2 depending on whether
> digits to the left of the decimal point would be lost. This would require
> changes to the backend implementation of add/sub, however.
> Currently we compute rP and rS as follows:
> {code}
> rS = max(s1, s2)
> rP = max(s1, s2) + max(p1 - s1, p2 - s2) + 1
> {code}
> We currently handle the case where rP > 38 as follows:
> {code}
> if (rP > 38):
>   rP = 38
>   rS = min(38, rS)
> {code}
> This basically truncates the digits to the left of the decimal point.
> The proposed result under V2 is:
> {code}
> if (rP > 38):
>   minS = min(rS, 6)
>   rS = rS - (rP - 38)
>   rS = max(minS, rS)
>   rP = 38
> {code}
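As a cross-check, the two rule sets above can be written as a short Python sketch and run against the example from the commit message. The function and constant names are hypothetical, and capping rP at 38 in the V2 branch is an assumption inferred from the DECIMAL(38,6) result shown in the commit message rather than something the pseudocode states.

```python
MAX_PRECISION = 38      # maximum decimal precision
MIN_ADJUSTED_SCALE = 6  # the constant 6 from the proposed V2 rule


def add_result_type_v1(p1, s1, p2, s2):
    """Return (precision, scale) for add/sub under the current (V1) rules."""
    rs = max(s1, s2)
    rp = rs + max(p1 - s1, p2 - s2) + 1
    if rp > MAX_PRECISION:
        rp = MAX_PRECISION
        rs = min(MAX_PRECISION, rs)
    return rp, rs


def add_result_type_v2(p1, s1, p2, s2):
    """Return (precision, scale) for add/sub under the proposed V2 rules."""
    rs = max(s1, s2)
    rp = rs + max(p1 - s1, p2 - s2) + 1
    if rp > MAX_PRECISION:
        min_s = min(rs, MIN_ADJUSTED_SCALE)
        # Give up scale digits to make room for the whole part, but never
        # drop below min_s.
        rs = max(min_s, rs - (rp - MAX_PRECISION))
        rp = MAX_PRECISION  # assumption: precision is capped at 38
    return rp, rs


print(add_result_type_v1(38, 0, 38, 38))  # (38, 38)
print(add_result_type_v2(38, 0, 38, 38))  # (38, 6)
```

For decimal(38,0) + decimal(38,38), the uncapped precision is 77, so V1 keeps scale 38 and leaves no room for whole digits, while V2 reduces the scale to the minimum of 6.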