[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. IMPALA-5019: Decimal V2 addition In this patch, we implement the new decimal return type rules for addition expressions. These rules become active when the query option DECIMAL_V2 is enabled. The algorithm for determining the type of the result is described in the JIRA. DECIMAL V1: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,38) | ++ DECIMAL V2: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,6) | ++ This patch required backend changes. We implement an algorithm where we handle the whole and fractional parts separately, and then combine them to get the final result. This is more complex and slower. We try to avoid this by first checking if the result would fit into int128. Testing: - Added expr tests. - Tested locally on my machine with a script that generates random decimal numbers and checks that Impala adds them correctly. Performance: For the common case, performance remains the same. select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1) BEFORE: 4.74s AFTER: 4.73s In this case, we check if it is necessary to do the complex addition, and it turns out to be not necessary. We see a slowdown because the result needs to be scaled down by dividing. select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19) BEFORE: 1.63s AFTER: 13.57s In following case, we take the most complex path and see the most signification perfmance hit. select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37)) BEFORE: 1.63s AFTER: 20.57 Change-Id: I401049c56d910eb1546a178c909c923b01239336 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.h M be/src/runtime/decimal-value.inline.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 5 files changed, 392 insertions(+), 87 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/4 -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 4 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@179 PS3, Line 179: int128_t& x_left, int128_t& x_right, int128_t& y_left, int128_t& y_right) { > Use pointers for output arguments: https://google.github.io/styleguide/cppg Done http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@214 PS3, Line 214: right_overflow > carry_to_left might be a better name? Since the intent is to carry this int Agreed, changed. http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@243 PS3, Line 243: inline int128_t SubtractLarge(int128_t x, int x_scale, int128_t y, int y_scale, > What do you think about make it so that this always computes x - y, and mak The way it is currently implemented, I don't see an advantage in doing that. We do not use the subtraction operation in this function, we still add x and y together. Are you suggesting to rewrite it, so that x and y are both expected to be positive and we will be doing x - y here? It's not quite clear to me which case would be eliminated if we rewrite it that way. (Both cases on line 264 would still need to be present) http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305 PS2, Line 305: RESULT_T y = 0; > Ah ok. I wonder if renaming the variable would make it clearer. Something l Ranamed to result_scale_decrease http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310 PS2, Line 310: } > Thanks. I still think something is missing in the AdjustToSameScale() name Modified the comment of AdjustToSameScale() slightly. I was debating changing the name to AdjustToMaxScale(), but that might be confusing (because max scale is 38)). -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 07 Nov 2017 01:43:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. Patch Set 3: (5 comments) I'm gradually understanding this better. Had some further suggestions to make it easier to follow but I think I'm getting close. http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@179 PS3, Line 179: int128_t& x_left, int128_t& x_right, int128_t& y_left, int128_t& y_right) { Use pointers for output arguments: https://google.github.io/styleguide/cppguide.html#Reference_Arguments http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@214 PS3, Line 214: right_overflow carry_to_left might be a better name? Since the intent is to carry this into the left. http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@243 PS3, Line 243: inline int128_t SubtractLarge(int128_t x, int x_scale, int128_t y, int y_scale, What do you think about make it so that this always computes x - y, and making the caller responsible for switching the arguments? I think it would be easier to follow because there's one less case to handle. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305 PS2, Line 305: RESULT_T y = 0; > Delta scale is different in AdjustToSameScale(). There are two delta scales Ah ok. I wonder if renaming the variable would make it clearer. Something like result_scale_delta to make it clearer that it's between the result scale and something else? Or result_scale_decrease to indicate that we need to scale down the result by this amount. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310 PS2, Line 310: } > Added comment. Thanks. I still think something is missing in the AdjustToSameScale() name or comment, since I couldn't tell that it was intended to adjust them to the max scale without reading the code. Maybe it just needs a comment explaining how the output scale is computed. -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Mon, 30 Oct 2017 21:12:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. IMPALA-5019: Decimal V2 addition In this patch, we implement the new decimal return type rules for addition expressions. These rules become active when the query option DECIMAL_V2 is enabled. The algorithm for determining the type of the result is described in the JIRA. DECIMAL V1: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,38) | ++ DECIMAL V2: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,6) | ++ This patch required backend changes. We implement an algorithm where we handle the whole and fractional parts separately, and then combine them to get the final result. This is more complex and slower. We try to avoid this by first checking if the result would fit into int128. Testing: - Added expr tests. - Tested locally on my machine with a script that generates random decimal numbers and checks that Impala adds them correctly. Performance: For the common case, performance remains the same. select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1) BEFORE: 4.74s AFTER: 4.73s In this case, we check if it is necessary to do the complex addition, and it turns out to be not necessary. We see a slowdown because the result needs to be scaled down by dividing. select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19) BEFORE: 1.63s AFTER: 13.57s In following case, we take the most complex path and see the most signification perfmance hit. select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37)) BEFORE: 1.63s AFTER: 20.57 Change-Id: I401049c56d910eb1546a178c909c923b01239336 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 4 files changed, 387 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/3 -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 3 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. Patch Set 2: (12 comments) http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG@31 PS1, Line 31: to avoid this by first checking if the result would into int128. > fit? Done http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@166 PS2, Line 166: LeadingZeroAdjustment > I think the name could make it clearer what it's computing. Maybe something Done. I like the new name. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@170 PS2, Line 170: floor(log2(b)) > The table values are actually floor(log2(10^b)) for i = 0, 1, 2, 3... etc, Yes, exactly. Fixed the comment. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@181 PS2, Line 181: same_sign > We usually make numeric/bool template arguments upper case to make it clear Separated this large function into 3 smaller ones, as you suggested. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@185 PS2, Line 185: // This is expected to be handled by the caller. > What about the case when they're zero? That's also meant to be handled by t It wasn't possible for this condition to fail because at least one must be greater than zero for the following reasons: - If they are both zero, we will not be calling this function, because of the leading zero test. - If one is negative and the other is zero, we will invert it on line 327 I modified this check in the new implementation. Yes we do have cases, where we add zero. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@272 PS2, Line 272: DCHECK_EQ(result_scale, std::max(this_scale, other_scale)); > A brief comment that this is guaranteed by the frontend migh help people ne Done http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@277 PS2, Line 277: DCHECK(!*overflow) << "Cannot overflow unless result is Decimal16Value"; > extra indent here Done http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@287 PS2, Line 287: if (this_scale < other_scale) { > It might be slightly easier to follow if the branches were(delta_scale < 0) Rewrote the code to make it more clear. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@299 PS2, Line 299: into > long line. Done http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305 PS2, Line 305: bool ovf = AdjustToSameScale(*this, this_scale, other, other_scale, > It feels a bit weird that we're calculating delta_scale above and inside Ad Delta scale is different in AdjustToSameScale(). There are two delta scales in this patch: between x and y and between max(x_scale, y_scale) and result_scale. What we are doing here and in AddLarge() is first adjust to the same scale, do the addition, then scale down to result_scale. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310 PS2, Line 310: if (delta_scale > 0) { > This might need a brief comment to explain why it's necessary (or perhaps t Added comment. http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java: http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@228 PS2, Line 228:* TODO: implement V2 rules for ADD/SUB. > TODO not needed. Done -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Sat, 28 Oct 2017 00:02:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. Patch Set 2: (12 comments) I spent some time getting to understand the code. Still don't 100% comprehend it but getting closer. http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG@31 PS1, Line 31: to avoid this by first checking if the result would into int128. fit? http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@166 PS2, Line 166: LeadingZeroAdjustment I think the name could make it clearer what it's computing. Maybe something like MinLeadingZerosAfterScaling() http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@170 PS2, Line 170: floor(log2(b)) The table values are actually floor(log2(10^b)) for i = 0, 1, 2, 3... etc, right? I found this notation a bit confusing. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@181 PS2, Line 181: same_sign We usually make numeric/bool template arguments upper case to make it clear they're constants. I'm not sure that templates are really necessary here though. It seems like we could just factor out the shared parts into helper functions and have different functions implementing the same sign/different sign algorithms. E.g. SeparateFractional(x, y, x_scale, y_scale, &x_left, &x_right, &y_left, &y_right); int max_scale = std::max(x_scale, y_scale); int delta_scale = max_scale - result_scale; ... do the same/different sign algorithm ... ... http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@185 PS2, Line 185: // This is expected to be handled by the caller. What about the case when they're zero? That's also meant to be handled by the caller, right? Do we test cases with multiplying by zero? http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@272 PS2, Line 272: DCHECK_EQ(result_scale, std::max(this_scale, other_scale)); A brief comment that this is guaranteed by the frontend migh help people new to the code. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@277 PS2, Line 277: DCHECK(!*overflow) << "Cannot overflow unless result is Decimal16Value"; extra indent here http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@287 PS2, Line 287: if (this_scale < other_scale) { It might be slightly easier to follow if the branches were(delta_scale < 0) and (delta_scale > 0) - it's less obvious this way that it's forcing delta_scale to be positive. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@299 PS2, Line 299: into long line. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305 PS2, Line 305: bool ovf = AdjustToSameScale(*this, this_scale, other, other_scale, It feels a bit weird that we're calculating delta_scale above and inside AdjustToSameScale() - it seems like the delta_scale logic in this function and AdjustToSameScale() are tightly coupled - AdjustToSameScale() doesn't really abstract it away if the caller has to know the details to correctly adjust the output result. I'm not sure if there's a better way to factor it. http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310 PS2, Line 310: if (delta_scale > 0) { This might need a brief comment to explain why it's necessary (or perhaps there's some way we can improve how AdjustToSameScale() is factored so that it's clearer what the expected result scale of that is). http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java: http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@228 PS2, Line 228:* TODO: implement V2 rules for ADD/SUB. TODO not needed. -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Thu, 26 Oct 2017 00:18:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. IMPALA-5019: Decimal V2 addition In this patch, we implement the new decimal return type rules for addition expressions. These rules become active when the query option DECIMAL_V2 is enabled. The algorithm for determining the type of the result is described in the JIRA. DECIMAL V1: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,38) | ++ DECIMAL V2: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,6) | ++ This patch required backend changes. We implement an algorithm where we handle the whole and fractional parts separately, and then combine them to get the final result. This is more complex and slower. We try to avoid this by first checking if the result would into int128. Testing: - Added expr tests. - Tested locally on my machine with a script that generates random decimal numbers and checks that Impala adds them correctly. Performance: For the common case, performance remains the same. select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1) BEFORE: 4.74s AFTER: 4.73s In this case, we check if it is necessary to do the complex addition, and it turns out to be not necessary. We see a slowdown because the result needs to be scaled down by dividing. select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19) BEFORE: 1.63s AFTER: 13.57s In following case, we take the most complex path and see the most signification perfmance hit. select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37)) BEFORE: 1.63s AFTER: 20.57 Change-Id: I401049c56d910eb1546a178c909c923b01239336 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java 4 files changed, 297 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/2 -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8309 ) Change subject: IMPALA-5019: Decimal V2 addition .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8309/1/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java: http://gerrit.cloudera.org:8080/#/c/8309/1/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@268 PS1, Line 268: // Not yet implemented - fall back to V1 rules. I think this can now be removed now. -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 18 Oct 2017 02:30:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition
Taras Bobrovytsky has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8309 Change subject: IMPALA-5019: Decimal V2 addition .. IMPALA-5019: Decimal V2 addition In this patch, we implement the new decimal return type rules for addition expressions. These rules become active when the query option DECIMAL_V2 is enabled. The algorithm for determining the type of the result is described in the JIRA. DECIMAL V1: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,38) | ++ DECIMAL V2: ++ | typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) | ++ | DECIMAL(38,6) | ++ This patch required backend changes. We implement an algorithm where we handle the whole and fractional parts separately, and then combine them to get the final result. This is more complex and slower. We try to avoid this by first checking if the result would into int128. Testing: - Added expr tests. - Tested locally on my machine with a script that generates random decimal numbers and checks that Impala adds them correctly. Performance: For the common case, performance remains the same. select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1) BEFORE: 4.74s AFTER: 4.73s In this case, we check if it is necessary to do the complex addition, and it turns out to be not necessary. We see a slowdown because the result needs to be scaled down by dividing. select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19) BEFORE: 1.63s AFTER: 13.57s In following case, we take the most complex path and see the most signification perfmance hit. select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37)) BEFORE: 1.63s AFTER: 20.57 Change-Id: I401049c56d910eb1546a178c909c923b01239336 --- M be/src/exprs/expr-test.cc M be/src/runtime/decimal-value.inline.h M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java 3 files changed, 282 insertions(+), 69 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/1 -- To view, visit http://gerrit.cloudera.org:8080/8309 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336 Gerrit-Change-Number: 8309 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky