[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
Taras Bobrovytsky has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8404 ) Change subject: IMPALA-5017: Error on decimal overflow .. IMPALA-5017: Error on decimal overflow Before this patch, decimal operations would either silently overflow (in the case of sum() and avg()), or produce a warning. In this patch, the behaviour is changed so that an error is produced in the case of overflow when DECIMAL_v2 is enabled. Decimal v1 behaviour is unchanged. We introduce overflow checks when computing sum() and avg(). This results in a ~30% performance regression when we are in decimal v2 mode compared to decimal v1. Benchmarks: Query: select sum(dec_38_19) from decimal_tbl Decimal v1: 13.09s Decimal v2: 17.10s Query: select avg(dec_38_19) from decimal_tbl Decimal v1: 13.60s Decimal v2: 19.10s Query: select avg(dec_9_5) from decimal_tbl Decimal v1: 12.06s Decimal v2: 18.07s Testing: - Added several end to end tests. - Updated Expr tests to check for error in case of overflow. Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr-test.cc M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test M testdata/workloads/functional-query/queries/QueryTest/decimal.test 6 files changed, 214 insertions(+), 69 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8404/2 -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 2 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8404 ) Change subject: IMPALA-5017: Error on decimal overflow .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12 PS1, Line 12: In this patch, the beharior is changed so that an error is produced in > I'll also accept behaviour ;) Done http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@27 PS1, Line 27: select avg(dec_38_19) from decimal_tbl > I guess these regressions occur regardless of the width of the input decima Correct. Added another benchmark to illustrate this. http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@413 PS1, Line 413: abs(avg->sum_val16) > DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val4))) { > This isn't correct if sum_val16 and src.val* have opposite sign, is it? That's true. Fixed and added a test case. http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@420 PS1, Line 420: abs(avg->sum_val16) > DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val8))) { > What do you think about only checking for overflow of the integer type at t 1. Are you suggesting to check for something like abs(avg->sum_val16) > 2**128 - abs(src.val8))) ? I don't really see why it is faster to check against 2**128 instead of against MAX_UNSCALED_DECIMAL. 2. What do you mean by initial check? Where would this check be done? Something like (decimal_v2 && sum_val16 > 2**32 && abs(avg->sum_val16) > MAX_UNSCALED_DECIMAL - abs(src.val8)))? -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 1 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 07 Nov 2017 21:59:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8404 ) Change subject: IMPALA-5017: Error on decimal overflow .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12 PS1, Line 12: In this patch, the beharior is changed so that an error is produced in > typo: beharior should be behavior I'll also accept behaviour ;) http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@27 PS1, Line 27: select avg(dec_38_19) from decimal_tbl I guess these regressions occur regardless of the width of the input decimal, rigth? http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@413 PS1, Line 413: abs(avg->sum_val16) > DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val4))) { This isn't correct if sum_val16 and src.val* have opposite sign, is it? http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@420 PS1, Line 420: abs(avg->sum_val16) > DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val8))) { What do you think about only checking for overflow of the integer type at this point and checking for whether it fits in the decimal type during finalisation? Might be cheaper. We could also probably do a cheaper initial check of sum_val16 versus as constant in the 4 and 8 byte cases since they can't possible overflow unless the sum is already quite large. -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 1 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 03 Nov 2017 23:21:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
John Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/8404 ) Change subject: IMPALA-5017: Error on decimal overflow .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12 PS1, Line 12: In this patch, the beharior is changed so that an error is produced in typo: beharior should be behavior -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 1 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 27 Oct 2017 19:16:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8404 ) Change subject: IMPALA-5017: Error on decimal overflow .. Patch Set 1: Ooops, canceled -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 1 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 27 Oct 2017 03:32:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8404 ) Change subject: IMPALA-5017: Error on decimal overflow .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1402/ -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 1 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Fri, 27 Oct 2017 03:32:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
Taras Bobrovytsky has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8404 Change subject: IMPALA-5017: Error on decimal overflow .. IMPALA-5017: Error on decimal overflow Before this patch, decimal operations would either silently overflow (in the case of sum() and avg()), or produce a warning. In this patch, the beharior is changed so that an error is produced in the case of overflow when DECIMAL_v2 is enabled. Decimal v1 behavior is unchanged. We introduce overflow checks when computing sum() and avg(). This results in a ~30% performance regression when we are in decimal v2 mode compared to decimal v1. Benchmarks: Query: select sum(dec_38_19) from decimal_tbl Decimal v1: 13.09s Decimal v2: 17.10s Query: select avg(dec_38_19) from decimal_tbl Decimal v1: 13.60s Decimal v2: 19.10s Testing: - Added several end to end tests. - Updated Expr tests to check for error in case of overflow. Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr-test.cc M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test M testdata/workloads/functional-query/queries/QueryTest/decimal.test 6 files changed, 186 insertions(+), 69 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8404/1 -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky