[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-11-07 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..

IMPALA-5017: Error on decimal overflow

Before this patch, decimal operations would either silently overflow (in
the case of sum() and avg()), or produce a warning.

In this patch, the behaviour is changed so that an error is produced in
the case of overflow when DECIMAL_v2 is enabled. Decimal v1 behaviour is
unchanged.

We introduce overflow checks when computing sum() and avg(). This
results in a ~30% performance regression when we are in decimal v2 mode
compared to decimal v1.

Benchmarks:
  Query:
  select sum(dec_38_19) from decimal_tbl
Decimal v1: 13.09s
Decimal v2: 17.10s

  Query:
  select avg(dec_38_19) from decimal_tbl
Decimal v1: 13.60s
Decimal v2: 19.10s

  Query:
  select avg(dec_9_5) from decimal_tbl
Decimal v1: 12.06s
Decimal v2: 18.07s

Testing:
- Added several end to end tests.
- Updated Expr tests to check for error in case of overflow.

Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr-test.cc
M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test
M testdata/workloads/functional-query/queries/QueryTest/decimal.test
6 files changed, 214 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8404/2
--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-11-07 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12
PS1, Line 12: In this patch, the beharior is changed so that an error is 
produced in
> I'll also accept behaviour ;)
Done


http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@27
PS1, Line 27:   select avg(dec_38_19) from decimal_tbl
> I guess these regressions occur regardless of the width of the input decima
Correct. Added another benchmark to illustrate this.


http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@413
PS1, Line 413:   abs(avg->sum_val16) > 
DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val4))) {
> This isn't correct if sum_val16 and src.val* have opposite sign, is it?
That's true. Fixed and added a test case.


http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@420
PS1, Line 420:   abs(avg->sum_val16) > 
DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val8))) {
> What do you think about only checking for overflow of the integer type at t
1. Are you suggesting to check for something like abs(avg->sum_val16) > 2**128 
- abs(src.val8))) ?
I don't really see why it is faster to check against 2**128 instead of against 
MAX_UNSCALED_DECIMAL.

2. What do you mean by initial check? Where would this check be done? Something 
like (decimal_v2 && sum_val16 > 2**32 && abs(avg->sum_val16) > 
MAX_UNSCALED_DECIMAL - abs(src.val8)))?



--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 07 Nov 2017 21:59:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12
PS1, Line 12: In this patch, the beharior is changed so that an error is 
produced in
> typo: beharior should be behavior
I'll also accept behaviour ;)


http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@27
PS1, Line 27:   select avg(dec_38_19) from decimal_tbl
I guess these regressions occur regardless of the width of the input decimal, 
rigth?


http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@413
PS1, Line 413:   abs(avg->sum_val16) > 
DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val4))) {
This isn't correct if sum_val16 and src.val* have opposite sign, is it?


http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@420
PS1, Line 420:   abs(avg->sum_val16) > 
DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val8))) {
What do you think about only checking for overflow of the integer type at this 
point and checking for whether it fits in the decimal type during finalisation? 
Might be cheaper.

We could also probably do a cheaper initial check of sum_val16 versus as 
constant in the 4 and 8 byte cases since they can't possible overflow unless 
the sum is already quite large.



--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 03 Nov 2017 23:21:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-10-27 Thread John Sherman (Code Review)
John Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12
PS1, Line 12: In this patch, the beharior is changed so that an error is 
produced in
typo: beharior should be behavior



--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 27 Oct 2017 19:16:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-10-26 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..


Patch Set 1:

Ooops, canceled


--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 27 Oct 2017 03:32:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-10-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1402/


--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 27 Oct 2017 03:32:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-10-26 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8404


Change subject: IMPALA-5017: Error on decimal overflow
..

IMPALA-5017: Error on decimal overflow

Before this patch, decimal operations would either silently overflow (in
the case of sum() and avg()), or produce a warning.

In this patch, the beharior is changed so that an error is produced in
the case of overflow when DECIMAL_v2 is enabled. Decimal v1 behavior is
unchanged.

We introduce overflow checks when computing sum() and avg(). This
results in a ~30% performance regression when we are in decimal v2 mode
compared to decimal v1.

Benchmarks:
  Query:
  select sum(dec_38_19) from decimal_tbl
Decimal v1: 13.09s
Decimal v2: 17.10s

  Query:
  select avg(dec_38_19) from decimal_tbl
Decimal v1: 13.60s
Decimal v2: 19.10s

Testing:
- Added several end to end tests.
- Updated Expr tests to check for error in case of overflow.

Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr-test.cc
M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test
M testdata/workloads/functional-query/queries/QueryTest/decimal.test
6 files changed, 186 insertions(+), 69 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8404/1
--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky