[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-11-06 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,38) |
++

DECIMAL V2:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,6)  |
++

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would fit into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.h
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
5 files changed, 392 insertions(+), 87 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/4
--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-11-06 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@179
PS3, Line 179: int128_t& x_left, int128_t& x_right, int128_t& y_left, 
int128_t& y_right) {
> Use pointers for output arguments: https://google.github.io/styleguide/cppg
Done


http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@214
PS3, Line 214: right_overflow
> carry_to_left might be a better name? Since the intent is to carry this int
Agreed, changed.


http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@243
PS3, Line 243: inline int128_t SubtractLarge(int128_t x, int x_scale, int128_t 
y, int y_scale,
> What do you think about make it so that this always computes x - y, and mak
The way it is currently implemented, I don't see an advantage in doing that. We 
do not use the subtraction operation in this function, we still add x and y 
together.

Are you suggesting to rewrite it, so that x and y are both expected to be 
positive and we will be doing x - y here? It's not quite clear to me which case 
would be eliminated if we rewrite it that way. (Both cases on line 264 would 
still need to be present)


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305
PS2, Line 305: RESULT_T y = 0;
> Ah ok. I wonder if renaming the variable would make it clearer. Something l
Ranamed to result_scale_decrease


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310:   }
> Thanks. I still think something is missing in the AdjustToSameScale() name
Modified the comment of AdjustToSameScale() slightly. I was  debating changing 
the name to AdjustToMaxScale(), but that might be confusing (because max scale 
is 38)).



--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 07 Nov 2017 01:43:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-10-30 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..


Patch Set 3:

(5 comments)

I'm gradually understanding this better. Had some further suggestions to make 
it easier to follow but I think I'm getting close.

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@179
PS3, Line 179: int128_t& x_left, int128_t& x_right, int128_t& y_left, 
int128_t& y_right) {
Use pointers for output arguments: 
https://google.github.io/styleguide/cppguide.html#Reference_Arguments


http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@214
PS3, Line 214: right_overflow
carry_to_left might be a better name? Since the intent is to carry this into 
the left.


http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@243
PS3, Line 243: inline int128_t SubtractLarge(int128_t x, int x_scale, int128_t 
y, int y_scale,
What do you think about make it so that this always computes x - y, and making 
the caller responsible for switching the arguments? I think it would be easier 
to follow because there's one less case to handle.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305
PS2, Line 305: RESULT_T y = 0;
> Delta scale is different in AdjustToSameScale(). There are two delta scales
Ah ok. I wonder if renaming the variable would make it clearer. Something like 
result_scale_delta to make it clearer that it's between the result scale and 
something else? Or result_scale_decrease to indicate that we need to scale down 
the result by this amount.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310:   }
> Added comment.
Thanks. I still think something is missing in the AdjustToSameScale() name or 
comment, since I couldn't tell that it was intended to adjust them to the max 
scale without reading the code. Maybe it just needs a comment explaining how 
the output scale is computed.



--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Mon, 30 Oct 2017 21:12:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-10-27 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,38) |
++

DECIMAL V2:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,6)  |
++

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would fit into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
4 files changed, 387 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/3
--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-10-27 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG@31
PS1, Line 31: to avoid this by first checking if the result would into int128.
> fit?
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@166
PS2, Line 166: LeadingZeroAdjustment
> I think the name could make it clearer what it's computing. Maybe something
Done. I like the new name.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@170
PS2, Line 170: floor(log2(b))
> The table values are actually floor(log2(10^b)) for i = 0, 1, 2, 3... etc,
Yes, exactly. Fixed the comment.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@181
PS2, Line 181: same_sign
> We usually make numeric/bool template arguments upper case to make it clear
Separated this large function into 3 smaller ones, as you suggested.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@185
PS2, Line 185:   // This is expected to be handled by the caller.
> What about the case when they're zero? That's also meant to be handled by t
It wasn't possible for this condition to fail because at least one must be 
greater than zero for the following reasons:
- If they are both zero, we will not be calling this function, because of the 
leading zero test.
- If one is negative and the other is zero, we will invert it on line 327

I modified this check in the new implementation.

Yes we do have cases, where we add zero.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@272
PS2, Line 272: DCHECK_EQ(result_scale, std::max(this_scale, other_scale));
> A brief comment that this is guaranteed by the frontend migh help people ne
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@277
PS2, Line 277:   DCHECK(!*overflow) << "Cannot overflow unless result is 
Decimal16Value";
> extra indent here
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@287
PS2, Line 287:   if (this_scale < other_scale) {
> It might be slightly easier to follow if the branches were(delta_scale < 0)
Rewrote the code to make it more clear.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@299
PS2, Line 299: into
> long line.
Done


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305
PS2, Line 305: bool ovf = AdjustToSameScale(*this, this_scale, other, 
other_scale,
> It feels a bit weird that we're calculating delta_scale above and inside Ad
Delta scale is different in AdjustToSameScale(). There are two delta scales in 
this patch: between x and y and between max(x_scale, y_scale) and result_scale.

What we are doing here and in AddLarge() is first adjust to the same scale, do 
the addition, then scale down to result_scale.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310: if (delta_scale > 0) {
> This might need a brief comment to explain why it's necessary (or perhaps t
Added comment.


http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java:

http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@228
PS2, Line 228:* TODO: implement V2 rules for ADD/SUB.
> TODO not needed.
Done



--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Sat, 28 Oct 2017 00:02:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-10-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..


Patch Set 2:

(12 comments)

I spent some time getting to understand the code. Still don't 100% comprehend 
it but getting closer.

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8309/1//COMMIT_MSG@31
PS1, Line 31: to avoid this by first checking if the result would into int128.
fit?


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@166
PS2, Line 166: LeadingZeroAdjustment
I think the name could make it clearer what it's computing. Maybe something 
like MinLeadingZerosAfterScaling()


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@170
PS2, Line 170: floor(log2(b))
The table values are actually floor(log2(10^b)) for i = 0, 1, 2, 3... etc, 
right? I found this notation a bit confusing.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@181
PS2, Line 181: same_sign
We usually make numeric/bool template arguments upper case to make it clear 
they're constants.

I'm not sure that templates are really necessary here though. It seems like we 
could just factor out the shared parts into helper functions and have different 
functions implementing the same sign/different sign algorithms. E.g.

  SeparateFractional(x, y, x_scale, y_scale, _left, _right, _left, 
_right);
  int max_scale = std::max(x_scale, y_scale);
  int delta_scale = max_scale - result_scale;
  ... do the same/different sign algorithm ...
 ...


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@185
PS2, Line 185:   // This is expected to be handled by the caller.
What about the case when they're zero? That's also meant to be handled by the 
caller, right? Do we test cases with multiplying by zero?


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@272
PS2, Line 272: DCHECK_EQ(result_scale, std::max(this_scale, other_scale));
A brief comment that this is guaranteed by the frontend migh help people new to 
the code.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@277
PS2, Line 277:   DCHECK(!*overflow) << "Cannot overflow unless result is 
Decimal16Value";
extra indent here


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@287
PS2, Line 287:   if (this_scale < other_scale) {
It might be slightly easier to follow if the branches were(delta_scale < 0) and 
(delta_scale > 0) - it's less obvious this way that it's forcing delta_scale to 
be positive.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@299
PS2, Line 299: into
long line.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@305
PS2, Line 305: bool ovf = AdjustToSameScale(*this, this_scale, other, 
other_scale,
It feels a bit weird that we're calculating delta_scale above and inside 
AdjustToSameScale() - it seems like the delta_scale logic in this function and 
AdjustToSameScale() are tightly coupled - AdjustToSameScale() doesn't really 
abstract it away if the caller has to know the details to correctly adjust the 
output result. I'm not sure if there's a better way to factor it.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310: if (delta_scale > 0) {
This might need a brief comment to explain why it's necessary (or perhaps 
there's some way we can improve how AdjustToSameScale() is factored so that 
it's clearer what the expected result scale of that is).


http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java:

http://gerrit.cloudera.org:8080/#/c/8309/2/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@228
PS2, Line 228:* TODO: implement V2 rules for ADD/SUB.
TODO not needed.



--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 26 Oct 2017 00:18:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-10-18 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,38) |
++

DECIMAL V2:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,6)  |
++

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
4 files changed, 297 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/2
--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-10-17 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8309/1/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java:

http://gerrit.cloudera.org:8080/#/c/8309/1/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java@268
PS1, Line 268: // Not yet implemented - fall back to V1 rules.
I think this can now be removed now.



--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 18 Oct 2017 02:30:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-10-17 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8309


Change subject: IMPALA-5019: Decimal V2 addition
..

IMPALA-5019: Decimal V2 addition

In this patch, we implement the new decimal return type rules for
addition expressions. These rules become active when the query option
DECIMAL_V2 is enabled. The algorithm for determining the type of the
result is described in the JIRA.

DECIMAL V1:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,38) |
++

DECIMAL V2:
++
| typeof(cast(1 as decimal(38,0)) + cast(0.1 as decimal(38,38))) |
++
| DECIMAL(38,6)  |
++

This patch required backend changes. We implement an algorithm where
we handle the whole and fractional parts separately, and then combine
them to get the final result. This is more complex and slower. We try
to avoid this by first checking if the result would into int128.

Testing:
- Added expr tests.
- Tested locally on my machine with a script that generates random
  decimal numbers and checks that Impala adds them correctly.

Performance:

For the common case, performance remains the same.
  select cast(2.2 as decimal(18, 1) + cast(2.2 as decimal(18, 1)

  BEFORE: 4.74s
  AFTER:  4.73s

In this case, we check if it is necessary to do the complex addition,
and it turns out to be not necessary. We see a slowdown because the
result needs to be scaled down by dividing.
  select cast(2.2 as decimal(38, 19) + cast(2.2 as decimal(38, 19)

  BEFORE: 1.63s
  AFTER:  13.57s

In following case, we take the most complex path and see the most
signification perfmance hit.
  select cast(7.5 as decimal(38,37)) + cast(2.2 as decimal(38,37))

  BEFORE: 1.63s
  AFTER: 20.57

Change-Id: I401049c56d910eb1546a178c909c923b01239336
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
3 files changed, 282 insertions(+), 69 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/8309/1
--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky