[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-05 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java:

http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@1472
PS13, Line 1472: Type.DOUBLE, ScalarType.createDecimalType(2, 1));
> I think the decimal output type parameters are orthogonal to the purpose of
Yeah, I didn't look at which test case we're in here. I think 
TestDecimalArithmetic() could use some additional cases.



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 13
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 05 Oct 2017 23:20:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java:

http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@1472
PS13, Line 1472: Type.DOUBLE, ScalarType.createDecimalType(2, 1));
> actually, are there any other cases that we should add here  now that we've
I think the decimal output type parameters are orthogonal to the purpose of 
this specific test, which is testing whether the output is resolved to 
*INT/DECIMAL/DOUBLE with various combinations of input types and 
literals/non-literals. I don't think Taras' change would have affected that.



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 13
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 04 Oct 2017 19:02:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java:

http://gerrit.cloudera.org:8080/#/c/7438/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@1472
PS13, Line 1472: Type.DOUBLE, ScalarType.createDecimalType(2, 1));
actually, are there any other cases that we should add here  now that we've 
changed decimal_v2 multiplication rule?



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 13
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 04 Oct 2017 16:50:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
a 256 bit integer intermediate value, which we then attempt to scale down
and round.

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clealy seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s

I noticed performance regressions in the following cases:
- When we need to convert to a 256 bit integer before multiplying,
  which was introduced in this patch. Whether this happens depends on
  the resulting precision and the value of the inputs. In the following
  extreme case, the intermediate value is converted to a 256 bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s

These regressions are possible only when the resulting precision is 38
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with DECIMAL_v2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Reviewed-on: http://gerrit.cloudera.org:8080/7438
Reviewed-by: Taras Bobrovytsky 
Tested-by: Impala Public Jenkins
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
5 files changed, 333 insertions(+), 62 deletions(-)

Approvals:
  Taras Bobrovytsky: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 13
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 

[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 12
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 04 Oct 2017 09:37:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1297/


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 12
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 04 Oct 2017 05:27:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 12: Code-Review+2

Fixed the uninitialized variable bug in expr-test.cc. Forwarding the +2. Thanks 
Jim!


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 12
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 04 Oct 2017 05:26:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
a 256 bit integer intermediate value, which we then attempt to scale down
and round.

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clealy seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s

I noticed performance regressions in the following cases:
- When we need to convert to a 256 bit integer before multiplying,
  which was introduced in this patch. Whether this happens depends on
  the resulting precision and the value of the inputs. In the following
  extreme case, the intermediate value is converted to a 256 bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s

These regressions are possible only when the resulting precision is 38
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with DECIMAL_v2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
5 files changed, 333 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/12
--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 12
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7438/11/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/7438/11/be/src/exprs/expr-test.cc@1544
PS11, Line 1544: DCHECK(false) << "Unexpected character.";
This is causing a clang-tidy error in your last GVD:

See the last lines of 
https://jenkins.impala.io/job/gerrit-verify-dryrun/1292/console

ERROR: There were some failed jobs: 
[https://jenkins.impala.io/job/clang-tidy/1606/, 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/374/]

Then the last lines of the URL it points to, 
https://jenkins.impala.io/job/clang-tidy/1606/console:

/home/ubuntu/Impala/be/src/exprs/expr-test.cc:1543:7: warning: variable 
'digit' is used uninitialized whenever switch default is taken 
[clang-diagnostic-sometimes-uninitialized]



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 11
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 04 Oct 2017 03:05:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1295/


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 11
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 04 Oct 2017 02:50:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 11: Code-Review+2

Updated FE tests. Forwarding the +2.


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 11
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 04 Oct 2017 02:49:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
a 256 bit integer intermediate value, which we then attempt to scale down
and round.

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clealy seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s

I noticed performance regressions in the following cases:
- When we need to convert to a 256 bit integer before multiplying,
  which was introduced in this patch. Whether this happens depends on
  the resulting precision and the value of the inputs. In the following
  extreme case, the intermediate value is converted to a 256 bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s

These regressions are possible only when the resulting precision is 38
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with DECIMAL_v2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
5 files changed, 333 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/11
--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 11
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 10: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1292/


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 10
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 04 Oct 2017 01:59:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1292/


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 10
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 03 Oct 2017 21:57:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 10: Code-Review+2

Rebased, forwarding the +2


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 10
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 03 Oct 2017 21:51:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
a 256 bit integer intermediate value, which we then attempt to scale down
and round.

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clealy seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s

I noticed performance regressions in the following cases:
- When we need to convert to a 256 bit integer before multiplying,
  which was introduced in this patch. Whether this happens depends on
  the resulting precision and the value of the inputs. In the following
  extreme case, the intermediate value is converted to a 256 bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s

These regressions are possible only when the resulting precision is 38
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with DECIMAL_v2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 330 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/10
--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 10
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
a 256 bit integer intermediate value, which we then attempt to scale down
and round.

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clealy seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s

I noticed performance regressions in the following cases:
- When we need to convert to a 256 bit integer before multiplying,
  which was introduced in this patch. Whether this happens depends on
  the resulting precision and the value of the inputs. In the following
  extreme case, the intermediate value is converted to a 256 bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s

These regressions are possible only when the resulting precision is 38
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with DECIMAL_v2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 330 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/9
--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 9
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 8:

Did you mean to upload a new diff?


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 8
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 03 Oct 2017 21:39:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 8:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1672
PS8, Line 1672: not require int256.
> if result does not require int256, why does it overflow (return null)?
Added comment.


http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1675
PS8, Line 1675: { true, 0, 38, 0 }}},
> when both V1 and V2 give the same answer, you don't have to include both re
Done


http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1680
PS8, Line 1680:   { "cast(18446744073709551615 as decimal(38,0)) * 
cast(9223372036854775808 as decimal(38,0))",
> It might be helpful if you mentioned how these values where computed in the
Added comment.


http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1702
PS8, Line 1702: 85070591730234615865843651857942052864
> how did you verify these results?
I used Python to make sure I get the same results.



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 8
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 03 Oct 2017 21:26:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1680
PS8, Line 1680:   { "cast(18446744073709551615 as decimal(38,0)) * 
cast(9223372036854775808 as decimal(38,0))",
It might be helpful if you mentioned how these values where computed in the 
comments, e.g. 2 ^ 64 - 1



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 8
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 03 Oct 2017 17:56:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-03 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 8: Code-Review+2

(3 comments)

Changes seem okay to me, but might be good to have Tim also take a look.

http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1672
PS8, Line 1672: not require int256.
if result does not require int256, why does it overflow (return null)?


http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1675
PS8, Line 1675: { true, 0, 38, 0 }}},
when both V1 and V2 give the same answer, you don't have to include both results


http://gerrit.cloudera.org:8080/#/c/7438/8/be/src/exprs/expr-test.cc@1702
PS8, Line 1702: 85070591730234615865843651857942052864
how did you verify these results?



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 8
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 03 Oct 2017 17:11:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-02 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 8:

I made some additional changes in patch 8 as a result of thinking about your 
comment. Dan, can you take a look one more time?


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 8
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 03 Oct 2017 02:55:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-02 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
a 256 bit integer intermediate value, which we then attempt to scale down
and round.

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clealy seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s

I noticed performance regressions in the following cases:
- When we need to convert to a 256 bit integer before multiplying,
  which was introduced in this patch. Whether this happens depends on
  the resulting precision and the value of the inputs. In the following
  extreme case, the intermediate value is converted to a 256 bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s

These regressions are possible only when the resulting precision is 38
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with DECIMAL_v2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 332 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/8
--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 8
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-02 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h@249
PS7, Line 249:   } else {
> maybe a quick comment:
Yes, your understanding is correct. Added comment.



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 7
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 03 Oct 2017 02:53:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-02 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 7: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG@67
PS6, Line 67:   after:  126.17s
> No, not always. Look at line 73 in this file.
Yeah, but that's "just" 2x, as opposed to 10x.


http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/7438/7/be/src/runtime/decimal-value.inline.h@249
PS7, Line 249:   } else {
maybe a quick comment:
// We've verified the intermediate value will fit in int128.

is my understanding correct?



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 7
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Mon, 02 Oct 2017 22:29:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-10-02 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
a 256 bit integer intermediate value, which we then attempt to scale down
and round.

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clealy seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s

I noticed performance regressions in the following cases:
- When we need to convert to a 256 bit integer before multiplying,
  which was introduced in this patch. Whether this happens depends on
  the resulting precision and the value of the inputs. In the following
  extreme case, the intermediate value is converted to a 256 bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s

These regressions are possible only when the resulting precision is 38
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with DECIMAL_v2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 292 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/7
--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 7
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-29 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 6:

(7 comments)

mostly just some questions.

http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7438/6//COMMIT_MSG@67
PS6, Line 67:   after:  126.17s
is it always the case that this regression is in cases that would have 
(incorrectly) returned null before this change?


http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@a253
PS6, Line 253:
great to see that go away


http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@245
PS6, Line 245: overflow
instead of saying "we will overflow", I think "we indicate overflow" or "return 
overflow" or something like that. I was trying to figure out where the overflow 
occurs until I continued reading.


http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@244
PS6, Line 244: If delta scale is 0, we need to do a more precise check if 
converting to 256 bits
 :   // is necessary
I don't understand that comment, given that line 240-241 says the first check 
was conservative.


http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@273
PS6, Line 273:   // with a delta_scale 39 does not fit into int128.
so normally the (38,38) * (38,38) case would be handled in the needs_int256 
case, is that right?  is this case just when the values are so small that they 
turn into 0?


http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/runtime/decimal-value.inline.h@284
PS6, Line 284:
nit: let's eliminate this blank line (and maybe others. this function already 
barely fits on one screen which is important for readability.


http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/util/bit-util.h
File be/src/util/bit-util.h:

http://gerrit.cloudera.org:8080/#/c/7438/6/be/src/util/bit-util.h@292
PS6, Line 292: __int128 shifted = v >> 64;
 : if (shifted > 0) {
given that 'v' is unsigned, won't the shift be unsigned, meaning 'shifted' will 
always get zeros in the top 64, meaning shifted is always >= 0?  oh, so is this 
really just checking "shifted != 0"?  If so, that seems clearer, especially 
since the else-clause only makes sense for 0, not for negative.



--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 29 Sep 2017 23:23:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 6: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 29 Sep 2017 22:04:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-28 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 6:

The algorithm mentioned here, on page 17 seems quite useful:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.111.9736=rep1=pdf


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 28 Sep 2017 17:28:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-27 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 6:

> > I totally agree about doing anything fancy as a followup.  In
 > > particular, I did not find anything at all useful in AVX to help.
 > > There is no vectored multiply large enough for this, and FMA
 > > operations won't help either as they basically only help with
 > > precision loss on floating point types.
 > >
 > > Considering the perf difference is so extreme in some cases, I
 > > think we should strongly consider either living with the broken
 > > behavior until we can come up with a solution, or else verifying
 > > with users making use of the DECIMAL_V2 feature that this will
 > not
 > > be a problem.
 >
 > AVX2 has 32x8 times 32x8 multiplication. I have elsewhere simulated
 > 64-bit multiplication with this, combines with additions and
 > shifts, and received a speedup on 64x4 times 64x4 operations.

Oh, and AVX512 has vpmullq for multiplying vectors of 64-bit values.


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 28 Sep 2017 01:56:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-27 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 6:

> I totally agree about doing anything fancy as a followup.  In
 > particular, I did not find anything at all useful in AVX to help.
 > There is no vectored multiply large enough for this, and FMA
 > operations won't help either as they basically only help with
 > precision loss on floating point types.
 >
 > Considering the perf difference is so extreme in some cases, I
 > think we should strongly consider either living with the broken
 > behavior until we can come up with a solution, or else verifying
 > with users making use of the DECIMAL_V2 feature that this will not
 > be a problem.

AVX2 has 32x8 times 32x8 multiplication. I have elsewhere simulated 64-bit 
multiplication with this, combines with additions and shifts, and received a 
speedup on 64x4 times 64x4 operations.


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 27 Sep 2017 20:31:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-26 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 6:

I totally agree about doing anything fancy as a followup.  In particular, I did 
not find anything at all useful in AVX to help.  There is no vectored multiply 
large enough for this, and FMA operations won't help either as they basically 
only help with precision loss on floating point types.

Considering the perf difference is so extreme in some cases, I think we should 
strongly consider either living with the broken behavior until we can come up 
with a solution, or else verifying with users making use of the DECIMAL_V2 
feature that this will not be a problem.


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 27 Sep 2017 03:00:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-26 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 6:

I think we should explore your suggestion to try SEE/AVX in a follow on patch. 
(to optimze both multiplication and division if it's possible)


--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 27 Sep 2017 01:24:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-26 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
a 256 bit integer intermediate value, which we then attempt to scale down
and round.

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clealy seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s

I noticed performance regressions in the following cases:
- When we need to convert to a 256 bit integer before multiplying,
  which was introduced in this patch. Whether this happens depends on
  the resulting precision and the value of the inputs. In the following
  extreme case, the intermediate value is converted to a 256 bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s

These regressions are possible only when the resulting precision is 38
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with DECIMAL_v2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 289 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/6
--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-20 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 5:

Have we ever tried using SSE/AVX for multiplication?  It should be possible to 
avoid using int256_t altogether if we can do four 64-bit multiplies in 
parallel.  Are FMA instructions available for integer? (I have the manual in 
front of me but have to run and no time to look it up).  We should be able to 
do the division the same way with another round of multiplication.

-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-18 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7438/5/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

Line 203:   RESULT_T result = value / multiplier;
> Another idea: there are a finite number of possible delta_scale values.  It
After trying different approaches, this one produced the best results: 
http://github.mtv.cloudera.com/Taras/Impala/commit/950892a48e555ba24dcca772851d251ae3ecb17f

Before: 66.13s
After: 55.63s

So, a small improvement. Still not that great: without the multiplication 
patch, it took 8.26s


-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7438/5/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

Line 203:   RESULT_T result = value / multiplier;
Another idea: there are a finite number of possible delta_scale values.  It may 
be faster to promote delta_scale to a template parameter and switch on 
delta_scale to select an optimised implementation. The compiler may be able to 
optimise the division if it compiles the division separately for each specific 
multiplier values, e.g. generating special code for /10, /100, /1000, etc. E.g. 
there are plenty of tricks in this area that compilers can incorporate: 
https://stackoverflow.com/questions/2033210/c-fast-division-mod-by-10x


-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7438/4//COMMIT_MSG
Commit Message:

Line 75:   DECIMAL_V2 disabled: 4.25s
> Running perf top showed that most of the time is spent in ScaleDownAndRound
Yeah that seems likely - the faster everything else becomes the more this would 
be a bottleneck.

I don't see an obvious way to speed this up (except maybe avoiding the 
redundant divide/mod calls?) and I guess it only makes a difference if a lot of 
expressions are overflowing. Maybe someone with deeper knowledge of decimal 
might see a way to improve it.


http://gerrit.cloudera.org:8080/#/c/7438/4/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

Line 204:   if (round) {
It's unfortunate that we're not computing the quotient and remainder at the 
same time. It looks like boost eventually calls divide_unsigned_helper in 
multiprecision/cpp_int/divide.hpp with the same arguments in both cases but 
throws away either the result or remainder.

If avoiding two divide calls would help performance we could consider calling 
divide_unsigned_helper() directly.


-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-11 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#5).

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
an int256 intermediate value, which we then attempt to scale down and
round to int128.

Benchmarks:

In the following benchmarks, we are selecting from lineitem_big, which
has about 100 times as many rows as the normal lineitem.

Query:
  select
sum(l_quantity * l_tax) +
sum(l_extendedprice * l_discount)
  from lineitem_big;

Before:
  DECIMAL_V2 disabled: 2.24s
  DECIMAL_V2 enabled : 2.14

After:
  DECIMAL_V2 disabled: 2.66s
  DECIMAL_V2 enabled : 2.25s

Query:
  select
sum(l_quantity * l_tax * cast(1 as decimal(38, 0))) +
sum(l_extendedprice * l_discount * cast(1 as decimal(38, 0)))
  from lineitem_big

Before:
  DECIMAL_V2 disabled: 2.34s
  DECIMAL_V2 enabled : 2.36s

After:
  DECIMAL_V2 disabled: 4.25s
  DECIMAL_V2 enabled : 4.15s

Query:
  select
sum(l_quantity * l_tax * cast(1 as decimal(38, 37))) +
sum(l_extendedprice * l_discount * cast(1 as decimal(38, 37)))
  from lineitem_big

Before:
  DECIMAL_V2 disabled: 2.84s
  DECIMAL_V2 enabled : 8.26s

After:
  DECIMAL_V2 disabled: 69.16s
  DECIMAL_V2 enabled : 66.13s

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 282 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

2017-09-11 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#5).

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
..

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,38)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,30)|
+---+

DECIMAL V2:

+---+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+---+
| DECIMAL(38,37)|
+---+

+---+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+---+
| DECIMAL(38,6) |
+---+

In this patch, we also fix the early multiplication overflow. We compute
an int256 intermediate value, which we then attempt to scale down and
round to int128.

Benchmarks:

In the following benchmarks, we are selecting from lineitem_big, which
has about 100 times as many rows as the normal lineitem.

Query:
  select
sum(l_quantity * l_tax) +
sum(l_extendedprice * l_discount)
  from lineitem_big;

Before:
  DECIMAL_V2 disabled: 2.24s
  DECIMAL_V2 enabled : 2.14

After:
  DECIMAL_V2 disabled: 2.66s
  DECIMAL_V2 enabled : 2.25s

Query:
  select
sum(l_quantity * l_tax * cast(1 as decimal(38, 0))) +
sum(l_extendedprice * l_discount * cast(1 as decimal(38, 0)))
  from lineitem_big

Before:
  DECIMAL_V2 disabled: 2.34s
  DECIMAL_V2 enabled : 2.36s

After:
  DECIMAL_V2 disabled: 4.25s
  DECIMAL_V2 enabled : 4.15s

Query:
  select
sum(l_quantity * l_tax * cast(1 as decimal(38, 37))) +
sum(l_extendedprice * l_discount * cast(1 as decimal(38, 37)))
  from lineitem_big

Before:
  DECIMAL_V2 disabled: 2.84s
  DECIMAL_V2 enabled : 8.26s

After:
  DECIMAL_V2 disabled: 69.16s
  DECIMAL_V2 enabled : 66.13s

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 282 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden