Taras Bobrovytsky has uploaded a new patch set (#6).
( http://gerrit.cloudera.org:8080/7438 )

Change subject: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
......................................................................

IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+-----------------------------------------------------------------------+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+-----------------------------------------------------------------------+
| DECIMAL(38,38)                                                        |
+-----------------------------------------------------------------------+

+-----------------------------------------------------------------------+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+-----------------------------------------------------------------------+
| DECIMAL(38,30)                                                        |
+-----------------------------------------------------------------------+

DECIMAL V2:

+-----------------------------------------------------------------------+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+-----------------------------------------------------------------------+
| DECIMAL(38,37)                                                        |
+-----------------------------------------------------------------------+

+-----------------------------------------------------------------------+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+-----------------------------------------------------------------------+
| DECIMAL(38,6)                                                         |
+-----------------------------------------------------------------------+
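The four results above are consistent with the following rules. This is
only a sketch: the authoritative algorithm is the one in the JIRA and in
TypesUtil.java. The function names, the "+ 1" on the V2 precision, and
the minimum adjusted scale of 6 are assumptions chosen to reproduce the
outputs shown here.

```python
MAX_PRECISION = 38
MIN_ADJUSTED_SCALE = 6  # assumed floor, consistent with DECIMAL(38,6) above


def multiply_type_v1(p1, s1, p2, s2):
    """V1 (assumed): add precisions and scales, then clip each to 38."""
    return min(p1 + p2, MAX_PRECISION), min(s1 + s2, MAX_PRECISION)


def multiply_type_v2(p1, s1, p2, s2):
    """V2 (assumed): if the precision exceeds 38, give up scale digits,
    but never drop the scale below min(scale, 6)."""
    precision = p1 + p2 + 1
    scale = s1 + s2
    if precision > MAX_PRECISION:
        min_scale = min(scale, MIN_ADJUSTED_SCALE)
        delta = precision - MAX_PRECISION
        precision = MAX_PRECISION
        scale = max(scale - delta, min_scale)
    return precision, scale
```

Under these assumptions, V2 trades fractional digits for integer digits
when the result cannot hold both, whereas V1 simply clips.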

This patch also fixes premature overflow in multiplication. We compute
a 256-bit integer intermediate value, which we then attempt to scale down
and round so that it fits the result type.
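A minimal sketch of the "wide intermediate, then scale down and round"
idea, operating on unscaled integer values. Python's unbounded integers
stand in for the 256-bit intermediate. The function name, the
round-half-away-from-zero mode, and returning None on overflow are
assumptions for illustration, not the code in decimal-value.inline.h.

```python
MAX_UNSCALED = 10**38 - 1  # largest unscaled value of a DECIMAL(38, s)


def multiply_v2(x, x_scale, y, y_scale, result_scale):
    """Multiply two unscaled decimals and return the unscaled result at
    result_scale, or None if it does not fit in 38 digits."""
    wide = x * y  # exact product; stands in for the 256-bit intermediate
    delta = x_scale + y_scale - result_scale  # decimal digits to drop
    assert delta >= 0
    if delta > 0:
        divisor = 10**delta
        quotient, remainder = divmod(abs(wide), divisor)
        if 2 * remainder >= divisor:  # round half away from zero (assumed)
            quotient += 1
        wide = quotient if wide >= 0 else -quotient
    return wide if abs(wide) <= MAX_UNSCALED else None
```

For example, multiply_v2(22, 1, 22 * 10**18, 19, 6) models
2.2 * 2.2 with a decimal(38,6) result and yields the unscaled value
4840000, that is, 4.840000.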

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions in these workloads.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clearly seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s
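The leading-zero idea can be sketched as follows. This is a conservative
check written for illustration; the helper names are hypothetical and
the exact condition used in the patch may differ. If |x| < 2**a and
|y| < 2**b, then |x * y| < 2**(a + b), so the product is guaranteed to
fit in a signed 128-bit integer whenever a + b <= 127, and no division
is needed to establish that.

```python
INT128_BITS = 128


def clz128(value):
    """Leading zero bits of |value| viewed as a 128-bit unsigned integer."""
    magnitude = abs(value)
    assert magnitude < (1 << INT128_BITS)
    return INT128_BITS - magnitude.bit_length()


def product_fits_in_int128(x, y):
    """True only if x * y is guaranteed to fit in a signed 128-bit integer.
    A False result means "cannot tell cheaply", not "definitely overflows".
    """
    # clz(x) + clz(y) >= 129 is the same as bit_length(x) + bit_length(y) <= 127.
    return clz128(x) + clz128(y) >= INT128_BITS + 1
```

When the check fails, a caller would fall back to a wider intermediate
or a more precise test.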

I noticed performance regressions in the following cases:
- When we need to convert to a 256-bit integer before multiplying, a
  step introduced in this patch. Whether this happens depends on the
  resulting precision and the values of the inputs. In the following
  extreme case, the intermediate value is converted to a 256-bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s
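The arithmetic behind the two cases above can be checked directly. This
is a worked example only: describe() is a hypothetical helper, and the
result scale of 35 for the first query is derived from the assumed V2
rule (precision 38 + 38 + 1, scale 37 + 37, reduced by the excess
precision), not quoted from the patch.

```python
def describe(x, x_scale, y, y_scale, result_scale):
    """Return (bits in the exact product, decimal digits to scale down)."""
    product = x * y
    return product.bit_length(), x_scale + y_scale - result_scale


# 1.1 as decimal(38,37), squared; result scale 35 is an assumption.
case1 = describe(11 * 10**36, 37, 11 * 10**36, 37, 35)

# 2.2 as decimal(38,1) times 2.2 as decimal(38,19); result scale 6.
case2 = describe(22, 1, 22 * 10**18, 19, 6)

print(case1, case2)
```

In the first case the exact product needs more than 127 bits, so a
128-bit intermediate cannot hold it. In the second case the product
fits in 128 bits but still has to be scaled down by 14 digits.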

These regressions are possible only when the resulting precision is 38,
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with the DECIMAL_V2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
---
M be/src/exprs/expr-test.cc
M be/src/runtime/decimal-value.inline.h
M be/src/util/bit-util.h
M fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
4 files changed, 289 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/7438/6
--
To view, visit http://gerrit.cloudera.org:8080/7438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Gerrit-Change-Number: 7438
Gerrit-PatchSet: 6
Gerrit-Owner: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zams...@cloudera.com>
