Zach Amsden has posted comments on this change. Change subject: IMPALA-4939, IMPALA-4939: Decimal V2 multiplication ......................................................................
Patch Set 1: (11 comments) http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 1554: } This is the second time a similar function has come up in code review. What do you think about using user-defined literals or merging Jim Apple's code: https://github.com/jbapple/128-bit-literals Line 1693: { true, 0, 38, 6 }}}, Other test - 38 9's right of decimal * 38 9's right of decimal. If I am correct this should round up to 1 in the v2 type scheme (due to the crazy +1 and rounding). http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/runtime/decimal-value.inline.h File be/src/runtime/decimal-value.inline.h: Line 240: if (needs_int256) { UNLIKELY(needs_int256) Line 251: if (delta_scale > 38) { Can we explicitly handle the LIKELY(delta_scale == 0) case first? Line 253: // decimal(38, 38). The result should be a decimal(38, 37), so Is this what our new formula gives? It is slightly paradoxical, since it loses precision. We can never have more than (p1-s1)+(p2-s2) digits to the left of the decimal. I think we should reconsider our truncation rules as they apply to multiply. Line 301: needs_int256 = DecimalUtil::MAX_UNSCALED_DECIMAL16 / abs(y) < abs(x); I think this can be optimized a bit. Why not use count of leading zeroes as an estimator of result? Then we know needs_int256 = lzcnt(x) + lzcnt(y) < 128 Line 305: if (needs_int256) { UNLIKELY Line 316: if (delta_scale > 38) { As above, handle likely case of delta_scale == 0 first http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/util/bit-util.h File be/src/util/bit-util.h: Line 65: return value < 0 ? -1 : 1; The reason it was written that way is I think this is undefined for unsigned integers and triggers pedantic warnings (also I think gcc and clang turn this into less than optimal code). godbolt is not working for me right now so I'll check later. http://gerrit.cloudera.org:8080/#/c/7438/1/be/src/util/decimal-util.h File be/src/util/decimal-util.h: Line 336: if (LIKELY(scale < 77)) return values[scale]; Do we really need values this high? Maybe truncate the scale earlier and add a DCHECK that we never request larger values? http://gerrit.cloudera.org:8080/#/c/7438/1/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java File fe/src/main/java/org/apache/impala/analysis/TypesUtil.java: Line 260: resultPrecision = p1 + p2 + 1; This +1 is messy for several reasons - first, it isn't necessary at all except for rounding on precision overflow. Second, it actually reduces precision - in general, v1 * v2 should have at most (p1 - s1) + (p2 - s2) digits left of the decimal point, instead we have one more. Here we get less precise results for almost all cases, just to save the one case of 0.99999... * 0.99999 rounding up to 1.0. I'd prefer to truncate that case and keep the extra precision. Finally, it contributes to type inflation, creating longer types and more overhead than necessary. -- To view, visit http://gerrit.cloudera.org:8080/7438 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zach Amsden <[email protected]> Gerrit-HasComments: Yes
