anujphadke has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )
Change subject: IMPALA-4848: Add WIDTH_BUCKET() function ...................................................................... Patch Set 10: (19 comments) http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc@4653 PS9, Line 4653: TestValue("is_nan(0.0)", TYPE_BOOLEAN, false); > Add some tests with nulls. Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc File be/src/exprs/math-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@455 PS9, Line 455: Subtracting a negative min_range from e > You can just write: Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460 PS9, Line 460: cimal8 > "In case" Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460 PS9, Line 460: re the re > function* Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460 PS9, Line 460: n > extra space here Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461 PS9, Line 461: store the res > I am not sure it makes sense to even mention decimal8Value here. Why would It doesn't. I was trying to detail out why everything needs to be stored as a decimal16Value (asked in PS5). I was trying to convey that this cannot be stored in a decimal8value if it does not overflow. I ll remove it since it is sounding confusing. http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461 PS9, Line 461: ith the > spelling Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@480 PS9, Line 480: } > What if min_range > max_range? Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@481 PS9, Line 481: > Would it make sense to include the name of the function in the error messag We don't in all the cases. https://github.com/apache/incubator-impala/blob/master/be/src/exprs/string-functions-ir.cc#L338 http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@489 PS9, Line 489: _range) { > how about: num_buckets.val + 1 I am storing in num_buckets in a BigIntVal first and then incrementing to avoid the overflow issue. result.val = num_buckets.val + 1 // will evaluate the RHS first and assign the overflowed value to result. Adding 1 later after storing it in a BigIntVal to avoid this. I have a test case - TestValue("width_bucket(22, 5, 20.1, 2147483647)", TYPE_BIGINT, 2147483648) http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@494 PS9, Line 494: } > Maybe expand this comment a little? E.g Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@503 PS9, Line 503: range_size = max_range.template Subtract<int128_t>(input_scale, min_range, > Actually I'm not correct here. It does overflow in the case I provided. We have a test case for this. // Test max - min will overflow during width_bucket evaluation TestError("width_bucket(11, -9, 99999999999999999999999999999999999999, 4000)"); http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@514 PS9, Line 514: dist_from_min = expr.template Subtract<int128_t>(input_scale, min_range, input_scale, > Can this overflow ever happen without overflowing on line 503? Add a test c Right, this overflow won't happen without overflowing the subtraction on l 503. It's not needed. Should we have this extra check for safety? http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@533 PS9, Line 533: > Nit: Put a space between "if" and the opening brace. (here and elsewhere) Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@539 PS9, Line 539: > avoid written twice Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@549 PS9, Line 549: // match the scale of the numerator. > Can you explain this a little better? It's not really clear why we have to Done http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@551 PS9, Line 551: range_size.value()), input_scale); > Is it possible for this to overflow? Yes. I added a check for this and set needs_int256 to true if it overflows. Added a a test for this. http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@557 PS9, Line 557: return BigIntVal::null(); > This should be moved up to around line 543. Moved it above. This ideally should not overflow since the result of the division is the bucket number which is a IntVal. Are you fine with having this check for safety? http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@571 PS9, Line 571: } > Why add this check here, instead of around line 480 where other similar che Done -- To view, visit http://gerrit.cloudera.org:8080/6023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f Gerrit-Change-Number: 6023 Gerrit-PatchSet: 10 Gerrit-Owner: anujphadke <apha...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Michael Brown <mi...@cloudera.com> Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com> Gerrit-Reviewer: anujphadke <apha...@cloudera.com> Gerrit-Comment-Date: Wed, 17 Jan 2018 03:34:32 +0000 Gerrit-HasComments: Yes