Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023 )
Change subject: IMPALA-4848: Add WIDTH_BUCKET() function ...................................................................... Patch Set 9: (20 comments) http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc@4653 PS9, Line 4653: TestValue("width_bucket(22, 5, 20.1, 4)", TYPE_BIGINT, 5); Add some tests with nulls. For example: (NULL, 5, 20, 3) (12, NULL, 20, 3) (12, 5, NULL, 3) (12, 5, 20, NULL) http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc File be/src/exprs/math-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@455 PS9, Line 455: min_range = -1 (or any negative value) You can just write: min_range < 0 http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460 PS9, Line 460: Incase "In case" http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460 PS9, Line 460: t extra space here http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460 PS9, Line 460: functions function* http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461 PS9, Line 461: decimal8Value I am not sure it makes sense to even mention decimal8Value here. Why would you even think about storing it in a decimal 8? http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461 PS9, Line 461: stroring spelling http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@480 PS9, Line 480: if (min_range == max_range) { What if min_range > max_range? Also, add a test case. http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@481 PS9, Line 481: Lower bound cannot be equal to upper bound Would it make sense to include the name of the function in the error message? Do we normally do that? http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@489 PS9, Line 489: num_buckets.val; how about: num_buckets.val + 1 Also, add a test case where num_buckets is a maximally large integer, so adding one causes an overflow. http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@494 PS9, Line 494: // FE casts all the arguments to the same type. Maybe expand this comment a little? E.g "expr", "min_range" and "max_range" are guaranteed to be the same scale and precision by the FE http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@503 PS9, Line 503: if(overflow) { Have we considered what happens when subtraction is successful (and overflow is false), but the result is larger than the largest allowed decimal (10^38 - 1). Add a test case like this. For example: min range = -100 max_range = 10^38 - 1 The result of that subtraction should not overflow, because it fits into int128_t http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@514 PS9, Line 514: if(overflow) { Can this overflow ever happen without overflowing on line 503? Add a test case for this overflow and error message. (You might need to add some End to end tests to test for the error message) http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@533 PS9, Line 533: if(needs_int256) { Nit: Put a space between "if" and the opening brace. (here and elsewhere) http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@539 PS9, Line 539: avoid avoid written twice http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@549 PS9, Line 549: // bump up the denominator scale so that the scale of the numerator and denominator Can you explain this a little better? It's not really clear why we have to scale it up. http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@551 PS9, Line 551: int128_t y = DecimalUtil::MultiplyByScale<int128_t>(range_size.value(), Is it possible for this to overflow? http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@557 PS9, Line 557: ctx->SetError("Overflow while dividing by the range_size"); This should be moved up to around line 543. Add a test case for this error message. http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@571 PS9, Line 571: if (UNLIKELY(num_buckets.val <= 0)) { Why add this check here, instead of around line 480 where other similar check are located? http://gerrit.cloudera.org:8080/#/c/6023/9/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/6023/9/fe/src/main/java/org/apache/impala/analysis/Expr.java@459 PS9, Line 459: Why this empty line? -- To view, visit http://gerrit.cloudera.org:8080/6023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f Gerrit-Change-Number: 6023 Gerrit-PatchSet: 9 Gerrit-Owner: anujphadke <apha...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Michael Brown <mi...@cloudera.com> Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com> Gerrit-Reviewer: anujphadke <apha...@cloudera.com> Gerrit-Comment-Date: Tue, 21 Nov 2017 00:08:38 +0000 Gerrit-HasComments: Yes