[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

anujphadke (Code Review) Tue, 16 Jan 2018 19:35:03 -0800

anujphadke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/6023 )


Change subject: IMPALA-4848: Add WIDTH_BUCKET() function
......................................................................


Patch Set 10:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/expr-test.cc@4653
PS9, Line 4653:   TestValue("is_nan(0.0)", TYPE_BOOLEAN, false);
> Add some tests with nulls.
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc
File be/src/exprs/math-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@455
PS9, Line 455: Subtracting a negative min_range from e
> You can just write:
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460
PS9, Line 460: cimal8
> "In case"
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460
PS9, Line 460: re the re
> function*
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@460
PS9, Line 460: n
> extra space here
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461
PS9, Line 461: store the res
> I am not sure it makes sense to even mention decimal8Value here. Why would
It doesn't.
I was trying to detail out why everything needs to be stored as a 
decimal16Value (asked in PS5).
I was trying to convey that this cannot be stored in a decimal8value if it does 
not overflow.
I ll remove it since it is sounding confusing.


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@461
PS9, Line 461: ith the
> spelling
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@480
PS9, Line 480:   }
> What if min_range > max_range?
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@481
PS9, Line 481:
> Would it make sense to include the name of the function in the error messag
We don't in all the cases.
https://github.com/apache/incubator-impala/blob/master/be/src/exprs/string-functions-ir.cc#L338


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@489
PS9, Line 489: _range) {
> how about: num_buckets.val + 1
I am storing in num_buckets in a BigIntVal first and then incrementing to avoid 
the overflow issue.

result.val = num_buckets.val + 1 // will evaluate the RHS first and assign the 
overflowed value to result.

Adding 1 later after storing it in a BigIntVal to avoid this.

I have a test case -
TestValue("width_bucket(22, 5, 20.1, 2147483647)", TYPE_BIGINT, 2147483648)


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@494
PS9, Line 494:   }
> Maybe expand this comment a little? E.g
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@503
PS9, Line 503:   range_size = max_range.template 
Subtract<int128_t>(input_scale, min_range,
> Actually I'm not correct here. It does overflow in the case I provided.
We have a test case for this.
 // Test max - min will overflow during  width_bucket evaluation
  TestError("width_bucket(11, -9, 99999999999999999999999999999999999999, 
4000)");


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@514
PS9, Line 514:   dist_from_min = expr.template Subtract<int128_t>(input_scale, 
min_range, input_scale,
> Can this overflow ever happen without overflowing on line 503? Add a test c
Right, this overflow won't happen without overflowing the subtraction on l 503.
It's not needed. Should we have this extra check for safety?


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@533
PS9, Line 533:
> Nit: Put a space between "if" and the opening brace. (here and elsewhere)
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@539
PS9, Line 539:
> avoid written twice
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@549
PS9, Line 549:     // match the scale of the numerator.
> Can you explain this a little better? It's not really clear why we have to
Done


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@551
PS9, Line 551:         range_size.value()), input_scale);
> Is it possible for this to overflow?
Yes. I added a check for this and set needs_int256 to true if it overflows.
Added a a test for this.


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@557
PS9, Line 557:       return BigIntVal::null();
> This should be moved up to around line 543.
Moved it above.
This ideally should not overflow  since the result of the division is the 
bucket number which is a IntVal.
Are you fine with having this check for safety?


http://gerrit.cloudera.org:8080/#/c/6023/9/be/src/exprs/math-functions-ir.cc@571
PS9, Line 571: }
> Why add this check here, instead of around line 480 where other similar che
Done



--
To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 10
Gerrit-Owner: anujphadke <apha...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: anujphadke <apha...@cloudera.com>
Gerrit-Comment-Date: Wed, 17 Jan 2018 03:34:32 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function

Reply via email to