github-actions[bot] commented on code in PR #63430:
URL: https://github.com/apache/doris/pull/63430#discussion_r3272533556
##########
be/src/exprs/aggregate/aggregate_function_linear_histogram.h:
##########
@@ -174,6 +187,61 @@ struct AggregateFunctionLinearHistogramData {
auto& column = assert_cast<ColumnString&>(to);
column.insert_data(buffer.GetString(), buffer.GetSize());
}
+
+private:
+ double bucket_key(double value) const {
+ double key = (value - offset) / interval;
+ double rounded_key = std::round(key);
+ if (boundary_decimal_scale != NO_BOUNDARY_SCALE) {
+ double raw_boundary = rounded_key * interval + offset;
+ double boundary = normalize_boundary(raw_boundary,
boundary_decimal_scale);
+ if (value == boundary) {
Review Comment:
This can still move valid values across the true bucket boundary when
`decimal_scale()` classifies an interval that is only near a decimal value. For
example, with `interval = 0.10000000000000002` and `offset = 0`,
`decimal_scale(interval)` returns scale 1 because `interval * 10` is within the
epsilon tolerance of `1`. Then `bucket_key(0.1)` normalizes the rounded raw
boundary back to `0.1`, this equality succeeds, and the value is assigned to
key `1`, even though `(0.1 - 0) / 0.10000000000000002` is below `1` and should
floor to bucket `0`. This is distinct from the already-raised key-space
tolerance issue because the current code removed that tolerance but still gets
the same kind of mis-bucketing through tolerant scale inference plus
normalized-boundary equality. Please only enable decimal boundary snapping when
the interval/offset are exactly represented by the inferred scale, or otherwise
ensure snapping cannot cross a raw boundary that is greater than the input
value.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]