yashmayya opened a new pull request, #14689: URL: https://github.com/apache/pinot/pull/14689
- Fixes https://github.com/apache/pinot/issues/12705. - As discussed in the above issue, the problem is that in final aggregates the actual return type from the aggregation function will be a double value (from `MIN` / `MAX` / `SUM` / `AVG`) and this is attempted to be cast to `BigDecimal` when constructing the data block to be sent over the wire. The cast is attempted because the return type for `MIN` / `MAX` / `SUM` / `AVG` is inferred as `DECIMAL` when the input operand type is `DECIMAL` (or `BIG_DECIMAL` in Pinot) due to the use of the standard operators in Calcite. We don't want to change this type inference logic because it's backward incompatible (the actual return type for users would change if we change this) and also because we want to move towards actual polymorphic aggregation functions (i.e., not just the appearance of polymorphism through casting). Note that this isn't an issue in the leaf aggregation because the return type there is determined as `DOUBLE` through the [AggregationFunctionType](https://github.com/apache/ pinot/blob/d15f8c8caf531b29a839801a8afc08dc314c6d8e/pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/AggregationFunctionType.java#L54-L59) (see [here](https://github.com/apache/pinot/blob/d15f8c8caf531b29a839801a8afc08dc314c6d8e/pinot-query-planner/src/main/java/org/apache/pinot/calcite/rel/rules/PinotAggregateExchangeNodeInsertRule.java#L326-L331)). - The type conversion logic added to `TypeUtils::convert` (that handles single-stage -> multi-stage type conversion) includes a type check before converting because there are some aggregation functions like `SUMPRECISION` that actually return `BigDecimal` values and we don't want to lose precision there by converting through doubles. - In the long term, we want to support truly polymorphic aggregation functions in the multi-stage query engine (i.e., where the MAX aggregation function would return a value of the same type as the input operand type - right now it's always `DOUBLE`). However, this has backward compatibility implications and we need to be careful about not breaking the v1 engine. Right now, the output type to end users will still be the same as the input operand type in the MSQE but there can be loss of precision (`MAX(BIG_DECIMAL col) -> DOUBLE output -> cast to BIG_DECIMAL (loss of precision)` for instance). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
