kumarUjjawal commented on issue #19322: URL: https://github.com/apache/datafusion/issues/19322#issuecomment-3692004659
@Jefffrey I've been looking into the median overflow issue. Here's how `percentile_cont` handles integer types:

- Casting integer inputs to Float64 internally ([percentile_cont.rs L193-203](https://github.com/apache/datafusion/blob/main/datafusion/functions-aggregate/src/percentile_cont.rs#L193-L203))
- Returning Float64 for integer inputs ([percentile_cont.rs L251-258](https://github.com/apache/datafusion/blob/main/datafusion/functions-aggregate/src/percentile_cont.rs#L251-L258))
- Casting in `update_batch` ([percentile_cont.rs L513-519](https://github.com/apache/datafusion/blob/main/datafusion/functions-aggregate/src/percentile_cont.rs#L513-L519))

Taking the same approach in median would also address issue #18867, where we expect `median(integers)` to match `percentile_cont(column, 0.5)`.

Questions before I proceed:

1. Should median change its return type for integers (e.g. Int8 → Float64)? This is a breaking change, but it matches `percentile_cont` behavior.
2. Or should we only fix the overflow while keeping the integer return type (promote the intermediate calculation to i128 or f64, then cast back)?
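
To make the two options concrete, here is a minimal standalone sketch. It is not DataFusion code: it assumes the overflow comes from averaging the two middle values of an even-length input in the input type, and every function name below is hypothetical. It uses `i8` with an `i16` intermediate for brevity; for Int64 inputs the intermediate would need to be i128 or f64, as noted in question 2.

```rust
/// Averages two values in the input type (assumed source of the overflow).
/// Returns None when the sum does not fit in i8.
fn midpoint_naive(a: i8, b: i8) -> Option<i8> {
    a.checked_add(b).map(|sum| sum / 2)
}

/// Option 2 sketch: widen the intermediate sum, then cast back to i8.
/// The average of two i8 values always fits in i8, so the cast is lossless,
/// but the fractional part is truncated.
fn midpoint_widened(a: i8, b: i8) -> i8 {
    ((a as i16 + b as i16) / 2) as i8
}

/// Option 1 sketch: compute and return the result as f64,
/// which is what the comment says percentile_cont does for integers.
fn midpoint_f64(a: i8, b: i8) -> f64 {
    (a as f64 + b as f64) / 2.0
}

/// Hypothetical median that keeps the integer return type (option 2).
fn median_keep_type(values: &[i8]) -> Option<i8> {
    let mut sorted = values.to_vec();
    sorted.sort_unstable();
    let n = sorted.len();
    match n {
        0 => None,
        _ if n % 2 == 1 => Some(sorted[n / 2]),
        _ => Some(midpoint_widened(sorted[n / 2 - 1], sorted[n / 2])),
    }
}

/// Hypothetical median that returns f64 for integer input (option 1).
fn median_as_f64(values: &[i8]) -> Option<f64> {
    let mut sorted = values.to_vec();
    sorted.sort_unstable();
    let n = sorted.len();
    match n {
        0 => None,
        _ if n % 2 == 1 => Some(sorted[n / 2] as f64),
        _ => Some(midpoint_f64(sorted[n / 2 - 1], sorted[n / 2])),
    }
}
```

With inputs `100` and `120`, the naive version cannot represent the sum `220` in `i8`, while both fixes return `110`. The two options differ on inputs like `[1, 2]`: keeping the integer type truncates to `1`, whereas the Float64 variant returns `1.5`, which is the value `percentile_cont(column, 0.5)` would be expected to give.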
