davidlghellin commented on issue #18867: URL: https://github.com/apache/datafusion/issues/18867#issuecomment-3562465418
I have a question about the intended semantics here: is `median` supposed to behave as a **discrete** median (returning a value of the same type as the input, e.g. `INT`), or more like a **continuous** percentile (similar to `percentile_cont(..., 0.5)`)?

For example, in DataFusion today, `median` for integer inputs effectively takes the midpoint but returns it in the input type, so for values `[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]` you get `4`, while a continuous percentile like `percentile_cont(n, 0.5)` would return `4.5`.

It would be helpful to clarify whether the goal here is:

- to keep `median` as a discrete median that preserves the input type, or
- to align `median` with a continuous percentile definition (and possibly promote to `Float64`).
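To make the two readings concrete, here is a minimal Python sketch of the arithmetic behind the example above. It is not DataFusion code: the function names `median_discrete` and `median_continuous` are hypothetical, and the integer-midpoint behaviour is modelled only from the description in this comment, assuming non-negative inputs.

```python
def median_discrete(values):
    """Midpoint of the middle value(s), returned as an int (input type preserved)."""
    if not values:
        raise ValueError("median of empty input")
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    # Assumption: integer division stands in for "midpoint returned in the
    # input type". Rounding for negative midpoints is not modelled here.
    return (s[mid - 1] + s[mid]) // 2


def median_continuous(values):
    """Linear interpolation at fraction 0.5, in the style of percentile_cont."""
    if not values:
        raise ValueError("median of empty input")
    s = sorted(values)
    n = len(s)
    pos = 0.5 * (n - 1)          # fractional rank of the median
    lo = int(pos)                # index of the value at or below the rank
    hi = min(lo + 1, n - 1)      # index of the next value, clamped to the end
    frac = pos - lo
    return s[lo] + (s[hi] - s[lo]) * frac


values = list(range(10))          # [0, 1, ..., 9]
print(median_discrete(values))    # 4
print(median_continuous(values))  # 4.5
```

For odd-length inputs the two definitions agree on the value and differ only in the result type (integer versus float), so the choice only changes the value for even-length inputs.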
