davidlghellin commented on issue #18867:
URL: https://github.com/apache/datafusion/issues/18867#issuecomment-3562465418

   I have a question about the intended semantics here: is `median` supposed to 
behave as a **discrete** median (returning a value of the same type as the 
input, e.g. `INT`), or more like a **continuous** percentile (similar to 
`percentile_cont(..., 0.5)`)?
   
   For example, in DataFusion today, `median` for integer inputs averages the 
two middle values but returns the result in the input type, so the fractional 
part is lost: for values `[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]` you get `4`, while a 
continuous percentile like `percentile_cont(n, 0.5)` would return `4.5`.
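
   To make the difference concrete, here is a minimal Python sketch of the two 
behaviours being compared. It is not DataFusion's code: the helper names 
`median_in_input_type` and `percentile_cont` are illustrative, and the 
interpolation rule is assumed to be the usual linear interpolation between the 
two closest ranks.

```python
def median_in_input_type(values):
    """Median that stays in the integer input type (hypothetical helper).

    For an even count, the two middle values are averaged with integer
    division, so any fractional part is dropped. Python's // floors, which
    matches truncation only for non-negative sums.
    """
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) // 2


def percentile_cont(values, fraction):
    """Continuous percentile via linear interpolation (assumed definition).

    Always returns a float, regardless of the input type.
    """
    s = sorted(values)
    pos = fraction * (len(s) - 1)   # fractional rank in the sorted data
    lo = int(pos)                   # rank just below (or at) the position
    hi = min(lo + 1, len(s) - 1)    # rank just above, clamped to the end
    return float(s[lo] + (s[hi] - s[lo]) * (pos - lo))


values = list(range(10))            # [0, 1, ..., 9]
print(median_in_input_type(values)) # 4
print(percentile_cont(values, 0.5)) # 4.5
```

   For an odd number of inputs the two definitions agree numerically; they 
only diverge in value for even counts, and in result type (integer versus 
float) in every case.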
   
   It would be helpful to clarify whether the goal here is:
   - to keep `median` as a discrete median that preserves the input type, or
   - to align `median` with a continuous percentile definition (and possibly 
promote to `Float64`).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

