Jefffrey opened a new issue, #19612: URL: https://github.com/apache/datafusion/issues/19612
### Describe the bug Originally from this discussion: https://github.com/apache/datafusion/pull/19278#discussion_r2616514268 It turns out aggregate functions can still be used with window frames even if they don't implement `retract_batch`, if the window frame is `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`. And if the accumulator consumes the internal state as part of `evaluate`, it can return incorrect results. ### To Reproduce In `aggregate.slt`, try this: ```sql query ITRRR SELECT timestamp, tags, value, median(value) OVER ( PARTITION BY tags ORDER BY timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ), percentile_cont(value, 0.5) OVER ( PARTITION BY tags ORDER BY timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) FROM median_window_test ORDER BY tags, timestamp; ---- 1 tag1 10 10 10 2 tag1 20 15 15 3 tag1 30 20 20 4 tag1 40 25 25 5 tag1 50 30 30 1 tag2 60 60 60 2 tag2 70 65 65 3 tag2 80 70 70 4 tag2 90 75 75 5 tag2 100 80 80 ``` We would expect those last two columns to be the same. However the actual results are: ```sql query ITRRR SELECT timestamp, tags, value, median(value) OVER ( PARTITION BY tags ORDER BY timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ), percentile_cont(value, 0.5) OVER ( PARTITION BY tags ORDER BY timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) FROM median_window_test ORDER BY tags, timestamp; ---- 1 tag1 10 10 10 2 tag1 20 15 20 3 tag1 30 20 30 4 tag1 40 25 40 5 tag1 50 30 50 1 tag2 60 60 60 2 tag2 70 65 70 3 tag2 80 70 80 4 tag2 90 75 90 5 tag2 100 80 100 ``` - See last column; it exactly follows the input data because it keeps discarding its state ### Expected behavior I've identified the following aggregate functions which likely exhibit this behaviour: - percentile_cont - string_agg - distinct median median was fixed by https://github.com/apache/datafusion/pull/19278 We need to fix these accumulators to not consume the internal state on evaluate. We also would want to update the documentation around accumulators to make this behaviour obvious so we don't fall into the same trap later. ### Additional context Comment on why consuming internal state is bad for windows: https://github.com/apache/datafusion/pull/19278#discussion_r2609476348 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
