andygrove opened a new issue, #4553: URL: https://github.com/apache/datafusion-comet/issues/4553
## Background

Spark's time-window grouping expressions currently fall back to Spark in Comet:

- `window(timeColumn, windowDuration, [slideDuration], [startTime])` (`TimeWindow`) - tumbling/sliding windows, common in batch aggregation (`GROUP BY window(ts, '1 hour')`), not just streaming.
- `session_window(timeColumn, gapDuration)` (`SessionWindow`) - session windows.
- `window_time(window)` (`WindowTime`) - extracts the event time from a window column.

Comet has no serde for `TimeWindow` / `SessionWindow` / `WindowTime` today, so any query using them falls back.

## Notes

These are not plain scalar functions: Spark's analyzer (`TimeWindowing` / `SessionWindowing` rules) rewrites `window()` / `session_window()` into an `Expand` plus grouping on a computed window struct. Native support would need to handle the rewritten form (struct construction and the window-boundary arithmetic) and the grouping that follows.

`window()` over a fixed duration is the most commonly used of the three and would be the natural starting point.

## Acceptance criteria

- `window`, `session_window`, and `window_time` execute natively in Comet and match Spark.
- Add SQL file test coverage under `expressions/datetime/`.
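
For reference, the window-boundary arithmetic mentioned in the Notes can be sketched roughly as follows. This is an illustrative model only, not Spark's or Comet's actual code: the function name `window_bounds` and its parameters are hypothetical, timestamps and durations are assumed to be integer microseconds, and the formula reflects my understanding of what the `TimeWindowing` rewrite computes, so it should be checked against the Spark source before being relied on.

```python
def window_bounds(ts_micros, window_micros, slide_micros=None, start_micros=0):
    """Return the (start, end) windows that contain ts_micros.

    Hypothetical helper for illustration. A tumbling window is the
    special case where slide_micros equals window_micros.
    """
    if slide_micros is None:
        slide_micros = window_micros  # tumbling window

    # Offset of the timestamp within its slide interval. Python's % is
    # already non-negative for a positive divisor; an engine whose modulo
    # can return negative values needs an explicit adjustment here.
    remainder = (ts_micros - start_micros) % slide_micros
    last_start = ts_micros - remainder

    # Upper bound on how many sliding windows can cover one timestamp
    # (integer ceiling division).
    max_overlapping = -(-window_micros // slide_micros)

    windows = []
    for i in range(max_overlapping):
        start = last_start - i * slide_micros
        end = start + window_micros
        # Windows are half-open: [start, end). Candidates that do not
        # contain the timestamp are dropped, mirroring a filter applied
        # after the row has been expanded.
        if start <= ts_micros < end:
            windows.append((start, end))
    return windows
```

Under these assumptions, a tumbling window yields exactly one `(start, end)` pair per row, while a sliding window yields one candidate per overlapping window. That fan-out is why the rewritten plan needs an `Expand` rather than a single scalar expression.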
