andygrove opened a new issue, #4553:
URL: https://github.com/apache/datafusion-comet/issues/4553

   ## Background
   
   Spark's time-window grouping expressions currently fall back to Spark in 
Comet:
   
   - `window(timeColumn, windowDuration, [slideDuration], [startTime])` 
(`TimeWindow`) - tumbling/sliding windows, common in batch aggregation (`GROUP 
BY window(ts, '1 hour')`), not just streaming.
   - `session_window(timeColumn, gapDuration)` (`SessionWindow`) - session 
windows.
   - `window_time(window)` (`WindowTime`) - extracts the event time from a 
window column.
   
   Comet has no serde for `TimeWindow` / `SessionWindow` / `WindowTime` today, 
so any query using them falls back.
   
   ## Notes
   
   These are not plain scalar functions: Spark's analyzer (`TimeWindowing` / 
`SessionWindowing` rules) rewrites `window()` / `session_window()` into an 
`Expand` plus grouping on a computed window struct. Native support would need 
to handle the rewritten form (struct construction and the window-boundary 
arithmetic) and the grouping that follows.
   
   `window()` over a fixed duration is the most commonly used of the three and 
would be the natural starting point.
   
   ## Acceptance criteria
   
   - `window`, `session_window`, and `window_time` execute natively in Comet 
and match Spark.
   - Add SQL file test coverage under `expressions/datetime/`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to