andygrove opened a new pull request, #3397: URL: https://github.com/apache/datafusion-comet/pull/3397
## Summary

This PR fixes core correctness issues with windowed aggregate queries by adding an explicit `SortExec` before `BoundedWindowAggExec` when ORDER BY is present.

**Tracking Issue:** #2721

## Changes

1. **Add explicit SortExec** (`planner.rs`) - Insert sort before `BoundedWindowAggExec` when ORDER BY is present, ensuring `InputOrderMode::Sorted` requirement is satisfied
2. **Improve support level detection** (`CometWindowExec.scala`) - Change from blanket `Incompatible` to `Compatible` for valid cases, with proper validation that partition expressions must be a subset of order expressions
3. **Disable by default** (`CometConf.scala`) - Set `spark.comet.exec.window.enabled=false` to avoid breaking changes; users can opt-in to test

## What's Now Supported (when enabled)

- Window aggregates: `COUNT`, `SUM`, `MIN`, `MAX`
- `OVER()` - no partition, no order
- `OVER(ORDER BY x)` - order only
- `OVER(PARTITION BY x)` - partition only
- `OVER(PARTITION BY x ORDER BY x, y)` - partition is subset of order

## What's NOT Supported (falls back to Spark)

- `PARTITION BY a ORDER BY b` where partition columns differ from order columns
- `AVG` window aggregate (native implementation has known issues)
- Ranking functions: `ROW_NUMBER`, `RANK`, `DENSE_RANK`, etc.
- Offset functions: `LAG`, `LEAD`
- Value functions: `FIRST_VALUE`, `LAST_VALUE`, `NTH_VALUE`
- `RANGE BETWEEN` with numeric/temporal expressions (#1246)

## Test Plan

- [x] All existing window tests pass (14 tests)
- [x] Enabled "aggregate window function for all types" test that was previously ignored
- [x] Added new tests for partition-subset-of-order validation
- [x] No golden file updates needed (feature disabled by default)

🤖 Generated with [Claude Code](https://claude.ai/code)
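
The supported and unsupported `OVER` clauses listed in this description suggest a simple rule: a window spec is accepted when it has no ORDER BY, or when every PARTITION BY column also appears in the ORDER BY list. The Rust sketch below is one reading of that rule, not the PR's actual validation (which the description places in `CometWindowExec.scala`). The function name `window_spec_supported` and the use of plain column-name strings in place of real expressions are assumptions made for illustration.

```rust
/// Hypothetical helper: returns true when a window spec matches the
/// "supported" cases listed in the PR description.
///
/// Assumption: columns are compared by name only; real plans compare
/// expressions, which this sketch does not model.
fn window_spec_supported(partition_by: &[&str], order_by: &[&str]) -> bool {
    // OVER() and OVER(PARTITION BY x): with no ORDER BY there is
    // nothing for the partition columns to conflict with.
    if order_by.is_empty() {
        return true;
    }
    // OVER(ORDER BY x) and OVER(PARTITION BY x ORDER BY x, y):
    // every partition column must also be an order column.
    // An empty partition list trivially satisfies this.
    partition_by.iter().all(|col| order_by.contains(col))
}
```

Under these assumptions, `window_spec_supported(&["x"], &["x", "y"])` is accepted, while `window_spec_supported(&["a"], &["b"])` is rejected and would correspond to the fall-back-to-Spark case.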
