andygrove opened a new pull request, #4732: URL: https://github.com/apache/datafusion-comet/pull/4732
## Which issue does this PR close? Closes #4729 ## Rationale for this change This PR is the result of auditing the native window function support added in #4209 (covering ranking, value-shift, and standard aggregate window functions). The audit walked the Scala serde (`CometWindowExec`), the native planner (`create_window_expr` / `process_agg_func`), and the DataFusion 54 window/aggregate accumulators, and verified behavior empirically through the differential SQL-file test harness. The audit found one correctness divergence. `SUM(<decimal>)` over a sliding window frame (a frame whose lower bound is not `UNBOUNDED PRECEDING`) routes to DataFusion's built-in `sum`, whose accumulator wraps on overflow (`add_wrapping`) instead of returning Spark's NULL. On overflow the native result is a wrapped value that can fall outside the declared decimal precision, which even breaks Spark's result decoding (`EXPRESSION_DECODING_FAILED` / `NUMERIC_VALUE_OUT_OF_RANGE`). The divergence is decimal-specific: bigint sliding `SUM` overflow matches Spark (both wrap), and ever-expanding decimal `SUM` overflow already matches Spark (Comet's `SumDecimal` UDAF returns NULL). ## What changes are included in this PR? - Fall back to Spark for decimal `SUM` / `AVG` over a sliding window frame in `CometWindowExec`. Overflow cannot be detected at plan time, so the whole sliding decimal case falls back, mirroring the existing RANGE-frame DATE/DECIMAL fallbacks. Ever-expanding frames keep using Comet's overflow-aware `SumDecimal` / `AvgDecimal` UDAFs and continue to run natively. - Add regression tests in `windows/window_functions.sql` (Section 8): ever-expanding decimal `SUM` overflow stays native and returns NULL, and sliding decimal `SUM` falls back. - Record the findings in the `window_funcs` expression-audit notes. The audit also surfaced a coverage gap that is not a correctness divergence and is tracked separately in #4731: `AVG(<decimal>)` over a window always falls back to Spark on Spark 4.x because `CometWindowExec` does not recognize the `Cast(Divide(...))` average shape, leaving the `AvgDecimal` native window branch dead there. Results stay correct via fallback. The fallback guard added here also covers `AVG` so the overflow class of bug cannot reappear once that shape is recognized. This work was scaffolded by the `audit-comet-expression` project skill. ## How are these changes tested? - New and existing cases in `windows/window_functions.sql` run through `CometSqlFileTestSuite`, which compares Comet against Spark. - `CometWindowExecSuite` (49 tests) passes unchanged. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
