andygrove opened a new pull request, #4439: URL: https://github.com/apache/datafusion-comet/pull/4439
## Which issue does this PR close?

N/A. Autonomous audit pass.

## Rationale for this change

Audit of the `Average` (`avg`) aggregate expression against Spark 3.4.3, 3.5.8, and 4.0.1. The aggregate logic is identical across all three versions (4.0.1 only changes a `QueryContext` import path). The Comet serde and the Rust `Avg` / `AvgDecimal` accumulators correctly handle numeric and decimal inputs, including ANSI-mode decimal overflow. The audit found one inaccurate user-facing string in the serde and several uncovered edge cases that are now exercised.

## What changes are included in this PR?

- Audit sub-bullets in `spark_expressions_support.md` recording dates and the per-version finding for 3.4.3, 3.5.8, and 4.0.1.
- Corrected `CometAverage.getIncompatibleReasons` text. The previous text claimed "Falls back to Spark in ANSI mode. Supports all numeric inputs except decimal types", neither of which is accurate: ANSI mode is wired through to the native `AvgDecimal` accumulator, and decimal inputs are supported via `avgDataTypeSupported`. The new text describes the real caveat: Comet falls back to Spark for `YearMonthIntervalType` and `DayTimeIntervalType` inputs (which Spark supports since 3.4).
- Expanded `expressions/aggregate/avg.sql` with new SQL test cases:
  - single-row group
  - tinyint and smallint inputs
  - all-NULL groups
  - empty input
  - double NaN / +Infinity / -Infinity mixes
  - Long boundary values
  - negative-only inputs
  - decimal at precision 20
  - cross-check against `count`

## How are these changes tested?

- `./mvnw test -DwildcardSuites=CometSqlFileTestSuite -Dsuites="org.apache.comet.CometSqlFileTestSuite avg" -Dtest=none` (passes locally; all new queries match Spark)

Scaffolded by the `audit-comet-expression-autonomous` skill.
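For readers who want a feel for what the new `avg.sql` edge cases check, here is a minimal Python sketch of the expected averaging behavior for non-decimal inputs. It is not Comet or Spark code. The function name `avg_model` is hypothetical, and the sketch assumes that non-decimal inputs are accumulated as doubles, that NULLs are skipped, and that a group with no non-NULL values yields NULL. Decimal and interval inputs are not modeled.

```python
import math
from typing import Iterable, Optional


def avg_model(values: Iterable[Optional[float]]) -> Optional[float]:
    """Hypothetical reference model of avg over non-decimal inputs.

    Assumptions (not taken from the Comet source): the running sum is a
    double, NULLs (None) are ignored, and an empty or all-NULL group
    returns NULL (None).
    """
    total = 0.0
    count = 0
    for v in values:
        if v is None:
            continue  # NULL inputs do not contribute to the sum or the count
        total += float(v)  # integral inputs are widened to double first
        count += 1
    return total / count if count else None


# A few of the edge cases named in the PR description.
empty = avg_model([])                          # empty input
all_null = avg_model([None, None])             # all-NULL group
single = avg_model([7])                        # single-row group
negatives = avg_model([-3, -5])                # negative-only inputs
with_nan = avg_model([1.0, float("nan")])      # NaN propagates
mixed_inf = avg_model([float("inf"), float("-inf")])  # +Inf plus -Inf gives NaN
long_max = avg_model([2**63 - 1, 2**63 - 1])   # Long boundary, no integer overflow
```

Because the sum is carried as a double in this model, the Long boundary case loses precision rather than overflowing, which is the behavior the sketch assumes the SQL tests compare against Spark.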
