andygrove opened a new pull request, #4453: URL: https://github.com/apache/datafusion-comet/pull/4453
## Which issue does this PR close? Closes #. ## Rationale for this change While reviewing #4448 it became clear that the audit skill should produce captured tests for any correctness divergence it identifies, not just prose findings. Tests in `query ignore(...)` mode are the most useful long-lived artefact: they reproduce the bug for whoever picks up the fix, and they auto-activate when the `ignore(...)` is removed. This PR updates the `audit-comet-expression` skill to require that workflow, and applies it retroactively to three correctness findings from the #4448 audit. **Stacked on #4448.** The diff currently shows that PR's docs commits too; they will drop out when #4448 merges and this branch is rebased. ## What changes are included in this PR? **Skill update** (`audit-comet-expression`): - Step 6 reorganises the priorities into three named buckets: correctness divergences, missing coverage, and consistency issues. - Step 7 now requires correctness findings to be captured as SQL file tests in the same PR as the audit, walks through the "search for existing issue, file if missing" workflow, and gives the trivial-fix-vs-`ignore` decision rule with a concrete `query ignore(<url>)` example. **Captured tests for #4448 findings:** | Test | Bug | Issue | | --- | --- | --- | | `next_day.sql` (appended) | Comet trims `' MO '` to match `Monday`; Spark does not trim and returns NULL | #4450 | | `next_day_ansi.sql` (new) | `next_day` returns NULL where Spark throws under `spark.sql.ansi.enabled=true` | #4449 | | `make_date_ansi.sql` (new) | `make_date` returns NULL where Spark throws under `spark.sql.ansi.enabled=true` | #4451 | All three are in `query ignore(<issue>)` mode. When the upstream fix lands, removing `ignore(...)` will activate the assertions. **Audit finding withdrawn:** The `make_date` "year 0 / negative years" finding was checked against Spark's own `MakeDate.nullSafeEval`, which uses `LocalDate.of(year, month, day)` and accepts any year in the full Java `LocalDate` range. Spark and chrono agree on this case, and the existing `make_date.sql` tests at lines 133/136 already exercise it. Issue #4452 was closed as not-a-bug. **Why not fix instead of ignore?** Both `SparkNextDay` and `SparkMakeDate` live in the upstream `datafusion-spark` crate (pinned at 53.1.0 in `Cargo.lock`). Fixing the ANSI throw and the trim behaviour requires changes there, which is outside the scope of this PR. ## How are these changes tested? The new SQL test files parse against `SqlFileTestParser` (the `IgnorePattern`, `ConfigPattern`, and `MinSparkVersionPattern` regexes all match the directives used). The `query ignore(...)` queries are skipped at runtime per the documented `ignore` mode contract in `sql-file-tests.md`, so they do not affect suite outcomes today. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
