andygrove opened a new pull request, #4183: URL: https://github.com/apache/datafusion-comet/pull/4183
## Which issue does this PR close?

N/A. Audit-driven test coverage; no behavior change.

## Rationale for this change

`spark.sql.legacy.timeParserPolicy` (`LEGACY` / `CORRECTED` / `EXCEPTION`) controls which datetime parser Spark uses and changes results materially on lenient inputs and ambiguous patterns. No existing Comet SQL test exercises this config, so we have no regression net for the seven expressions that read it. This PR closes that gap.

## What changes are included in this PR?

For each Spark expression that reads the policy (`date_format`, `from_unixtime`, `unix_timestamp`, `to_unix_timestamp`, `to_timestamp`/`to_timestamp_ntz`, `to_date`, and Spark 4's `try_to_timestamp`):

- A ConfigMatrix file that runs convergent inputs under `LEGACY`, `CORRECTED`, and `EXCEPTION`.
- Per-policy files (`*_legacy.sql`, `*_corrected.sql`, `*_exception.sql`) covering divergent inputs: single-digit fields under fixed-width patterns, out-of-range month/day, trailing characters, legacy-only pattern tokens like `aaaa`, and the `INCONSISTENT_BEHAVIOR_CROSS_VERSION` exception paths.

A new contributor-guide page `spark_configs_support.md` mirrors the expression audit log: it tracks Spark configs that affect Comet behavior and records the full audit notes for `spark.sql.legacy.timeParserPolicy` (source semantics, affected expressions, current Comet status, test layout, findings).

This PR was scaffolded with the project's `audit-comet-expression` workflow extended to a config-level audit, plus the `superpowers:brainstorming` and `superpowers:using-git-worktrees` skills.

## How are these changes tested?

`CometSqlFileTestSuite` runs the 42 generated test cases through both Spark and Comet and compares results.
Verified locally:

- `./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite time_parser_policy" -Dtest=none` -- 42/42 pass on Spark 3.5.8 (default).
- `./mvnw test -Pspark-3.4 -Dsuites="org.apache.comet.CometSqlFileTestSuite time_parser_policy" -Dtest=none` -- 42/42 pass.
- `./mvnw test -Pspark-4.0 -Dsuites="org.apache.comet.CometSqlFileTestSuite try_to_timestamp_time_parser_policy" -Dtest=none` -- 6/6 pass; `to_timestamp_time_parser_policy_exception` also verified on 4.0.

No Comet bugs were uncovered by the audit.
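For readers unfamiliar with why some inputs are "divergent", here is a minimal, JDK-only sketch of two of the cases mentioned in the PR body (single-digit fields under a fixed-width pattern, and trailing characters). It is not code from this PR, Spark, or Comet. It assumes only the documented Spark behavior that `LEGACY` is backed by `java.text.SimpleDateFormat` and `CORRECTED` by `java.time.format.DateTimeFormatter`. The class name, method names, outcome labels, and the `setLenient(false)` choice are hypothetical and exist only for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Illustrative only: compares the two JDK parser families on the same pattern.
final class TimeParserPolicySketch {
    static final String PATTERN = "yyyy-MM-dd";

    // Legacy-style parser. Even with lenient mode off, SimpleDateFormat
    // accepts single-digit fields and ignores text after the parsed date.
    static boolean legacyAccepts(String input) {
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN);
        fmt.setLenient(false); // assumption made for this sketch
        try {
            fmt.parse(input);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    // java.time parser: "MM" and "dd" are fixed-width, and the whole
    // input must be consumed.
    static boolean correctedAccepts(String input) {
        try {
            LocalDate.parse(input, DateTimeFormatter.ofPattern(PATTERN));
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Rough model of the EXCEPTION policy: flag inputs that only the
    // legacy parser accepts. Outcome labels are hypothetical.
    static String exceptionPolicyOutcome(String input) {
        if (correctedAccepts(input)) {
            return "PARSED";
        }
        return legacyAccepts(input) ? "UPGRADE_ERROR" : "UNPARSEABLE";
    }

    public static void main(String[] args) {
        String[] inputs = {"2020-01-05", "2020-1-5", "2020-01-05abc", "not-a-date"};
        for (String s : inputs) {
            System.out.println(s
                + " legacy=" + legacyAccepts(s)
                + " corrected=" + correctedAccepts(s)
                + " exceptionPolicy=" + exceptionPolicyOutcome(s));
        }
    }
}
```

Inputs on which both parsers agree correspond to the "convergent" cases that a single ConfigMatrix file can cover. Inputs on which they disagree need per-policy expectations, which is the motivation for the separate `*_legacy.sql`, `*_corrected.sql`, and `*_exception.sql` files.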
