andygrove opened a new pull request, #4183:
URL: https://github.com/apache/datafusion-comet/pull/4183

   ## Which issue does this PR close?
   
   N/A. Audit-driven test coverage; no behavior change.
   
   ## Rationale for this change
   
   `spark.sql.legacy.timeParserPolicy` (`LEGACY` / `CORRECTED` / `EXCEPTION`) 
controls which datetime parser Spark uses, and the choice materially changes 
results for lenient inputs and ambiguous patterns. No existing Comet SQL test 
exercises this config, so there is no regression safety net for the seven 
expressions that read it. This PR closes that gap.
   
   ## What changes are included in this PR?
   
   For each Spark expression that reads the policy (`date_format`, 
`from_unixtime`, `unix_timestamp`, `to_unix_timestamp`, 
`to_timestamp`/`to_timestamp_ntz`, `to_date`, and Spark 4's `try_to_timestamp`):
   
   - A ConfigMatrix file that runs convergent inputs under `LEGACY`, 
`CORRECTED`, and `EXCEPTION`.
   - Per-policy files (`*_legacy.sql`, `*_corrected.sql`, `*_exception.sql`) 
covering divergent inputs: single-digit fields under fixed-width patterns, 
out-of-range month/day, trailing characters, legacy-only pattern tokens like 
`aaaa`, and the `INCONSISTENT_BEHAVIOR_CROSS_VERSION` exception paths.
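
To make the convergent/divergent split concrete, here is a toy sketch in plain Python. It is an analogy only: `parse_strict` and `parse_lenient` are hypothetical helpers that stand in for "a strict parser" and "a lenient parser", not Spark's `CORRECTED` or `LEGACY` implementations, and the rollover rule in the lenient one is an assumption chosen for illustration. An input belongs in the ConfigMatrix file when both parsers agree, and in the per-policy files when they do not.

```python
import re
from datetime import date, timedelta
from typing import Optional


def parse_strict(text: str) -> Optional[date]:
    """Toy strict parser: fixed-width yyyy-MM-dd only, nothing trailing."""
    m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", text)
    if m is None:
        return None
    year, month, day = map(int, m.groups())
    try:
        return date(year, month, day)
    except ValueError:  # out-of-range month or day
        return None


def parse_lenient(text: str) -> Optional[date]:
    """Toy lenient parser: 1-2 digit fields, trailing text ignored,
    out-of-range month/day rolled forward (assumed rule, for illustration)."""
    m = re.match(r"(\d{4})-(\d{1,2})-(\d{1,2})", text)
    if m is None:
        return None
    year, month, day = map(int, m.groups())
    year += (month - 1) // 12          # month 13 becomes January of next year
    month = (month - 1) % 12 + 1
    try:
        return date(year, month, 1) + timedelta(days=day - 1)
    except (ValueError, OverflowError):  # year outside Python's date range
        return None


def classify(text: str) -> str:
    """'convergent' if both toy parsers agree, otherwise 'divergent'."""
    return "convergent" if parse_strict(text) == parse_lenient(text) else "divergent"


if __name__ == "__main__":
    for sample in ["2020-01-05", "2020-1-5", "2020-13-01", "2020-01-05abc"]:
        print(sample, classify(sample), parse_strict(sample), parse_lenient(sample))
```

The four samples correspond to a well-formed input plus three of the divergent categories listed above (single-digit fields, out-of-range month, trailing characters). Whether Spark's own parsers diverge on any given input is exactly what the per-policy test files pin down.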
   
   A new contributor-guide page `spark_configs_support.md` mirrors the 
expression audit log: it tracks Spark configs that affect Comet behavior and 
records the full audit notes for `spark.sql.legacy.timeParserPolicy` (source 
semantics, affected expressions, current Comet status, test layout, findings).
   
   This PR was scaffolded with the project's `audit-comet-expression` workflow 
extended to a config-level audit, plus the `superpowers:brainstorming` and 
`superpowers:using-git-worktrees` skills.
   
   ## How are these changes tested?
   
   `CometSqlFileTestSuite` runs the 42 generated test cases through both Spark 
and Comet and compares results. Verified locally:
   
   - `./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite 
time_parser_policy" -Dtest=none` -- 42/42 pass on Spark 3.5.8 (default).
   - `./mvnw test -Pspark-3.4 -Dsuites="org.apache.comet.CometSqlFileTestSuite 
time_parser_policy" -Dtest=none` -- 42/42 pass.
   - `./mvnw test -Pspark-4.0 -Dsuites="org.apache.comet.CometSqlFileTestSuite 
try_to_timestamp_time_parser_policy" -Dtest=none` -- 6/6 pass; 
`to_timestamp_time_parser_policy_exception` also verified on 4.0.
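
For reviewers checking the counts, one reading that is consistent with the reported 42 and 6/6 figures is sketched below. It assumes each ConfigMatrix file expands into one test case per policy, and that `to_timestamp`/`to_timestamp_ntz` share one group of files. The PR text does not spell out this breakdown, so treat it as an assumption. The matrix file name and the bracketed case labels are hypothetical; only the `*_time_parser_policy_<policy>` naming is taken from the suite names above.

```python
POLICIES = ["LEGACY", "CORRECTED", "EXCEPTION"]

# Seven expression groups; to_timestamp/to_timestamp_ntz counted once (assumption).
EXPRESSIONS = [
    "date_format",
    "from_unixtime",
    "unix_timestamp",
    "to_unix_timestamp",
    "to_timestamp",
    "to_date",
    "try_to_timestamp",
]


def cases_for(expression: str) -> list[str]:
    """Enumerate the assumed test cases for one expression group."""
    base = f"{expression}_time_parser_policy"
    # One ConfigMatrix file, run once per policy (label format is hypothetical).
    matrix_cases = [f"{base} [{policy}]" for policy in POLICIES]
    # One per-policy file for each policy's divergent inputs.
    per_policy_cases = [f"{base}_{policy.lower()}" for policy in POLICIES]
    return matrix_cases + per_policy_cases


ALL_CASES = [case for expr in EXPRESSIONS for case in cases_for(expr)]

if __name__ == "__main__":
    print(len(cases_for("try_to_timestamp")))  # cases per expression group
    print(len(ALL_CASES))                      # total across all groups
```

Under these assumptions each expression group contributes six cases (three matrix runs plus three per-policy files), and seven groups give 42.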
   
   No Comet bugs were uncovered by the audit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

