andygrove opened a new issue, #4646: URL: https://github.com/apache/datafusion-comet/issues/4646
## Describe the bug

Spark 4.0 widens many string-typed `inputTypes` on datetime expressions to `StringTypeWithCollation(supportsTrimCollation = true)`. The affected datetime expressions include `convert_timezone`, `date_format`, `date_trunc`, `from_unixtime`, `make_timestamp`, `next_day`, `to_unix_timestamp`, `trunc`, and `unix_timestamp`.

Today the Comet serdes for these expressions accept those string inputs without distinguishing the collation, so non-default collations are silently treated as compatible. Per the `audit-comet-expression` skill (rule 11), a non-default collation on a string input should flip the support level to `Incompatible(Some(...))` so the divergence is visible in EXPLAIN and the auto-generated compatibility guide, and so the projection falls back rather than producing potentially divergent results.

## Steps to reproduce

On Spark 4.0, apply a non-default collation (for example `UTF8_LCASE` or `UNICODE_CI`) to a string argument of one of the datetime expressions above and observe that Comet still runs the expression natively without distinguishing the collation.

## Expected behavior

Non-default collations on string inputs to these datetime expressions should report `Incompatible(Some(...))` (falling back unless explicitly opted in), consistent with how other expressions gate collation.

## Additional context

Split out from the high-priority list in #4502 (item 5, originally tracked as medium priority) so that #4502 can be closed once the remaining fixes land. Cross-references #2190 and #4496.
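
For illustration only, the gating rule requested under "Expected behavior" can be modeled in a few lines. This is a language-neutral sketch in Python, not Comet code: the real serdes are Scala, and `StringInput`, `SupportLevel`, `support_level`, and `runs_natively` are hypothetical names invented here. It assumes `UTF8_BINARY` is the default collation and that any other collation name, including trim variants, counts as non-default.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumption: UTF8_BINARY is the default collation; anything else is non-default.
DEFAULT_COLLATION = "UTF8_BINARY"


@dataclass(frozen=True)
class StringInput:
    """Hypothetical stand-in for one string-typed argument of an expression."""
    name: str
    collation: str = DEFAULT_COLLATION


@dataclass(frozen=True)
class SupportLevel:
    """Hypothetical model of a support level with an optional note."""
    kind: str  # "Compatible" or "Incompatible"
    note: Optional[str] = None


def support_level(expr_name: str, string_inputs: Sequence[StringInput]) -> SupportLevel:
    """Report Incompatible (with a note) if any string input is non-default collated."""
    non_default = [s for s in string_inputs if s.collation.upper() != DEFAULT_COLLATION]
    if non_default:
        details = ", ".join(f"{s.name} ({s.collation})" for s in non_default)
        return SupportLevel("Incompatible", f"{expr_name}: non-default collation on {details}")
    return SupportLevel("Compatible")


def runs_natively(level: SupportLevel, allow_incompatible: bool = False) -> bool:
    """Incompatible expressions fall back unless the user explicitly opts in."""
    return level.kind == "Compatible" or allow_incompatible


if __name__ == "__main__":
    level = support_level("date_format", [StringInput("format", "UTF8_LCASE")])
    print(level.kind, level.note, runs_natively(level))
```

In this model the note attached to the `Incompatible` result plays the role of the message that would surface in EXPLAIN and the compatibility guide, and `allow_incompatible` stands in for whatever explicit opt-in the project exposes.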
