manuzhang opened a new pull request, #4701: URL: https://github.com/apache/datafusion-comet/pull/4701
## Which issue does this PR close?

Part of apache/datafusion-comet#4505.

## Rationale for this change

`str_to_map` should fall back to Spark when collation-aware string splitting is required. Comet's native path only matches Spark behavior for default `UTF8_BINARY` string inputs and delimiters.

## What changes are included in this PR?

- Marks `StringToMap` expressions with non-`UTF8_BINARY` collations as incompatible.
- Enumerates both collation and legacy-truncate incompatibility reasons for `str_to_map`.
- Adds Spark 4.0+ SQL fallback coverage for collated input, pair delimiters, and key-value delimiters.

## How are these changes tested?

- Adds SQL tests `str_to_map_collation.sql`
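To illustrate why a binary-only native path cannot simply be reused for collated inputs, here is a minimal Python sketch of the behavioral difference. It is a toy model, not Spark's or Comet's implementation: the function names are hypothetical, and case-insensitive matching is only an approximation of a non-binary collation such as `UTF8_LCASE`.

```python
import re


def str_to_map_binary(text, pair_delim=",", kv_delim=":"):
    """Toy model of binary splitting: delimiters must match exactly."""
    result = {}
    for pair in text.split(pair_delim):
        key, sep, value = pair.partition(kv_delim)
        # A pair without a key-value delimiter maps the key to None.
        result[key] = value if sep else None
    return result


def str_to_map_case_insensitive(text, pair_delim=",", kv_delim=":"):
    """Toy model of collation-aware splitting (assumption: the collation
    is approximated by ignoring case when matching delimiters)."""
    pair_re = re.compile(re.escape(pair_delim), re.IGNORECASE)
    kv_re = re.compile(re.escape(kv_delim), re.IGNORECASE)
    result = {}
    for pair in pair_re.split(text):
        parts = kv_re.split(pair, maxsplit=1)
        result[parts[0]] = parts[1] if len(parts) == 2 else None
    return result


text = "k1=1Xk2=2xk3=3"

# Binary matching only sees the lowercase "x" as a pair delimiter.
binary = str_to_map_binary(text, pair_delim="x", kv_delim="=")

# Case-insensitive matching also splits on the uppercase "X".
collated = str_to_map_case_insensitive(text, pair_delim="x", kv_delim="=")

print(binary)
print(collated)
```

The two functions return different maps for the same input, which is the kind of mismatch that falling back to Spark avoids when any argument carries a non-`UTF8_BINARY` collation.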
