mihailom-db opened a new pull request, #45383: URL: https://github.com/apache/spark/pull/45383
### What changes were proposed in this pull request?

This PR adds automatic casting and collation resolution, following `PGSQL` behaviour:

1. Collations set at the metadata level are implicit.
2. Collations set using the `COLLATE` expression are explicit.
3. When an expression combines inputs with multiple collations, the output is determined as follows:
   - If there are explicit collations and all of them are equal, that collation is the output.
   - If there are multiple different explicit collations, `COLLATION_MISMATCH.EXPLICIT` is thrown.
   - If there are no explicit collations and only a single non-default collation, that one is used.
   - If there are no explicit collations and multiple non-default implicit ones, `COLLATION_MISMATCH.IMPLICIT` is thrown.

In addition, `INDETERMINATE_COLLATION` should only be thrown on comparison operations, and we should be able to combine different implicit collations for certain operations such as concat, and possibly others in the future. For this reason I added another predefined collation id, `INDETERMINATE_COLLATION_ID`, which means that the result is a combination of conflicting non-default implicit collations. Right now it has an id of -1, so it fails if it ever reaches the `CollatorFactory`.

### Why are the changes needed?

We need to be able to compare columns and values with different collations, and to provide a way of explicitly changing the collation we want to use.

### Does this PR introduce _any_ user-facing change?

Yes. We add 3 new errors and enable collation casting.

### How was this patch tested?

Tests in `CollationSuite` check the new behaviour.

### Was this patch authored or co-authored using generative AI tooling?

No.
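The resolution rules listed in the first section can be sketched as a small standalone function. This is only an illustrative Python model, not the code in this PR: `Operand`, `resolve_collation`, `CollationMismatch`, and the `DEFAULT` placeholder name are hypothetical, and the indeterminate-collation case for operations such as concat is deliberately left out.

```python
from dataclasses import dataclass

# Hypothetical placeholder for whatever the default collation is called.
DEFAULT_COLLATION = "DEFAULT"


class CollationMismatch(Exception):
    """Stand-in for the COLLATION_MISMATCH.* errors named in the description."""

    def __init__(self, kind, collations):
        super().__init__(f"COLLATION_MISMATCH.{kind}: {sorted(collations)}")
        self.kind = kind


@dataclass(frozen=True)
class Operand:
    collation: str
    explicit: bool = False  # True when the collation comes from a COLLATE expression


def resolve_collation(operands):
    """Return the output collation for a list of operands, or raise on conflict."""
    explicit = {op.collation for op in operands if op.explicit}
    if explicit:
        # Explicit collations take priority over implicit ones.
        if len(explicit) == 1:
            return next(iter(explicit))
        raise CollationMismatch("EXPLICIT", explicit)

    # No explicit collations: only non-default implicit collations matter.
    implicit = {op.collation for op in operands if op.collation != DEFAULT_COLLATION}
    if not implicit:
        return DEFAULT_COLLATION
    if len(implicit) == 1:
        return next(iter(implicit))
    raise CollationMismatch("IMPLICIT", implicit)
```

For example, under these assumptions `resolve_collation([Operand("UNICODE"), Operand("UNICODE_CI", explicit=True)])` returns `"UNICODE_CI"`, because the explicit collation wins over the implicit one, while two different non-default implicit collations raise the `IMPLICIT` mismatch. The collation names here are used only for illustration.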
