Vladimir Golubev created SPARK-51428:
----------------------------------------
Summary: Implicit aliases of collated trees are assigned
non-deterministically
Key: SPARK-51428
URL: https://issues.apache.org/jira/browse/SPARK-51428
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Vladimir Golubev
Consider the following collated queries and their schemas:
1.
```
SELECT 'a' COLLATE UTF8_LCASE < 'A'
->
(collate(a, UTF8_LCASE) < 'A' collate UTF8_LCASE)
```
2.
```
SELECT CONCAT_WS('a', col1, col1) FROM VALUES ('a' COLLATE UTF8_LCASE)
->
concat_ws(a, col1, col1)
```
In case 1, the implicit alias marks the 'A' literal as collated, which is
correct. In case 2, however, the 'a' literal is not marked as collated in
the implicit alias, even though `CollationTypeCoercion` does collate it.
The output schema for case 2 should be
`concat_ws('a' collate UTF8_LCASE, col1, col1)`.
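
To make the mismatch concrete, here is a toy model in plain Python. It is not Spark code: `Literal`, `Column`, `Call` and `coerce` are hypothetical stand-ins, and it assumes the implicit alias is simply the printed form of the expression tree. Under that assumption, the name generated for case 2 depends on whether the tree is printed before or after the collation is pushed onto the literal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    value: str
    collation: Optional[str] = None  # None models the default collation

    def name(self) -> str:
        # Default-collation literals print bare, collated ones print quoted
        # with their collation, mirroring the aliases quoted in this ticket.
        if self.collation is None:
            return self.value
        return f"'{self.value}' collate {self.collation}"


@dataclass(frozen=True)
class Column:
    ident: str

    def name(self) -> str:
        return self.ident


@dataclass(frozen=True)
class Call:
    func: str
    args: tuple

    def name(self) -> str:
        return f"{self.func}({', '.join(a.name() for a in self.args)})"


def coerce(expr, collation: str):
    """Hypothetical stand-in for collation coercion: tag bare literals."""
    if isinstance(expr, Literal) and expr.collation is None:
        return Literal(expr.value, collation)
    if isinstance(expr, Call):
        return Call(expr.func, tuple(coerce(a, collation) for a in expr.args))
    return expr


# Case 2 from this ticket: CONCAT_WS('a', col1, col1) over a UTF8_LCASE column.
tree = Call("concat_ws", (Literal("a"), Column("col1"), Column("col1")))

alias_before = tree.name()                        # name of the uncoerced tree
alias_after = coerce(tree, "UTF8_LCASE").name()   # name of the coerced tree

print(alias_before)
print(alias_after)
```

In this sketch `alias_before` matches the schema currently reported for case 2 and `alias_after` matches the expected one. Whether the real discrepancy comes from the point at which the name is computed is an assumption of the model, not something this ticket establishes.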