[ 
https://issues.apache.org/jira/browse/SPARK-57727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-57727:
-----------------------------------
    Labels: pull-request-available  (was: )

> Fix inferAdditionalConstraints incorrectly substituting attributes with 
> non-binary-stable collations
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-57727
>                 URL: https://issues.apache.org/jira/browse/SPARK-57727
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 5.0.0
>            Reporter: Eric Yang
>            Priority: Major
>              Labels: pull-request-available
>
> Given an equality a = b and a predicate on a, 
> `QueryPlanConstraints.inferAdditionalConstraints` infers the same predicate 
> on b by substituting attribute a with b. Under a non-binary-stable 
> collation, a = b is a collation equality (e.g. 'hello' = 'HELLO' under 
> UTF8_LCASE), not byte equality, so a and b are not interchangeable: the 
> inferred predicate can be false for rows that satisfy the original filter, 
> and those rows are silently dropped.
> Repro:
> {code:sql}
> CREATE TABLE t (a STRING COLLATE UTF8_LCASE, b STRING COLLATE UTF8_LCASE);
> INSERT INTO t VALUES ('hello', 'HELLO');
> SELECT a, b FROM t WHERE a = b AND a = 'hello' COLLATE UTF8_BINARY;
> {code}
> With constraint propagation enabled (the default), the query returns no 
> rows; it should return ('hello','HELLO').
> This is the same class of bug as SPARK-55647 (fixed in ConstantPropagation); 
> this sibling rule was left unguarded.
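
The effect of the substitution on the repro row can be illustrated with a small toy model. This is only a sketch, not Spark code: the helper names (`lcase_eq`, `binary_eq`, `original_filter`, `with_inferred_constraint`) are hypothetical, and Python's `str.lower()` is assumed as a rough stand-in for UTF8_LCASE comparison. The inferred predicate is assumed to be `b = 'hello' COLLATE UTF8_BINARY`, i.e. the predicate on a with a replaced by b.

```python
# Toy model of the issue; not Spark code. All names here are hypothetical.

def lcase_eq(x: str, y: str) -> bool:
    """Rough stand-in for UTF8_LCASE equality: compare after lowercasing."""
    return x.lower() == y.lower()


def binary_eq(x: str, y: str) -> bool:
    """Stand-in for UTF8_BINARY equality: exact string match."""
    return x == y


# The single row inserted by the repro: (a, b) = ('hello', 'HELLO').
rows = [("hello", "HELLO")]


def original_filter(a: str, b: str) -> bool:
    # WHERE a = b (UTF8_LCASE) AND a = 'hello' (UTF8_BINARY)
    return lcase_eq(a, b) and binary_eq(a, "hello")


def with_inferred_constraint(a: str, b: str) -> bool:
    # Same filter plus the assumed inferred predicate b = 'hello'
    # (UTF8_BINARY), obtained by substituting a with b.
    return original_filter(a, b) and binary_eq(b, "hello")


expected = [r for r in rows if original_filter(*r)]
actual = [r for r in rows if with_inferred_constraint(*r)]

print(expected)  # [('hello', 'HELLO')]
print(actual)    # []
```

In this model the original filter keeps the row, while the extra predicate rejects it because 'HELLO' is not byte-equal to 'hello'. That matches the empty result described in the report.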


