[ https://issues.apache.org/jira/browse/SPARK-57727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-57727:
-----------------------------------
Labels: pull-request-available (was: )
> Fix inferAdditionalConstraints incorrectly substituting attributes with
> non-binary-stable collations
> ----------------------------------------------------------------------------------------------------
>
> Key: SPARK-57727
> URL: https://issues.apache.org/jira/browse/SPARK-57727
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 5.0.0
> Reporter: Eric Yang
> Priority: Major
> Labels: pull-request-available
>
> `QueryPlanConstraints.inferAdditionalConstraints` takes an equality a = b
> together with a predicate on a, and infers the same predicate on b by
> substituting attribute a with b. Under a non-binary-stable collation, a = b is
> a collation equality (e.g. 'hello' = 'HELLO' under UTF8_LCASE), not byte
> equality, so the substituted predicate need not hold for b. Applying it as a
> filter silently drops rows.
> Repro:
> {code:sql}
> CREATE TABLE t (a STRING COLLATE UTF8_LCASE, b STRING COLLATE UTF8_LCASE);
> INSERT INTO t VALUES ('hello', 'HELLO');
> SELECT a, b FROM t WHERE a = b AND a = 'hello' COLLATE UTF8_BINARY;
> {code}
> Returns no rows with constraint propagation enabled (default); should return
> ('hello','HELLO').
> Same class as SPARK-55647 (fixed in ConstantPropagation); this sibling rule
> was left unguarded.
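
The failure mode described above can be illustrated outside Spark with a toy
model. The sketch below is not Spark code: it assumes UTF8_LCASE equality can
be approximated by comparing lowercased strings and UTF8_BINARY equality by
exact string comparison, and the function names are hypothetical. It applies
the repro's filter to the single row, once as written and once with the
constraint obtained by substituting a with b.

```python
# Toy model only (assumption): UTF8_LCASE equality is approximated by comparing
# lowercased strings, UTF8_BINARY equality by exact string comparison.
def lcase_eq(x: str, y: str) -> bool:
    return x.lower() == y.lower()


def binary_eq(x: str, y: str) -> bool:
    return x == y


def original_filter(a: str, b: str) -> bool:
    # WHERE a = b AND a = 'hello' COLLATE UTF8_BINARY
    return lcase_eq(a, b) and binary_eq(a, "hello")


def filter_with_inferred_constraint(a: str, b: str) -> bool:
    # Same filter plus the constraint produced by substituting a with b:
    # b = 'hello' COLLATE UTF8_BINARY
    return original_filter(a, b) and binary_eq(b, "hello")


rows = [("hello", "HELLO")]  # the row inserted by the repro

kept_original = [r for r in rows if original_filter(*r)]
kept_inferred = [r for r in rows if filter_with_inferred_constraint(*r)]

print("original filter keeps:", kept_original)
print("with inferred constraint keeps:", kept_inferred)
```

In this model the row passes the original filter but fails the inferred
constraint, because a = b only establishes case-insensitive equality while the
substituted predicate compares b byte for byte.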