[ 
https://issues.apache.org/jira/browse/CALCITE-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-7199:
-------------------------------
    Summary: Improve column uniqueness computation for Join  (was: 
RelMdColumnUniqueness joins handler is invalid for columns in both sides)

> Improve column uniqueness computation for Join
> ----------------------------------------------
>
>                 Key: CALCITE-7199
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7199
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.40.0
>            Reporter: Claude Brisson
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.41.0
>
>
> {{RelMdColumnUniqueness.areColumnsUnique()}} handler for joins contains the 
> following code:
> {code}
>     // If the original column mask contains columns from both the left and
>     // right hand side, then the columns are unique if and only if they're
>     // unique for their respective join inputs
>     Boolean leftUnique = mq.areColumnsUnique(left, leftColumns, ignoreNulls);
>     Boolean rightUnique = mq.areColumnsUnique(right, rightColumns, ignoreNulls);
>     if ((leftColumns.cardinality() > 0)
>         && (rightColumns.cardinality() > 0)) {
>       if ((leftUnique == null) || (rightUnique == null)) {
>         return null;
>       } else {
>         return leftUnique && rightUnique;
>       }
>     }
> {code}
> This is not correct: uniqueness on both sides is a sufficient condition for 
> the columns to be unique in the join result, but not a necessary one, so the 
> "if and only if" in the comment does not hold.
> The columns are also unique in the join result if all of the following 
> conditions are met:
> * the queried columns coming from the left input are unique
> * the columns involved in the right side of the join equi-condition are unique
> * the join does not generate nulls on the left
> (and likewise with left and right reversed).
> Joining on a unique key of the right side guarantees that each left row 
> matches at most one right row, so the uniqueness of the queried left columns 
> is preserved; adding further columns to the queried subset keeps it unique.
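
The argument above can be checked on a tiny data set. The following is a minimal, self-contained Java sketch that uses plain collections rather than any Calcite classes. The tables emp and dept, the class JoinUniquenessExample, and the helpers isUnique and innerJoin are hypothetical names invented for this illustration; none of them come from Calcite or from the issue. It shows a case where the queried right-hand column is not unique, yet the queried columns are unique in the join result because the left columns are unique and the join key on the right is unique.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: hypothetical helper class, not Calcite code.
class JoinUniquenessExample {

  /** Returns true if no two rows agree on all of the given column indexes. */
  static boolean isUnique(List<Object[]> rows, int... cols) {
    Set<List<Object>> seen = new HashSet<>();
    for (Object[] row : rows) {
      List<Object> key = new ArrayList<>();
      for (int c : cols) {
        key.add(row[c]);
      }
      if (!seen.add(key)) {
        return false; // this combination of values was already seen
      }
    }
    return true;
  }

  /** Inner equi-join on left[leftKey] = right[rightKey]; output is left columns, then right columns. */
  static List<Object[]> innerJoin(
      List<Object[]> left, int leftKey, List<Object[]> right, int rightKey) {
    List<Object[]> out = new ArrayList<>();
    for (Object[] l : left) {
      for (Object[] r : right) {
        if (l[leftKey].equals(r[rightKey])) {
          Object[] joined = Arrays.copyOf(l, l.length + r.length);
          System.arraycopy(r, 0, joined, l.length, r.length);
          out.add(joined);
        }
      }
    }
    return out;
  }

  /** Hypothetical left input: columns (id, dept_id); id is unique. */
  static List<Object[]> empRows() {
    return Arrays.asList(
        new Object[] {1, 10}, new Object[] {2, 10}, new Object[] {3, 20});
  }

  /** Hypothetical right input: columns (dept_id, region); dept_id is unique, region is not. */
  static List<Object[]> deptRows() {
    return Arrays.asList(
        new Object[] {10, "EU"}, new Object[] {20, "EU"});
  }

  public static void main(String[] args) {
    List<Object[]> emp = empRows();
    List<Object[]> dept = deptRows();
    // Joined columns: 0 = emp.id, 1 = emp.dept_id, 2 = dept.dept_id, 3 = dept.region
    List<Object[]> joined = innerJoin(emp, 1, dept, 0);

    System.out.println("queried left column (id) unique in left:       " + isUnique(emp, 0));
    System.out.println("queried right column (region) unique in right: " + isUnique(dept, 1));
    System.out.println("right join key (dept_id) unique in right:      " + isUnique(dept, 0));
    System.out.println("(id, region) unique in join result:            " + isUnique(joined, 0, 3));
  }
}
```

On this data, a rule that returns leftUnique && rightUnique would answer false for the column set (id, region), because region is not unique in dept, while the join result contains each (id, region) pair only once. An inner join is used here, so no nulls are generated on either side.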



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
