[ 
https://issues.apache.org/jira/browse/CALCITE-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Makhmutov updated CALCITE-1037:
--------------------------------------
    Description: 
Column uniqueness is calculated incorrectly for 'Correlate' expressions, and 
in some cases this leads to a java.lang.IndexOutOfBoundsException. An example 
query that triggers it:
{code}select
 x.v
from
 (
  select
   t1.v
  from
   (values (1,1),(1,2)) as t1(k,v) 
   join (values (1)) as t2(k) on t1.k=t2.k
 ) x,
 lateral
 (
  select 
   t.v
  from
   unnest(multiset[x.v]) as t(v)
 ) y
group by x.v,y.v{code}

The problem seems to be in the method 
org.apache.calcite.rel.metadata.RelMdColumnUniqueness.areColumnsUnique(Correlate 
rel, ImmutableBitSet columns, boolean ignoreNulls): it simply delegates the 
uniqueness check to the left input without adjusting the column set, which 
leads to an exception if that set references columns from the right input.

It seems that the correct behavior would be the following:
* For Anti/Semi join types, keep the current behavior (as the resulting rows 
contain fields only from the left input).
* For Left/Inner join types, the column set of the correlate is unique only if 
it includes unique sets from both sides.
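
The proposed rule can be sketched as follows. This is only an illustration 
under stated assumptions, not the Calcite implementation or a patch: the class 
CorrelateUniquenessSketch, its JoinType enum and the two predicates (standing 
in for the uniqueness metadata of the left and right inputs) are hypothetical, 
and java.util.BitSet is used in place of ImmutableBitSet so that the example 
is self-contained.

```java
import java.util.BitSet;
import java.util.function.Predicate;

/** Hypothetical, self-contained sketch; not the Calcite implementation. */
final class CorrelateUniquenessSketch {
    /** Hypothetical stand-in for the join type of the correlate. */
    enum JoinType { INNER, LEFT, SEMI, ANTI }

    /**
     * Decides whether 'columns' (indexes into the correlate's output row) are unique.
     * leftIsUnique / rightIsUnique are assumed to answer the same question for each
     * input, using column indexes local to that input.
     */
    static boolean areColumnsUnique(JoinType joinType, int leftFieldCount, BitSet columns,
            Predicate<BitSet> leftIsUnique, Predicate<BitSet> rightIsUnique) {
        if (joinType == JoinType.SEMI || joinType == JoinType.ANTI) {
            // Output rows carry only left fields, so the set can be passed through as is.
            return leftIsUnique.test(columns);
        }
        // INNER / LEFT: split the set and shift right-side indexes to be input-local.
        BitSet leftColumns = new BitSet();
        BitSet rightColumns = new BitSet();
        for (int i = columns.nextSetBit(0); i >= 0; i = columns.nextSetBit(i + 1)) {
            if (i < leftFieldCount) {
                leftColumns.set(i);
            } else {
                rightColumns.set(i - leftFieldCount);
            }
        }
        // Unique only if the set contains a unique set from both sides.
        return leftIsUnique.test(leftColumns) && rightIsUnique.test(rightColumns);
    }

    public static void main(String[] args) {
        // A one-column input that is unique on column 0 and rejects any other index,
        // mimicking the failure described above when an unshifted index is passed in.
        Predicate<BitSet> oneColumnInput = cols -> {
            if (cols.length() > 1) {
                throw new IndexOutOfBoundsException("column index out of range: " + cols);
            }
            return cols.get(0);
        };

        BitSet both = new BitSet();
        both.set(0); // x.v
        both.set(1); // y.v
        System.out.println(
            areColumnsUnique(JoinType.INNER, 1, both, oneColumnInput, oneColumnInput));

        BitSet leftOnly = new BitSet();
        leftOnly.set(0); // x.v only
        System.out.println(
            areColumnsUnique(JoinType.INNER, 1, leftOnly, oneColumnInput, oneColumnInput));
    }
}
```

In this sketch the set {x.v, y.v} from the query above is split into {0} for 
the left input and {0} for the right input, so neither input is asked about a 
column index it does not have.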

  was:
Column uniqueness is calculated incorrectly for 'Correlate' expressions, and 
in some cases this leads to a java.lang.IndexOutOfBoundsException. An example 
query that triggers it:
{code}select
 x.v
from
 (
  select
   v
  from
   (values (1)) as t(v) 
 ) as x(v),
 lateral
 (
  select 
   x.v
  from
   (values (1),(x.v)) as t(v)
 ) y
group by x.v,y.v{code}

The problem seems to be in the method 
org.apache.calcite.rel.metadata.RelMdColumnUniqueness.areColumnsUnique(Correlate 
rel, ImmutableBitSet columns, boolean ignoreNulls): it simply delegates the 
uniqueness check to the left input without adjusting the column set, which 
leads to an exception if that set references columns from the right input.

It seems that the correct behavior would be the following:
* For Anti/Semi join types, keep the current behavior (as the resulting rows 
contain fields only from the left input).
* For Left/Inner join types, the column set of the correlate is unique only if 
it includes unique sets from both sides.


> Column uniqueness is calculated incorrectly for 'Correlate' expression
> ----------------------------------------------------------------------
>
>                 Key: CALCITE-1037
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1037
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.5.0
>            Reporter: Alexey Makhmutov
>            Assignee: Julian Hyde



