[ 
https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated PHOENIX-4602:
------------------------------
    Description: 
Given following table:

{code}
    CREATE TABLE test_table (
     PK1 INTEGER NOT NULL,
     PK2 INTEGER NOT NULL,
     PK3 INTEGER NOT NULL,
     DATA INTEGER, 
     CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
{code}

and a sql:

{code}
  select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and 
t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
{code}

Obviously, it is a typical case for the sql to use SkipScanFilter,however, the 
sql actually does not use Skip Scan, it use Range Scan and just push the 
leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain 
sql is :
 {code:sql}
   CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
       SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
{code}

 I think the problem is affected by the 
{{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following 
line 763, because the pk2 column is not the leading pk column,so this method 
return null, causing the expression 
{{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} does not  pushed 
to scan:
{code:java}
757    boolean hasFirstSlot = true;
758    boolean prevIsNull = false;
759    // TODO: Do the same optimization that we do for IN if the childSlots 
specify a fully qualified row key
760   for (KeySlot slot : childSlot) {
761      if (hasFirstSlot) {
762           // if the first slot is null, return null immediately
763           if (slot == null) {
764                return null;
765            }
766           // mark that we've handled the first slot
767           hasFirstSlot = false;
768      }
{code}

For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems 
that it is not necessary to make sure the PK Column in OrExpression is leading 
PK Column,just guarantee there is only one PK Column in OrExpression is enough. 
 

 

  was:
Given following table:

{code}
    CREATE TABLE test_table (
     PK1 INTEGER NOT NULL,
     PK2 INTEGER NOT NULL,
     PK3 INTEGER NOT NULL,
     DATA INTEGER, 
     CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
{code}

and a sql:

{code}
  select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and 
t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
{code}

Obviously, it is a typical case for the sql to use SkipScanFilter,however, the 
sql actually does not use Skip Scan, it use Range Scan and just push the 
leading pk column expression {{ (t.pk1 >=2 and t.pk1<5) }} to scan,the explain 
sql is :
 {code:sql}
   CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
       SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
{code}

 I think the problem is affected by the 
{{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following 
line 763, because the pk2 column is not the leading pk column,so this method 
return null, causing the expression 
{{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9)) }} does not  pushed 
to scan:
{code:java}
757    boolean hasFirstSlot = true;
758    boolean prevIsNull = false;
759    // TODO: Do the same optimization that we do for IN if the childSlots 
specify a fully qualified row key
760   for (KeySlot slot : childSlot) {
761      if (hasFirstSlot) {
762           // if the first slot is null, return null immediately
763           if (slot == null) {
764                return null;
765            }
766           // mark that we've handled the first slot
767           hasFirstSlot = false;
768      }
{code}

For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems 
that it is not necessary to make sure the PK Column in OrExpression is leading 
PK Column,just guarantee there is only one PK Column in OrExpression is enough. 
 

 


> OrExpression should can also push non-leading pk columns to scan
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-4602
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4602
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.13.0
>            Reporter: chenglei
>            Priority: Major
>         Attachments: PHOENIX-4602_v1.patch
>
>
> Given following table:
> {code}
>     CREATE TABLE test_table (
>      PK1 INTEGER NOT NULL,
>      PK2 INTEGER NOT NULL,
>      PK3 INTEGER NOT NULL,
>      DATA INTEGER, 
>      CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
> {code}
> and a sql:
> {code}
>   select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 
> and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
> {code}
> Obviously, it is a typical case for the sql to use SkipScanFilter,however, 
> the sql actually does not use Skip Scan, it use Range Scan and just push the 
> leading pk column expression {{(t.pk1 >=2 and t.pk1<5)}} to scan,the explain 
> sql is :
>  {code:sql}
>    CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
>        SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
> {code}
>  I think the problem is affected by the 
> {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, in the following 
> line 763, because the pk2 column is not the leading pk column,so this method 
> return null, causing the expression 
> {{((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} does not  pushed 
> to scan:
> {code:java}
> 757    boolean hasFirstSlot = true;
> 758    boolean prevIsNull = false;
> 759    // TODO: Do the same optimization that we do for IN if the childSlots 
> specify a fully qualified row key
> 760   for (KeySlot slot : childSlot) {
> 761      if (hasFirstSlot) {
> 762           // if the first slot is null, return null immediately
> 763           if (slot == null) {
> 764                return null;
> 765            }
> 766           // mark that we've handled the first slot
> 767           hasFirstSlot = false;
> 768      }
> {code}
> For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems 
> that it is not necessary to make sure the PK Column in OrExpression is 
> leading PK Column,just guarantee there is only one PK Column in OrExpression 
> is enough.  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to