[ 
https://issues.apache.org/jira/browse/PHOENIX-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenglei updated PHOENIX-4602:
------------------------------
    Description: 
Given following table:

{code}
    CREATE TABLE test_table (
     PK1 INTEGER NOT NULL,
     PK2 INTEGER NOT NULL,
     PK3 INTEGER NOT NULL,
     DATA INTEGER, 
     CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
{code}

and a sql:

{code}
  select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and 
t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
{code}

Obviously, it is a typical case for the sql to use SkipScanFilter,however, the 
sql actually does not use Skip Scan, it use Range Scan and just push the 
leading pk column expression {{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain 
sql is :
 {code:sql}
   CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
       SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
{code}

 I think the problem is affected by the 
WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 
763, because the pk2 column is not the leading pk column,so this method return 
null, causing the expression 
{{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} does not  pushed 
to scan:
{code:java}
757    boolean hasFirstSlot = true;
758    boolean prevIsNull = false;
759    // TODO: Do the same optimization that we do for IN if the childSlots 
specify a fully qualified row key
760   for (KeySlot slot : childSlot) {
761      if (hasFirstSlot) {
762           // if the first slot is null, return null immediately
763           if (slot == null) {
764                return null;
765            }
766           // mark that we've handled the first slot
767           hasFirstSlot = false;
768      }
{code}

For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems 
that it is not necessary to make sure the PK Column in OrExpression is leading 
PK Column,just guarantee there is only one PK Column in OrExpression is enough. 
 

 

  was:
Given following table:

{code}
    CREATE TABLE test_table (
     PK1 INTEGER NOT NULL,
     PK2 INTEGER NOT NULL,
     PK3 INTEGER NOT NULL,
     DATA INTEGER, 
     CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
{code}

and a sql:

{code}
  select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 and 
t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
{code}

Obviously, it is a typical case for the sql to use SkipScanFilter,however, the 
sql actually does not use Skip Scan, it use Range Scan and just push the 
leading pk column expression \{{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain 
is :
 {code:sql}
   CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
       SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
{code}

 I think the problem is affected by the 
WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 
763, because the pk2 column is not the leading pk column,so this
method return null, causing the expression {{ ((t.pk2 >= 4 and t.pk2 <6) or 
(t.pk2 >= 8 and t.pk2 <9))}} does not  pushed to scan
{code:java}
757    boolean hasFirstSlot = true;
758    boolean prevIsNull = false;
759    // TODO: Do the same optimization that we do for IN if the childSlots 
specify a fully qualified row key
760   for (KeySlot slot : childSlot) {
761      if (hasFirstSlot) {
762           // if the first slot is null, return null immediately
763           if (slot == null) {
764                return null;
765            }
766           // mark that we've handled the first slot
767           hasFirstSlot = false;
768      }
{code}

For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems 
that it is not necessary to make sure the pk column in OrExpression is leading 
pk column,guarantee there is only one PK Column in OrExpression is enough.  

 


> OrExpression should can also push non-leading pk columns to scan
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-4602
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4602
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.13.0
>            Reporter: chenglei
>            Priority: Major
>         Attachments: PHOENIX-4602_v1.patch
>
>
> Given following table:
> {code}
>     CREATE TABLE test_table (
>      PK1 INTEGER NOT NULL,
>      PK2 INTEGER NOT NULL,
>      PK3 INTEGER NOT NULL,
>      DATA INTEGER, 
>      CONSTRAINT TEST_PK PRIMARY KEY (PK1,PK2,PK3))
> {code}
> and a sql:
> {code}
>   select * from test_table t where (t.pk1 >=2 and t.pk1<5) and ((t.pk2 >= 4 
> and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))
> {code}
> Obviously, it is a typical case for the sql to use SkipScanFilter,however, 
> the sql actually does not use Skip Scan, it use Range Scan and just push the 
> leading pk column expression {{ (t.pk1 >=2 and t.pk1<5)}} to scan,the explain 
> sql is :
>  {code:sql}
>    CLIENT PARALLEL 1-WAY RANGE SCAN OVER TEST_TABLE [2] - [5]
>        SERVER FILTER BY ((PK2 >= 4 AND PK2 < 6) OR (PK2 >= 8 AND PK2 < 9))
> {code}
>  I think the problem is affected by the 
> WhereOptimizer.KeyExpressionVisitor.orKeySlots method, in the following line 
> 763, because the pk2 column is not the leading pk column,so this method 
> return null, causing the expression 
> {{ ((t.pk2 >= 4 and t.pk2 <6) or (t.pk2 >= 8 and t.pk2 <9))}} does not  
> pushed to scan:
> {code:java}
> 757    boolean hasFirstSlot = true;
> 758    boolean prevIsNull = false;
> 759    // TODO: Do the same optimization that we do for IN if the childSlots 
> specify a fully qualified row key
> 760   for (KeySlot slot : childSlot) {
> 761      if (hasFirstSlot) {
> 762           // if the first slot is null, return null immediately
> 763           if (slot == null) {
> 764                return null;
> 765            }
> 766           // mark that we've handled the first slot
> 767           hasFirstSlot = false;
> 768      }
> {code}
> For above {{WhereOptimizer.KeyExpressionVisitor.orKeySlots}} method, it seems 
> that it is not necessary to make sure the PK Column in OrExpression is 
> leading PK Column,just guarantee there is only one PK Column in OrExpression 
> is enough.  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to