[GitHub] [phoenix] kadirozde commented on pull request #1256: PHOENIX-6458 Using global indexes for queries with uncovered columns

GitBox Tue, 29 Jun 2021 10:56:39 -0700


kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870799645



   > @kadirozde @lhofhansl FYI.
   > 
   > 1. Your statement that "Phoenix client does not use a global index for the 
queries with the columns that are not covered by the global index" is not 
right. In QueryOptimizer.addPlan, for SQL with columns that are not covered by 
the global index, if the user specifies an index hint and there is a WHERE 
clause, the SQL is rewritten as
   > "SELECT /*+ NO_INDEX */ K,V1,V2 FROM T WHERE ("K" IN ((SELECT /*+ INDEX(T 
IDX) */ ":K" FROM "IDX" WHERE "0:V1" = 'bar')) AND V2 = 'foo')" (K is the PK of 
T, V1 is in IDX and V2 is not). You may want to consider compatibility with the 
existing code.
   > 
   
   What I meant is that the uncovered global index is not used by default. As 
you pointed out, one can construct a query plan manually with hints to use the 
uncovered global index. Please note that you can also write a SQL join 
statement to achieve the same thing.
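
   As a rough illustration of that equivalence (not taken from the PR), the 
sketch below runs both query shapes against an in-memory SQLite database. 
SQLite is only a stand-in here, not Phoenix: the IDX table is a hand-built 
model of a global index on V1 that stores (V1, K) and does not cover V2, the 
sample rows are made up, and the Phoenix hints are left out because SQLite has 
no notion of them.

```python
import sqlite3

# Assumption: SQLite stands in for Phoenix; IDX is an ordinary table that
# models a global index on V1 holding (V1, K), so V2 is not covered.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (K INTEGER PRIMARY KEY, V1 TEXT, V2 TEXT);
    CREATE TABLE IDX (V1 TEXT, K INTEGER, PRIMARY KEY (V1, K));
    INSERT INTO T VALUES (1, 'bar', 'foo'), (2, 'bar', 'baz'), (3, 'qux', 'foo');
    INSERT INTO IDX SELECT V1, K FROM T;
""")

# Shape of the hinted rewrite: look up matching keys in the index first,
# then filter the data table rows on the uncovered column.
in_form = """
    SELECT K, V1, V2 FROM T
    WHERE K IN (SELECT K FROM IDX WHERE V1 = 'bar') AND V2 = 'foo'
    ORDER BY K
"""

# The same result expressed as an explicit join between index and data table.
join_form = """
    SELECT T.K, T.V1, T.V2 FROM T
    JOIN IDX ON IDX.K = T.K
    WHERE IDX.V1 = 'bar' AND T.V2 = 'foo'
    ORDER BY T.K
"""

rows_in = conn.execute(in_form).fetchall()
rows_join = conn.execute(join_form).fetchall()
```

   Both statements return the same rows on this sample data. How Phoenix 
actually plans either form is outside what this sketch shows.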
   
   > 2. Whether scanning the global index and retrieving the corresponding 
rows from the data table is better than just scanning the data table is a 
complex problem, because there are many factors to consider, such as network 
cost, random disk access cost, data distribution, column selectivity, etc. 
Your statement that "It is expected that such performance improvement will 
happen when the index row key prefix length is greater than the data row key 
prefix length for a given query" is extremely insufficient. Given the lack of 
a CBO framework in Phoenix, it seems sensible to be conservative. I think it is 
better to leave the choice of this strategy to the user, via the index hint, 
just as in the existing code.
   
   I agree that there is no guarantee that the index always performs better. 
However, based on my experience, it will perform better in most cases in 
practice. This is because the index PK is designed by the user, who knows the 
use case (the type and shape of queries), and in general the user wants the 
index to be used if the index row key prefix length is greater than the data 
row key prefix length for a given query. I understand your concern here; 
please help me figure out how to proceed. I can add a config param to use 
uncovered indexes without a specific hint. This means that we will preserve the 
existing behavior if the config param is not specified. Would that address your 
concern?
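
   The proposal can be sketched as a small decision rule. This is only an 
illustration under stated assumptions, not the PR's implementation: the 
function names and the `use_uncovered_indexes` flag are hypothetical, and 
"row key prefix length" is simplified to the number of leading row key columns 
that the query binds with an equality predicate.

```python
def constrained_prefix_length(row_key_columns, equality_columns):
    """Count the leading row key columns that the query binds with equality.

    Simplifying assumption: prefix length is measured in columns.
    """
    length = 0
    for column in row_key_columns:
        if column not in equality_columns:
            break
        length += 1
    return length


def choose_scan_target(data_pk, index_pk, equality_columns,
                       index_hinted=False, use_uncovered_indexes=False):
    """Pick "index" or "data_table" for a query with uncovered columns.

    Hypothetical helper: `use_uncovered_indexes` models the proposed config
    param; it defaults to off so the existing behavior is preserved.
    """
    if index_hinted:
        # Existing behavior: an explicit index hint selects the index.
        return "index"
    if not use_uncovered_indexes:
        # Existing behavior: without a hint, the uncovered index is not used.
        return "data_table"
    # Proposed behavior: prefer the index only when it gives a longer prefix.
    index_prefix = constrained_prefix_length(index_pk, equality_columns)
    data_prefix = constrained_prefix_length(data_pk, equality_columns)
    return "index" if index_prefix > data_prefix else "data_table"


# Example from the thread: K is the PK of T, IDX is keyed on V1 then K,
# and the query filters on V1 and V2.
default_choice = choose_scan_target(["K"], ["V1", "K"], {"V1", "V2"})
opt_in_choice = choose_scan_target(["K"], ["V1", "K"], {"V1", "V2"},
                                   use_uncovered_indexes=True)
```

   With the flag left off the sketch keeps the data table scan, and with it on 
the index is chosen because the query binds the leading index column V1 but 
not the data table key K. The cost factors raised in the review (network cost, 
random disk access, data distribution, selectivity) are deliberately not 
modeled.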
   
   
   
   


