[
https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371547#comment-17371547
]
ASF GitHub Bot commented on PHOENIX-6458:
-----------------------------------------
kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-870799645
> @kadirozde @lhofhansl FYI.
>
> 1. You said "Phoenix client does not use a global index for the queries
> with the columns that are not covered by the global index". That is not
> right. In QueryOptimizer.addPlan, for a SQL statement with columns that are
> not covered by the global index, if the user specifies an index hint and
> there is a WHERE clause, the statement would be rewritten as
> "SELECT /*+ NO_INDEX */ K,V1,V2 FROM T WHERE ("K" IN ((SELECT /*+ INDEX(T
> IDX) */ ":K" FROM "IDX" WHERE "0:V1" = 'bar')) AND V2 = 'foo')" (K is the PK
> of T, V1 is in IDX and V2 is not). You may want to consider compatibility
> with the existing code.
>
What I meant is that by default the uncovered global index is not used. As you
pointed out, one can construct a query plan manually using hints in order to
use the uncovered global index. Please note that you can also construct a SQL
join statement and achieve the same thing.
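
To illustrate the equivalence mentioned above, here is a minimal sketch that runs the subquery form and a join form against the same data. It uses SQLite through Python's sqlite3 module as a stand-in, not Phoenix: IDX is modeled as an ordinary table mapping V1 to K, the Phoenix hints are omitted, and the rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# T is the data table (K is its primary key). IDX stands in for a global
# index on V1: it stores only V1 and K, so V2 is an "uncovered" column.
cur.execute("CREATE TABLE T (K TEXT PRIMARY KEY, V1 TEXT, V2 TEXT)")
cur.execute("CREATE TABLE IDX (V1 TEXT, K TEXT, PRIMARY KEY (V1, K))")

rows = [("a", "bar", "foo"), ("b", "bar", "baz"), ("c", "qux", "foo")]
cur.executemany("INSERT INTO T VALUES (?, ?, ?)", rows)
cur.executemany("INSERT INTO IDX VALUES (?, ?)", [(v1, k) for k, v1, _ in rows])

# Subquery form: look up primary keys in the index, then fetch the rows
# from the data table and filter on the uncovered column there.
subquery_form = """
    SELECT K, V1, V2 FROM T
    WHERE K IN (SELECT K FROM IDX WHERE V1 = 'bar') AND V2 = 'foo'
"""

# Join form: the same lookup expressed as an explicit join.
join_form = """
    SELECT T.K, T.V1, T.V2 FROM T
    JOIN IDX ON T.K = IDX.K
    WHERE IDX.V1 = 'bar' AND T.V2 = 'foo'
"""

subquery_rows = sorted(cur.execute(subquery_form).fetchall())
join_rows = sorted(cur.execute(join_form).fetchall())
print(subquery_rows, join_rows)
```

Both statements return the same rows here. Which plan Phoenix actually picks for either form, and how it performs, is not something this sketch shows.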
> 2. Whether scanning the global index and retrieving the corresponding rows
> from the data table is better than just scanning the data table is a complex
> problem, because there are many factors to consider, such as network cost,
> random disk access cost, data distribution, column selectivity, etc. Your
> statement "It is expected that such performance improvement will happen
> when the index row key prefix length is greater than the data row key prefix
> length for a given query" is far from sufficient. Given the lack of a CBO
> framework in Phoenix, it seems sensible to be conservative. I think it is
> better to leave the choice of this strategy to the user, who can specify the
> index hint just as with the existing code.
I agree that there is no guarantee that the index always performs better.
However, based on my experience, it will perform better in most cases in
practice. This is because the index PK is designed by the user who knows the
use case (the type and shape of queries), and in general the user wants the
index to be used if the index row key prefix length is greater than the data
row key prefix length for a given query. I understand your concern here, so
please help me figure out how to proceed. I can add a config param to use
uncovered indexes without a specific hint. This means that we will preserve
the existing behavior if the config param is not specified. Would that
address your concern?
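
The proposed behavior can be sketched as a small decision function. This is only an illustration of the rule described above, not Phoenix code: the function name, its parameters, and the way prefix lengths are passed in are all hypothetical, and it assumes a hinted query always takes the index path.

```python
def use_uncovered_index(index_hint_given, config_enabled,
                        index_prefix_len, data_prefix_len):
    """Decide whether a query should go through an uncovered global index.

    Hypothetical sketch: all names and parameters are invented for
    illustration and do not correspond to Phoenix APIs or config keys.
    """
    # Existing behavior: an explicit index hint selects the index.
    if index_hint_given:
        return True
    # Config param not set: preserve the existing behavior (no index).
    if not config_enabled:
        return False
    # Config param set: use the index only when it constrains a longer
    # row key prefix than the data table does for this query.
    return index_prefix_len > data_prefix_len


# No hint and no config param: the index is not used.
default_choice = use_uncovered_index(False, False, 2, 0)
# Config param enabled and a longer index prefix: the index is used.
opted_in_choice = use_uncovered_index(False, True, 2, 0)
print(default_choice, opted_in_choice)
```

With the flag off, the function never selects the index on its own, which matches the intent of keeping the current default unchanged.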
--
> Using global indexes for queries with uncovered columns
> -------------------------------------------------------
>
> Key: PHOENIX-6458
> URL: https://issues.apache.org/jira/browse/PHOENIX-6458
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 5.1.0
> Reporter: Kadir Ozdemir
> Priority: Major
> Attachments: PHOENIX-6458.master.001.patch
>
>
> Phoenix client does not use a global index for the queries with the columns
> that are not covered by the global index. However, there are many cases where
> using the global index to map secondary keys to primary keys and then
> retrieving the corresponding rows from the data table results in faster
> queries. It is expected that such performance improvement will happen when
> the index row key prefix length is greater than the data row key prefix
> length for a given query.