[
https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17411586#comment-17411586
]
ASF GitHub Bot commented on PHOENIX-6458:
-----------------------------------------
kadirozde commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-914695124
@lhofhansl, @comnetwork, I have done some performance testing on a cluster
with 15 region servers. I created a data table with 16 million rows. Each row
is about 2500 bytes. The row key of this table is composed of four fields
(VARCHAR, INTEGER, TIMESTAMP, VARCHAR). I ran the same test without an index,
with a covered index, and with an uncovered index. The timestamp field is
indexed. The query used in the test returned N rows that fall into a
supplied timestamp range, where N is supplied as the limit parameter. The
query returned four fields. The query times in ms are as follows:
 limit   covered   uncovered   no index
     1       212         252       4404
    10       215         256       5375
   100       215         310       5169
  1000       232        1125       4698
 10000       433        7325       6440
100000      1588       67002       6789
It is clear that if the number of selected rows is large (in this case 10000
or more), the uncovered index starts to perform worse than the full table scan.
I am not sure whether these results are generalizable. Instead of using an
uncovered index by default, I will add logic to use an uncovered index only if
it is given as a hint.
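
To make the crossover explicit, here is a small sketch that recomputes it
from the numbers reported in this comment. The timings are copied verbatim
from the table; the function names are made up for illustration and are not
part of Phoenix.

```python
# Query times in ms, copied from the comment's table:
# limit -> (covered, uncovered, no index)
timings = {
    1: (212, 252, 4404),
    10: (215, 256, 5375),
    100: (215, 310, 5169),
    1000: (232, 1125, 4698),
    10000: (433, 7325, 6440),
    100000: (1588, 67002, 6789),
}


def uncovered_to_scan_ratio(timings):
    """Return {limit: uncovered_ms / no_index_ms}.

    A ratio above 1.0 means the full table scan was faster than the
    uncovered index for that limit.
    """
    return {
        limit: uncovered / no_index
        for limit, (_covered, uncovered, no_index) in timings.items()
    }


def first_limit_where_scan_wins(timings):
    """Return the smallest tested limit where the uncovered index is slower."""
    for limit in sorted(timings):
        _covered, uncovered, no_index = timings[limit]
        if uncovered > no_index:
            return limit
    return None


if __name__ == "__main__":
    for limit, ratio in sorted(uncovered_to_scan_ratio(timings).items()):
        print(f"{limit:>7}  {ratio:6.2f}")
    print("scan wins from limit:", first_limit_where_scan_wins(timings))
```

On these measurements the uncovered index costs roughly a twentieth of the
full scan at small limits and roughly ten times as much at a limit of 100000.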
> Using global indexes for queries with uncovered columns
> -------------------------------------------------------
>
> Key: PHOENIX-6458
> URL: https://issues.apache.org/jira/browse/PHOENIX-6458
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 5.1.0
> Reporter: Kadir Ozdemir
> Priority: Major
> Attachments: PHOENIX-6458.master.001.patch
>
>
> The Phoenix client does not use a global index for queries that reference
> columns not covered by that index. However, there are many cases where
> using the global index to map secondary keys to primary keys and then
> retrieving the corresponding rows from the data table results in faster
> queries. Such a performance improvement is expected when the index row key
> prefix length is greater than the data row key prefix length for a given
> query.
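
The two-step lookup described in the issue (index maps secondary key to
primary key, then the row is fetched from the data table) can be sketched
with a toy in-memory model. This is only an illustration of the idea, not
Phoenix's implementation; the table contents, the names `data_table` and
`index`, and the half-open range convention are all assumptions.

```python
from bisect import bisect_left

# Hypothetical data table: primary key -> full row.
data_table = {
    ("a", 1): {"ts": 10, "payload": "row-a"},
    ("b", 2): {"ts": 20, "payload": "row-b"},
    ("c", 3): {"ts": 30, "payload": "row-c"},
    ("d", 4): {"ts": 40, "payload": "row-d"},
}

# Toy "uncovered" index: sorted (ts, primary key) pairs and nothing else,
# so every other column must be read back from the data table.
index = sorted((row["ts"], pk) for pk, row in data_table.items())


def query_via_uncovered_index(ts_from, ts_to, limit):
    """Return up to `limit` rows with ts_from <= ts < ts_to.

    Step 1: range-scan the index to collect primary keys.
    Step 2: do one point lookup in the data table per selected key.
    """
    # A 1-tuple (ts,) sorts before any (ts, pk) pair with the same ts,
    # so bisect_left finds the first index entry with entry ts >= ts.
    lo = bisect_left(index, (ts_from,))
    hi = bisect_left(index, (ts_to,))
    primary_keys = [pk for _ts, pk in index[lo:hi][:limit]]
    return [data_table[pk] for pk in primary_keys]


if __name__ == "__main__":
    for row in query_via_uncovered_index(15, 45, 2):
        print(row)
```

In this model the second step costs one data table lookup per returned row,
which is consistent with the uncovered-index times above growing with the
limit.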