[
https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376260#comment-17376260
]
ASF GitHub Bot commented on PHOENIX-6458:
-----------------------------------------
lhofhansl commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-875280447
If you do SELECT count(uncovered_column) FROM T WHERE covered_column = xyz,
the global uncovered index is not used even when you hint it, as expected (I
just verified that on current 5.x Phoenix).
I found that uncovered local indexes (that's what I tested) are sometimes
much slower than doing a full table scan. That happens when there is a WHERE
clause that an index could be used for, but the WHERE restriction is not
selective.
(As noted above, FAST_DIFF (Phoenix's default) is actually the worst choice
since SEEKs are slow with it. ROW_INDEX_V1 with ZSTD compression is far
better. I blogged about this a while ago:
https://hadoop-hbase.blogspot.com/2018/10/apache-hbase-and-apache-phoenix-more-on.html
With FAST_DIFF the WHERE clause needed to be **0.5% selective (return 1/200
of the data)** to be effective. With ROW_INDEX_V1 + ZSTD that was 10%.)
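To make the arithmetic behind those thresholds concrete, here is a minimal Python sketch under an assumed linear cost model (an assumption for illustration, not something measured in the blog post): a full scan pays a fixed cost per row, an index-driven read pays one seek per matching row, and the two break even when the selectivity equals scan cost divided by seek cost. The function names are hypothetical.

```python
from fractions import Fraction


def implied_seek_to_scan_ratio(break_even_selectivity: Fraction) -> Fraction:
    """How many sequential row reads one seek costs, in the assumed linear model.

    Full scan cost:  n * scan_cost
    Index path cost: s * n * seek_cost
    Break-even:      s = scan_cost / seek_cost, so seek_cost / scan_cost = 1 / s
    """
    return 1 / break_even_selectivity


def index_path_wins(selectivity: Fraction, break_even_selectivity: Fraction) -> bool:
    """True when the WHERE clause keeps a smaller fraction of rows than the break-even point."""
    return selectivity < break_even_selectivity


# Break-even points quoted in the comment above.
FAST_DIFF_BREAK_EVEN = Fraction(1, 200)    # 0.5%
ROW_INDEX_ZSTD_BREAK_EVEN = Fraction(1, 10)  # 10%

fast_diff_ratio = implied_seek_to_scan_ratio(FAST_DIFF_BREAK_EVEN)
row_index_ratio = implied_seek_to_scan_ratio(ROW_INDEX_ZSTD_BREAK_EVEN)

# A WHERE clause that keeps 5% of the rows, evaluated against both encodings.
five_percent = Fraction(1, 20)
wins_with_fast_diff = index_path_wins(five_percent, FAST_DIFF_BREAK_EVEN)
wins_with_row_index = index_path_wins(five_percent, ROW_INDEX_ZSTD_BREAK_EVEN)
```

Read this way, the 0.5% figure corresponds to a seek costing roughly 200 sequential row reads, and the 10% figure to roughly 10. Real costs are not linear, so this is only a way to reason about the reported numbers.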
Uncovered global indexing is at best as good as uncovered local indexing, and
probably worse since we need to go remote for each row, unless we do batching.
And the batches would still require a SKIP_SCAN, which in the general case is
still very slow with FAST_DIFF.
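The effect of batching on remote calls can be sketched with a toy in-memory model. This is not Phoenix code: the dictionaries stand in for the index and data tables, `uncovered_index_lookup` is a hypothetical helper, and the model counts only round trips. It ignores the cost of the skip scan inside each batch, which is the part noted above as still slow.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def uncovered_index_lookup(
    index: Dict[int, List[int]],
    data_table: Dict[int, Row],
    covered_value: int,
    batch_size: int,
) -> Tuple[List[Row], int]:
    """Resolve a query on a covered column, then fetch full rows from the data table.

    Returns the fetched rows and the number of simulated remote round trips.
    """
    # Step 1: the index maps the secondary key to the matching primary keys.
    primary_keys = index.get(covered_value, [])

    rows: List[Row] = []
    round_trips = 0
    # Step 2: fetch the data rows, one simulated remote call per batch of keys.
    for start in range(0, len(primary_keys), batch_size):
        batch = primary_keys[start:start + batch_size]
        round_trips += 1
        rows.extend(data_table[pk] for pk in batch)
    return rows, round_trips


# Toy data table: primary key -> row with one indexed and one uncovered column.
data_table = {pk: {"covered": pk % 2, "uncovered": pk * 10} for pk in range(10)}

# Toy index: covered column value -> list of primary keys.
index: Dict[int, List[int]] = {}
for pk, row in data_table.items():
    index.setdefault(row["covered"], []).append(pk)

rows_unbatched, trips_unbatched = uncovered_index_lookup(index, data_table, 1, batch_size=1)
rows_batched, trips_batched = uncovered_index_lookup(index, data_table, 1, batch_size=4)
```

With five matching rows, a batch size of 1 needs five round trips while a batch size of 4 needs two; both return the same rows.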
So I expect that with defaults the WHERE clause would need to be somewhere
between 1/1000 and 1/300 selective for this to be an improvement.
Anyway... I think we should check this in. Presumably folks would create
uncovered global indexes only when they know what they are doing.
> Using global indexes for queries with uncovered columns
> -------------------------------------------------------
>
> Key: PHOENIX-6458
> URL: https://issues.apache.org/jira/browse/PHOENIX-6458
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 5.1.0
> Reporter: Kadir Ozdemir
> Priority: Major
> Attachments: PHOENIX-6458.master.001.patch
>
>
> Phoenix client does not use a global index for the queries with the columns
> that are not covered by the global index. However, there are many cases where
> using the global index to map secondary keys to primary keys and then
> retrieving the corresponding rows from the data table results in faster
> queries. It is expected that such performance improvement will happen when
> the index row key prefix length is greater than the data row key prefix
> length for a given query.