[jira] [Commented] (PHOENIX-6458) Using global indexes for queries with uncovered columns

ASF GitHub Bot (Jira) Tue, 06 Jul 2021 22:01:04 -0700


    [ 
https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376260#comment-17376260
 ]


ASF GitHub Bot commented on PHOENIX-6458:
-----------------------------------------

lhofhansl commented on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-875280447


   If you do SELECT count(uncovered_column) FROM T WHERE covered_column = xyz, 
the global uncovered index is, as expected, not used even when you hint it (I 
just verified that with current 5.x Phoenix).
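
For reference, a minimal sketch of what such a hinted query could look like, using Phoenix's `/*+ INDEX(table index) */` hint syntax. The index name `T_IDX` and the helper function are hypothetical and only serve to build the statement text; the table and column names are the placeholders from the sentence above.

```python
def hinted_count_query(table: str, index: str,
                       uncovered_col: str, covered_col: str) -> str:
    """Build a Phoenix COUNT query that hints a specific index.

    The hint asks the optimizer to use `index` on `table`; per the comment
    above, current 5.x ignores it when the index does not cover the column.
    """
    return (
        f"SELECT /*+ INDEX({table} {index}) */ COUNT({uncovered_col}) "
        f"FROM {table} WHERE {covered_col} = ?"
    )


# T_IDX is an assumed index name, not one taken from the issue.
query = hinted_count_query("T", "T_IDX", "uncovered_column", "covered_column")
print(query)
```

Running EXPLAIN on the resulting statement is one way to check whether the hinted index was actually chosen.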
   
   I found that uncovered local indexes (that's what I tested) are sometimes 
much slower than doing a full table scan. That happens when there is a WHERE 
clause that an index could be used for, but the WHERE restriction is not 
selective.
   
   (As noted above, FAST_DIFF (Phoenix's default) is actually the worst choice 
since SEEKs are slow with it. ROW_INDEX_V1 with ZSTD compression is far 
better. I blogged about this a while ago: 
https://hadoop-hbase.blogspot.com/2018/10/apache-hbase-and-apache-phoenix-more-on.html
 With FAST_DIFF the WHERE clause needed to select at most **0.5% (return 1/200 
of the data)** for the index to be effective. With ROW_INDEX_V1 + ZSTD that 
was 10%.)
   
   Using an uncovered global index is at best as good as uncovered local 
indexing, and probably worse since we need to go remote for each row, unless 
we do batching. And the batches would still require a SKIP_SCAN, which in the 
general case is still very slow for FAST_DIFF.
   So I expect that with defaults the WHERE clause would need to select 
somewhere between 1/1000 and 1/300 of the rows for this to be an improvement.
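
To make the quoted break-even points concrete, here is a small sketch that compares a query's selectivity against them. The thresholds are the figures given in this comment; the dictionary keys, the function name, and the choice of the pessimistic 1/1000 end of the estimated range are assumptions made for illustration, not measurements.

```python
from fractions import Fraction

# Largest fraction of rows a WHERE clause may return before a full table
# scan is expected to win, as quoted in the comment above.
BREAK_EVEN = {
    "fast_diff": Fraction(1, 200),           # 0.5%
    "row_index_v1_zstd": Fraction(1, 10),    # 10%
    # Only a range (1/1000 to 1/300) is estimated for uncovered global
    # indexes with defaults; the pessimistic end is assumed here.
    "global_uncovered_defaults": Fraction(1, 1000),
}


def index_likely_helps(matching_rows: int, total_rows: int, setup: str) -> bool:
    """Return True if the selectivity is at or below the quoted break-even."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return Fraction(matching_rows, total_rows) <= BREAK_EVEN[setup]


# A filter returning 5% of a one-million-row table.
print(index_likely_helps(50_000, 1_000_000, "fast_diff"))
print(index_likely_helps(50_000, 1_000_000, "row_index_v1_zstd"))
```

Exact fractions are used so that a selectivity sitting exactly on a threshold is not misclassified by floating-point rounding.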
   
   Anyway... I think we should check this in. Presumably folks would create 
uncovered global indexes only when they know what they are doing.
   



> Using global indexes for queries with uncovered columns
> -------------------------------------------------------
>
>                 Key: PHOENIX-6458
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6458
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.1.0
>            Reporter: Kadir Ozdemir
>            Priority: Major
>         Attachments: PHOENIX-6458.master.001.patch
>
>
> The Phoenix client does not use a global index for queries that reference 
> columns not covered by that index. However, there are many cases where 
> using the global index to map secondary keys to primary keys and then 
> retrieving the corresponding rows from the data table results in faster 
> queries. Such a performance improvement is expected when the index row key 
> prefix length is greater than the data row key prefix length for a given 
> query. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
