lhofhansl edited a comment on pull request #1256:
URL: https://github.com/apache/phoenix/pull/1256#issuecomment-875280447


   If you do SELECT count(uncovered_column) FROM T WHERE covered_column = xyz, 
the global uncovered index is not used even when you hint it as expected (I 
just verified that current 5.x. Phoenix).
   
   I found that uncovered local indexes (that's what I tested) are sometimes 
much slower than doing to a full table scan. That happens when there is a WHERE 
clause that an index could be used for, but the WHERE restriction is not 
selective.
   
   (As noted above, FAST_DIFF (Phoenix' default) is actually the worst choice 
since SEEKs are slow with it. ROW_INDEX_V1 with ZSTD compression are far 
better. I blogged about this here: 
https://hadoop-hbase.blogspot.com/2018/10/apache-hbase-and-apache-phoenix-more-on.html
 a while ago: With FAST_DIFF the WHERE clause needed to be **0.5% (return 1/200 
of the data)** to be effective. With ROW_INDEX_V1 + ZSTD that was 10%.)
   
   This is at best as good as uncovered local indexing, and probably worse 
since we need to go remote for each row, unless we do batching. And the batches 
would still be requiring a SKIP_SCAN, which in the general case is still very 
slow for FAST_DIFF.
   So I expect with defaults the WHERE clause would need to be somewhere 
between 1/1000 and 1/300 hundred selective for this to be improvement.
   
   Anyway... I think we should check this in. Presumable folks would create 
uncovered global indexes only when they know what they are doing.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to