[jira] [Commented] (PHOENIX-6458) Using global indexes for queries with uncovered columns

Lars Hofhansl (Jira) Fri, 04 Mar 2022 17:05:05 -0800


    [ 
https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501614#comment-17501614
 ]


Lars Hofhansl commented on PHOENIX-6458:
----------------------------------------

Hmm... Doesn't quite seem to work:
{code:java}
 > select /*+ NO_INDEX */ count(suppkey) from lineitem where tax = 0.08;
+----------------+
| COUNT(SUPPKEY) |
+----------------+
| 2000406        |
+----------------+
1 row selected (6.614 seconds)

> select /*+ INDEX(lineitem g_l_tax) */ count(suppkey) from lineitem where tax = 0.08;
+------------------+
| COUNT("SUPPKEY") |
+------------------+
| 0                |
+------------------+
1 row selected (7.422 seconds)

> explain select /*+ INDEX(lineitem g_l_tax) */ count(suppkey) from lineitem where tax = 0.08;
+-----------------------------------------------------------------------------------------+----------------+---------------+---------------+
|                                          PLAN                                           | EST_BYTES_READ | EST_ROWS_READ |  EST_INFO_TS  |
+-----------------------------------------------------------------------------------------+----------------+---------------+---------------+
| CLIENT 3-CHUNK 511502 ROWS 20971582 BYTES PARALLEL 1-WAY RANGE SCAN OVER G_L_TAX [0.08] | 20971582       | 511502        | 1646441656705 |
|     SERVER MERGE [0.SUPPKEY]                                                            | 20971582       | 511502        | 1646441656705 |
|     SERVER FILTER BY FIRST KEY ONLY                                                     | 20971582       | 511502        | 1646441656705 |
|     SERVER AGGREGATE INTO SINGLE ROW                                                    | 20971582       | 511502        | 1646441656705 |
+-----------------------------------------------------------------------------------------+----------------+---------------+---------------+
4 rows selected (0.03 seconds){code}
 

[~kozdemir] 

> Using global indexes for queries with uncovered columns
> -------------------------------------------------------
>
>                 Key: PHOENIX-6458
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6458
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.1.0
>            Reporter: Kadir Ozdemir
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>             Fix For: 4.17.0, 5.2.0, 5.1.3
>
>         Attachments: PHOENIX-6458.master.001.patch, 
> PHOENIX-6458.master.002.patch
>
>
> The Phoenix query optimizer does not use a global index for a query that 
> references columns not covered by that index unless the query carries an 
> index hint for it. With the index hint, the optimizer rewrites the query so 
> that the index is used within a subquery. The Phoenix client retrieves the 
> row keys of the index rows that satisfy this subquery and pushes them into 
> the Phoenix server caches of the data table regions. Finally, on the server 
> side, data table rows are scanned and joined with the index rows using 
> HashJoin. Depending on the selectivity of the original query, this join 
> operation may still scan a large number of data table rows. 
> Eliminating these data table scans would be a significant improvement. To do 
> that, instead of rewriting the query, the Phoenix optimizer simply treats the 
> global index as a covered index for the given query. The optimizer then 
> chooses the index table for the query, especially when the index row key 
> prefix length is greater than the data row key prefix length for the query. 
> On the server side, the index table is scanned using the index row key 
> ranges implied by the query, and the index row keys are then mapped to data 
> table row keys (note that an index row key includes all the data row key 
> columns). Finally, the corresponding data table rows are read using 
> server-to-server RPCs. PHOENIX-6458 (this Jira) retrieves the data table 
> rows one by one using the HBase get operation. PHOENIX-6501 replaces this get 
> operation with a scan operation to reduce the number of server-to-server 
> RPC calls.
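
The read path described in the issue text above can be sketched with a toy in-memory model. This is only an illustration in Python, not Phoenix or HBase code: the table contents, the function names, and in particular the batching scheme used to stand in for the scan-based fetch of PHOENIX-6501 are all assumptions made for the example.

```python
from decimal import Decimal

# Toy stand-in for the data table: data row key -> column values.
# Keys, columns and values are made up for illustration.
DATA_TABLE = {
    1: {"tax": Decimal("0.08"), "suppkey": 10},
    2: {"tax": Decimal("0.02"), "suppkey": 11},
    3: {"tax": Decimal("0.08"), "suppkey": 12},
    4: {"tax": Decimal("0.08"), "suppkey": None},  # NULL, not counted
}


def build_index(data, indexed_col):
    """Index row key = (indexed value, data row key), kept in sorted order."""
    return sorted((row[indexed_col], key) for key, row in data.items())


def scan_index(index, value):
    """Scan index rows for one value; map each index row key to its data row key."""
    return [data_key for indexed_value, data_key in index if indexed_value == value]


def fetch_with_gets(data, keys):
    """One point lookup per key. Returns (rows, number of simulated RPCs)."""
    rows = [data[k] for k in keys]
    return rows, len(keys)


def fetch_with_batches(data, keys, batch_size):
    """One simulated RPC per batch of keys (an assumed batching scheme)."""
    rows, rpcs = [], 0
    for start in range(0, len(keys), batch_size):
        rows.extend(data[k] for k in keys[start:start + batch_size])
        rpcs += 1
    return rows, rpcs


def count_non_null(rows, col):
    """COUNT(col) semantics: rows where col is NULL are skipped."""
    return sum(1 for row in rows if row[col] is not None)


def count_full_scan(data, tax):
    """Reference answer computed without the index."""
    return count_non_null([r for r in data.values() if r["tax"] == tax], "suppkey")


INDEX = build_index(DATA_TABLE, "tax")
KEYS = scan_index(INDEX, Decimal("0.08"))
ROWS_GET, RPCS_GET = fetch_with_gets(DATA_TABLE, KEYS)
ROWS_BATCH, RPCS_BATCH = fetch_with_batches(DATA_TABLE, KEYS, batch_size=2)

print(count_full_scan(DATA_TABLE, Decimal("0.08")))
print(count_non_null(ROWS_GET, "suppkey"), RPCS_GET)
print(count_non_null(ROWS_BATCH, "suppkey"), RPCS_BATCH)
```

In this model the index path and the full scan must agree on the count, and only the number of simulated RPCs differs between the two fetch strategies. The comment above reports that same comparison failing: the NO_INDEX query returns 2000406 while the hinted query returns 0.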



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
