[jira] [Comment Edited] (PHOENIX-5796) Possible query optimization when projecting uncovered columns and querying on indexed columns

chenglei (Jira) Tue, 24 Mar 2020 20:24:35 -0700


    [ 
https://issues.apache.org/jira/browse/PHOENIX-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066352#comment-17066352
 ]


chenglei edited comment on PHOENIX-5796 at 3/25/20, 3:23 AM:
-------------------------------------------------------------

[~ckulkarni]，[~dbwong] said is right，notice the bottom of your explain : 
"DYNAMIC SERVER FILTER BY "T.A" IN($6,$8)"

When using {{UNCOV_INDEX_T}}，The {{QueryOptimizer}} first get a {{in}} :
{code:java}
SELECT /*+ NO_INDEX */ C FROM T  WHERE "A" IN ((SELECT /*+ INDEX(T 
UNCOV_INDEX_T) */ ":A" FROM "UNCOV_INDEX_T"  WHERE "0:B" = 'abc'))
{code}

then convert to a {{semi join}}:
{code:java}
SELECT /*+ NO_INDEX */ C FROM T  Semi JOIN (SELECT /*+ INDEX(T UNCOV_INDEX_T) 
*/ 1 $4,":A" $5 FROM "UNCOV_INDEX_T"  WHERE "0:B" = 'abc') $3 ON ("A" = $3.$5)
{code}

For the {{semi join}}, however, in fact it did not execute the join ,what the 
semi join do is:
# first execute the rhs {{SELECT /*+ INDEX(T UNCOV_INDEX_T) */ 1 $4,":A" $5 
FROM "UNCOV_INDEX_T"  WHERE "0:B" = 'abc'}} to get the pk column {{A}} of the 
base table.
# and then rewrite the semi join to {{SELECT /*+ NO_INDEX */ C FROM T where A 
in (a1,a2,a3....)}}, that is what the  "DYNAMIC SERVER FILTER BY "T.A" 
IN($6,$8)" said in the bottom of the explain.



was (Author: comnetwork):
[~ckulkarni]，[~dbwong] said is right，notice the bottom of your explain : 
"DYNAMIC SERVER FILTER BY "T.A" IN($6,$8)"

When using {{UNCOV_INDEX_T}}，The {{QueryOptimizer}} first get a {{in}} :
{code:java}
SELECT /*+ NO_INDEX */ C FROM T  WHERE "A" IN ((SELECT /*+ INDEX(T 
UNCOV_INDEX_T) */ ":A" FROM "UNCOV_INDEX_T"  WHERE "0:B" = 'abc'))
{code}

then convert to a {{semi join}}:
{code:java}
SELECT /*+ NO_INDEX */ C FROM T  Semi JOIN (SELECT /*+ INDEX(T UNCOV_INDEX_T) 
*/ 1 $4,":A" $5 FROM "UNCOV_INDEX_T"  WHERE "0:B" = 'abc') $3 ON ("A" = $3.$5)
{code}

For the {{semi join}}, however, in fact it did not execute the join ,what the 
semi join do is:
# first execute the  {{SELECT /*+ INDEX(T UNCOV_INDEX_T) */ 1 $4,":A" $5 FROM 
"UNCOV_INDEX_T"  WHERE "0:B" = 'abc'}} to
   get the pk column {{A}}
# and then rewrite the sql to {{SELECT /*+ NO_INDEX */ C FROM T where A in 
(a1,a2,a3....)}}, that is what the  "DYNAMIC SERVER FILTER BY "T.A" IN($6,$8)" 
said in the bottom of the explain.


> Possible query optimization when projecting uncovered columns and querying on 
> indexed columns
> ---------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5796
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5796
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.0.0, 4.15.0
>            Reporter: Chinmay Kulkarni
>            Priority: Major
>         Attachments: Screen Shot 2020-03-23 at 3.25.38 PM.png, Screen Shot 
> 2020-03-23 at 3.32.24 PM.png, Screen Shot 2020-03-24 at 11.51.12 AM.png
>
>
> Start HBase-1.3 server with Phoenix-4.15.0-HBase-1.3 server jar. Connect to 
> it using sqlline.py which has Phoenix-4.15.0-HBase-1.3 Phoenix client.
> Create a base table like:
> {code:sql}
> create table t (a integer primary key, b varchar(10), c integer);
> {code}
> Create an uncovered index on top of it like:
> {code:sql}
> create index uncov_index_t on t(b);
> {code}
> Now if you issue the query:
> {code:sql}
> explain select c from t where b='abc';
> {code}
> You'd see the following explain plan:
>  !Screen Shot 2020-03-23 at 3.25.38 PM.png|height=150,width=700!
> *Which is a full table scan on the base table 't'* since we cannot use the 
> global index as 'c' is not a covered column in the global index.
> *However, projecting columns contained fully within the index pk is correctly 
> a range scan:*
> {code:sql}
> explain select a,b from t where b='abc';
> {code}
> produces the following explain plan:
>  !Screen Shot 2020-03-23 at 3.32.24 PM.png|height=150,width=700! 
> In the first query, can there be an optimization to *query the index table, 
> get the start and stop keys of the base table and then issue a range 
> scan/(bunch of point lookups) on the base table* instead of doing a full 
> table scan on the base table like we currently do?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (PHOENIX-5796) Possible query optimization when projecting uncovered columns and querying on indexed columns

Reply via email to