[jira] [Commented] (PHOENIX-258) Use skip scan when SELECT DISTINCT on leading row key column(s)

Lars Hofhansl (JIRA) Wed, 01 Jun 2016 15:50:24 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311293#comment-15311293
 ]


Lars Hofhansl commented on PHOENIX-258:
---------------------------------------

bq. For example, if we have a schema like VARCHAR, INT, INT, we need to pad for 
the 2nd and 3rd slot position

I think we only need to pad whatever the last distinct/group-by position is. So 
in the schema above, if the distinct is on key part1, we'd pad the VARCHAR; if 
the distinct is on part1, part2, we'd pad the INT if needed. (We wouldn't 
optimize this for a distinct over all parts.)
I'm pretty sure this works correctly in all cases now.
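
To make the padding point concrete, here is a rough sketch of how a seek key 
past the current distinct prefix could be built when only the last position is 
padded. This is not the code from the attached patches. The class and method 
names (DistinctPrefixSketch, padLast, seekKeyAfter, nextKey) are made up for 
illustration, and it assumes a variable-length key part is terminated by a 
zero separator byte while a fixed-width part is zero-filled to its width.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; all names here are hypothetical, not Phoenix APIs.
class DistinctPrefixSketch {
    // Assumption: variable-length key parts end with a zero separator byte.
    static final byte SEPARATOR = 0x00;

    // Smallest key that sorts after every key starting with `key`
    // (unsigned byte order). Returns null if `key` is empty or all 0xFF.
    static byte[] nextKey(byte[] key) {
        for (int i = key.length - 1; i >= 0; i--) {
            if (key[i] != (byte) 0xFF) {
                byte[] next = Arrays.copyOf(key, i + 1); // drop trailing 0xFF bytes
                next[i]++;
                return next;
            }
        }
        return null;
    }

    // Pad only the last distinct position. fixedWidth > 0 means a fixed-width
    // type (e.g. 4 for INT); fixedWidth <= 0 means variable length (VARCHAR).
    static byte[] padLast(byte[] last, int fixedWidth) {
        if (fixedWidth > 0) {
            if (last.length > fixedWidth) {
                throw new IllegalArgumentException("value wider than column");
            }
            return Arrays.copyOf(last, fixedWidth); // zero-fill up to the width
        }
        byte[] padded = Arrays.copyOf(last, last.length + 1);
        padded[last.length] = SEPARATOR;
        return padded;
    }

    // `leading` holds the already complete bytes of the earlier key parts,
    // `last` holds the bytes of the last distinct position.
    static byte[] seekKeyAfter(byte[] leading, byte[] last, int fixedWidth) {
        byte[] padded = padLast(last, fixedWidth);
        byte[] prefix = Arrays.copyOf(leading, leading.length + padded.length);
        System.arraycopy(padded, 0, prefix, leading.length, padded.length);
        return nextKey(prefix);
    }

    public static void main(String[] args) {
        byte[] part1 = "ab".getBytes(StandardCharsets.US_ASCII);
        StringBuilder hex = new StringBuilder();
        for (byte b : seekKeyAfter(new byte[0], part1, 0)) {
            hex.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
    }
}
```

For a distinct on part1 with the value "ab", the padded prefix is 61 62 00 and 
the seek key becomes 61 62 01. Without the separator the next key would be 
61 63, which would jump over rows such as "abc".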



> Use skip scan when SELECT DISTINCT on leading row key column(s)
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-258
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-258
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: ryang-sfdc
>            Assignee: Lars Hofhansl
>             Fix For: 4.8.0
>
>         Attachments: 258-WIP.txt, 258-v1.txt, 258-v10.txt, 258-v11.txt, 
> 258-v12.txt, 258-v13.txt, 258-v14.txt, 258-v2.txt, 258-v3.txt, 258-v4.txt, 
> 258-v5.txt, 258-v6.txt, 258-v7.txt, 258-v8.txt, 258-v9.txt, 258.txt, 
> DistinctFixedPrefixFilter.java, in-clause.png
>
>
> create table(a varchar(32) not null, date date not null constraint pk primary key(a,date))
> [["PLAN"],["CLIENT PARALLEL 94-WAY FULL SCAN OVER foo"],["    SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [a]"],["CLIENT MERGE SORT"]]
>
> We should skip scan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
