[jira] [Commented] (PHOENIX-258) Use skip scan when SELECT DISTINCT on leading row key column(s)

Lars Hofhansl (JIRA) Sat, 28 May 2016 23:06:26 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305772#comment-15305772
 ]


Lars Hofhansl commented on PHOENIX-258:
---------------------------------------

Thanks [~giacomotaylor]

-v7 has the following:
* Renamed to DistinctPrefixFilter. The "Fixed" in the old name referred to the 
fact that we have a fixed prefix length (in terms of the number of fields)
* uses context.getAggregateManager().isEmpty()
* avoids nextKey for variable length trailing prefixes.
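
For readers who have not opened the patch, the general idea behind a distinct prefix filter can be sketched as follows. This is not the code in 258-v7.txt: the class name, the helpers, and the NUL field separator are hypothetical, and a TreeSet stands in for the sorted rows of a region. After emitting a prefix, the sketch seeks straight to the first key that sorts after every key sharing that prefix, instead of reading those keys one by one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class DistinctPrefixSketch {
    // Hypothetical field separator for this sketch; not Phoenix's actual key encoding.
    static final char SEP = '\u0000';

    /** Builds a row key by joining its fields with SEP. */
    static String key(String... fields) {
        return String.join(String.valueOf(SEP), fields);
    }

    /** Returns the first `fields` fields of `key`, without the trailing separator. */
    static String prefix(String key, int fields) {
        int end = -1;
        for (int i = 0; i < fields; i++) {
            end = key.indexOf(SEP, end + 1);
            if (end < 0) {
                throw new IllegalArgumentException("prefix must be shorter than the key");
            }
        }
        return key.substring(0, end);
    }

    /**
     * Collects the distinct prefixes of the sorted keys. seeks[0] counts how many
     * seeks were issued, which is one per distinct prefix rather than one per row.
     */
    static List<String> distinctPrefixes(TreeSet<String> keys, int fields, int[] seeks) {
        List<String> out = new ArrayList<>();
        String cur = keys.isEmpty() ? null : keys.first();
        while (cur != null) {
            String p = prefix(cur, fields);
            out.add(p);
            // Seek hint: the smallest string that sorts after every key starting with p + SEP.
            String hint = p + (char) (SEP + 1);
            cur = keys.ceiling(hint);
            seeks[0]++;
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>();
        keys.add(key("a", "2016-01-01"));
        keys.add(key("a", "2016-01-02"));
        keys.add(key("b", "2016-01-01"));
        keys.add(key("c", "2016-01-01"));
        keys.add(key("c", "2016-01-02"));
        int[] seeks = new int[1];
        System.out.println(distinctPrefixes(keys, 1, seeks) + " seeks=" + seeks[0]);
    }
}
```

The sketch assumes field values never contain the separator, which is why a fixed number of leading fields is easy to cut out of the key here.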

* With "reverse" do you mean an explicit ORDER BY? Or the key being declared 
reversed because it was defined as such in the table? In either case everything 
should be fine. In case of ORDER BY it is sorted after the fact. If declared in 
the table, the key is reversed.
* The WHERE example works fine, since the distinct filter is last. I.e. the 
WHERE is evaluated first and its result is then post-filtered by the distinct 
filter, so the distinct filter operates on the filtered values. That actually 
has the interesting effect that the benefit of the distinct filter is reduced 
the more selective the WHERE clause is.
* I don't quite follow the RCV example. Why would that not just work? {{cols < 
plan.getTableRef().getTable().getRowKeySchema().getFieldCount()}} is just an 
extra optimization to avoid the filter in that case.
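
One way to picture the WHERE point above is the small simulation below. It rests on an assumption that the comment does not spell out: a row rejected by the WHERE never reaches the distinct filter, so it cannot trigger a skip and the scan has to step over it. The Row class, the column names, and the visit counter are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class WhereThenDistinctSketch {
    /** Hypothetical row: leading key column a, plus a column v tested by the WHERE. */
    static final class Row {
        final String a;
        final int v;
        Row(String a, int v) { this.a = a; this.v = v; }
    }

    /**
     * Scans rows sorted by a. The WHERE predicate runs first; only rows that pass
     * reach the "distinct" step, which emits a and skips the rest of that prefix.
     * visited[0] counts the rows the scan had to look at.
     */
    static List<String> scan(List<Row> rows, Predicate<Row> where, int[] visited) {
        List<String> distinct = new ArrayList<>();
        int i = 0;
        while (i < rows.size()) {
            Row row = rows.get(i);
            visited[0]++;
            if (!where.test(row)) {
                i++; // rejected by WHERE: no skip is possible, move to the next row
                continue;
            }
            distinct.add(row.a);
            // Simulated seek: jump to the first row with a different value of a.
            while (i < rows.size() && rows.get(i).a.equals(row.a)) {
                i++;
            }
        }
        return distinct;
    }

    /** Three values of a, each with four rows v = 0..3. */
    static List<Row> sample() {
        List<Row> rows = new ArrayList<>();
        for (String a : new String[] {"a", "b", "c"}) {
            for (int v = 0; v < 4; v++) {
                rows.add(new Row(a, v));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        int[] all = new int[1];
        int[] selective = new int[1];
        System.out.println(scan(sample(), r -> true, all) + " visited=" + all[0]);
        System.out.println(scan(sample(), r -> r.v == 3, selective) + " visited=" + selective[0]);
    }
}
```

In this toy data set both scans return the same distinct values, but the selective predicate forces the scan to look at every row before it finds one that passes, so there is nothing left to skip.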


> Use skip scan when SELECT DISTINCT on leading row key column(s)
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-258
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-258
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: ryang-sfdc
>            Assignee: Lars Hofhansl
>             Fix For: 4.8.0
>
>         Attachments: 258-WIP.txt, 258-v1.txt, 258-v2.txt, 258-v3.txt, 
> 258-v4.txt, 258-v5.txt, 258-v6.txt, 258-v7.txt, 258.txt, 
> DistinctFixedPrefixFilter.java, in-clause.png
>
>
> create table(a varchar(32) not null, date date not null constraint pk primary 
> key(a,date))
> [["PLAN"],["CLIENT PARALLEL 94-WAY FULL SCAN OVER foo"],["    SERVER 
> AGGREGATE INTO ORDERED DISTINCT ROWS BY [a]"],["CLIENT MERGE SORT"]]          
>    
> We should skip scan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
