[ 
https://issues.apache.org/jira/browse/KUDU-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Berkeley updated KUDU-2567:
--------------------------------
    Description: 
The rowset tree only supports culling rowsets when there's both a PK upper 
bound and a PK lower bound. So, for example, on a tablet partitioned by arrival 
time, doing a query for all rows that arrived since yesterday involves creating 
iterators for every rowset, instead of only the rowsets that satisfy the 
primary key bound. Normally, this isn't such a big deal since the scan will 
immediately see from the key index that the rowset doesn't have any results, 
but in some cases (for example, when KUDU-1400 has left many small rowsets), 
the time spent opening extra rowsets can make the initial scan request take a 
long time.

It should be fairly straightforward to enhance the rowset tree to handle 
intervals open on either end.
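
As a rough illustration of the idea (a sketch only, not Kudu's actual rowset 
tree code), culling with an optional bound on either end might look like the 
following. The RowSet class, the cull_rowsets function, and the integer keys 
are hypothetical names chosen for this example, and the sketch assumes an 
inclusive lower bound and an exclusive upper bound. It uses a plain linear 
filter rather than an interval tree, to keep the bound handling visible.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RowSet:
    """Hypothetical stand-in for a rowset: a name plus its inclusive PK range."""
    name: str
    min_key: int
    max_key: int


def cull_rowsets(rowsets: List[RowSet],
                 lower: Optional[int] = None,
                 upper: Optional[int] = None) -> List[RowSet]:
    """Return the rowsets whose key range may intersect [lower, upper).

    A bound of None means the query interval is open on that end.
    Assumption: lower is inclusive and upper is exclusive.
    """
    kept = []
    for rs in rowsets:
        # Rowset lies entirely below the lower bound: cull it.
        if lower is not None and rs.max_key < lower:
            continue
        # Rowset lies entirely at or above the exclusive upper bound: cull it.
        if upper is not None and rs.min_key >= upper:
            continue
        kept.append(rs)
    return kept


# Keys stand in for arrival times; rs1 and rs2 overlap on purpose.
rowsets = [
    RowSet("rs0", 0, 99),
    RowSet("rs1", 100, 199),
    RowSet("rs2", 150, 299),
]

# "All rows that arrived since key 200": only a lower bound is given.
since = cull_rowsets(rowsets, lower=200)

# Only an upper bound is given.
before = cull_rowsets(rowsets, upper=100)
```

With only a lower bound, the scan would open iterators for the rowsets in 
`since` instead of every rowset in the tablet. A real implementation would 
presumably answer the same question through the tree's interval index rather 
than a linear pass.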

  was:
The rowset tree only supports culling rowsets when there's both a PK upper 
bound and a PK lower bound. So, for example, on a tablet partitioned by arrival 
time, doing a query for all rows that arrived since yesterday involves creating 
iterators for every rowset, instead of only the rowsets that satisfy the 
primary key bound. Normally, this isn't such a big deal such the scan will 
immediately see from the key index that the rowset doesn't have any results, 
but in some cases (like if due to KUDU-1400 there are a lot of small rowsets), 
the time spent opening extra rowsets can make the initial scan request take a 
long time.

It should be fairly straightforward to enhance the rowset tree to handle 
intervals open on either end.


> Cull rowsets for open-ended queries
> -----------------------------------
>
>                 Key: KUDU-2567
>                 URL: https://issues.apache.org/jira/browse/KUDU-2567
>             Project: Kudu
>          Issue Type: Improvement
>          Components: tablet
>    Affects Versions: 1.7.1
>            Reporter: Will Berkeley
>            Priority: Major
>
> The rowset tree only supports culling rowsets when there's both a PK upper 
> bound and a PK lower bound. So, for example, on a tablet partitioned by 
> arrival time, doing a query for all rows that arrived since yesterday 
> involves creating iterators for every rowset, instead of only the rowsets 
> that satisfy the primary key bound. Normally, this isn't such a big deal 
> since the scan will immediately see from the key index that the rowset 
> doesn't have any results, but in some cases (for example, when KUDU-1400 
> has left many small rowsets), the time spent opening extra rowsets can make 
> the initial scan request take a long time.
> It should be fairly straightforward to enhance the rowset tree to handle 
> intervals open on either end.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
