[ 
https://issues.apache.org/jira/browse/DRILL-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997866#comment-14997866
 ] 

Julian Hyde commented on DRILL-3929:
------------------------------------

[~amansinha100], your key argument against modeling indexes as Calcite 
MVs is that it will "increase the search space". I do acknowledge that managing 
the planner's search space is a problem in Calcite. It is in every query 
planner.

But secondary indexes inflate the search space because they create so many more 
possibilities for execution plans. This is good! 

You state "External secondary indexes can be of two types: covering index and 
non-covering index". Phoenix also has local and global indexes. Some systems 
have hash indexes. Vertica and Druid have sorted projection tables. These are 
all forms of index, and there are more kinds of index that I haven't thought of 
or haven't been invented yet. They can all be modeled as MVs, then chosen based 
on cost, but I think your scheme would run out of road very quickly if the 
requirements were changed.
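
To make the "model everything as a relation, then choose on cost" idea concrete, 
here is a minimal sketch. It is not Calcite's or Drill's API: the Relation 
class, the cost function, the lookup_cost constant and the table/index names 
are all hypothetical, and the cost model is a toy. The point is only that a 
covering index, a non-covering index and the base table go through the same 
code path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A base table, or any index modeled as a materialized view (hypothetical)."""
    name: str
    columns: frozenset  # columns stored in this relation
    sorted_on: tuple    # sort key, () if unsorted
    rows: int           # estimated row count


def cost(rel, needed, filter_col, selectivity, lookup_cost=10.0):
    """Toy cost: rows read, plus a lookup per matching row if rel does not cover the query."""
    if filter_col not in rel.columns:
        return float("inf")  # this relation cannot evaluate the filter at all
    # A relation sorted on the filter column can seek; otherwise it is scanned in full.
    can_seek = rel.sorted_on[:1] == (filter_col,)
    rows_read = rel.rows * selectivity if can_seek else rel.rows
    if needed <= rel.columns:
        return rows_read  # covering: no trip back to the base table
    # Non-covering: assume each matching row costs one lookup into the base table.
    return rows_read + rel.rows * selectivity * lookup_cost


def choose(relations, needed, filter_col, selectivity):
    """Pick the cheapest relation; no per-index-type rules are involved."""
    return min(relations, key=lambda r: cost(r, needed, filter_col, selectivity))


candidates = [
    Relation("t", frozenset({"a", "b", "c"}), (), 1_000_000),
    Relation("idx_a_b", frozenset({"a", "b"}), ("a",), 1_000_000),
    Relation("idx_a", frozenset({"a"}), ("a",), 1_000_000),
]

# Selective filter, query covered by idx_a_b: the covering index wins.
selective = choose(candidates, frozenset({"a", "b"}), "a", 0.001)
# Unselective filter needing column c: lookups are too expensive, the base table wins.
unselective = choose(candidates, frozenset({"a", "c"}), "a", 0.5)
```

A new kind of index only needs to show up as another Relation with an honest 
cost estimate; nothing in choose() has to change.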

Also, consider the ways that a query can use several indexes. Some types of 
indexes, in particular bitmap indexes on the same table, can be intersected and 
unioned before generating a stream of rowids into the table scan. A rule-based 
approach would have difficulty choosing the best valid combination of indexes.
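
As a rough illustration of combining bitmap indexes before touching the table, 
the sketch below uses plain Python integers as bitmaps, with bit i standing for 
rowid i. The helper names and the sample rows are made up for the example; a 
real engine would use compressed bitmaps.

```python
def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int whose set bits are the matching rowids."""
    index = {}
    for rowid, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << rowid)
    return index


def rowids(bitmap):
    """Yield the rowids (bit positions) set in the bitmap, in ascending order."""
    rowid = 0
    while bitmap:
        if bitmap & 1:
            yield rowid
        bitmap >>= 1
        rowid += 1


rows = [
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "M"},
    {"color": "red", "size": "M"},
    {"color": "green", "size": "M"},
    {"color": "blue", "size": "S"},
]

by_color = build_bitmap_index(rows, "color")
by_size = build_bitmap_index(rows, "size")

# WHERE (color = 'red' OR color = 'blue') AND size = 'M'
# Union the two color bitmaps, intersect with the size bitmap, then scan by rowid.
combined = (by_color["red"] | by_color["blue"]) & by_size["M"]
matching = list(rowids(combined))
```

Only the rowids left in the combined bitmap are fed to the table scan. Which 
indexes to combine, and in what order, is a cost question that grows quickly 
with the number of available indexes.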

Lastly, consider summary tables, which I am sure Drill will use at some point. 
Summary tables are a kind of index (similar to sort-project index with optional 
aggregate), but summary tables can have indexes too! If you model summary 
tables and indexes as different concepts from each other and from base 
relations, your search space becomes not larger, but a lot more complicated. 
Pragmatically that means that rules you have written for recognizing indexes on 
base tables won't work for indexes on summary tables; and you will have to 
write special rules that treat a sorted summary table as a non-covering index.

We should not use Volcano, with all rules enabled simultaneously, to optimize 
these queries; the search space will be too large. But by not modeling indexes 
as what they are -- relations containing useful denormalized data in a useful 
physical layout -- we are turning our back on many of the possibilities that 
they offer.

> Support the ability to query database tables using external indices           
> ------------------------------------------------------------------------------
>
>                 Key: DRILL-3929
>                 URL: https://issues.apache.org/jira/browse/DRILL-3929
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Execution - Relational Operators, Query Planning & 
> Optimization
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>
> This is a placeholder for adding support in Drill to query database tables 
> using external indices.  I will add more details about the use case and a 
> preliminary design proposal.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
