[
https://issues.apache.org/jira/browse/DRILL-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997866#comment-14997866
]
Julian Hyde commented on DRILL-3929:
------------------------------------
[~amansinha100], your key argument against modeling indexes as Calcite
MVs is that it will "increase the search space". I do acknowledge that managing
the planner's search space is a problem in Calcite. It is in every query
planner.
But secondary indexes inflate the search space because they create so many more
possibilities for execution plans. This is good!
You state "External secondary indexes can be of two types: covering index and
non-covering index". Phoenix also has local and global indexes. Some systems
have hash indexes. Vertica and Druid have sorted projection tables. These are
all forms of index, and there are more kinds of index that I haven't thought of
or haven't been invented yet. They can all be modeled as MVs, then chosen based
on cost, but I think your scheme would run out of road very quickly if the
requirements were changed.
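To illustrate what "modeled as MVs, then chosen based on cost" could look like, here is a minimal sketch in Python. It is not Calcite or Drill code: the Relation class, the toy cost function, and the table names are all hypothetical. It only handles covering relations; a non-covering index would also need a join back to the base table, which the sketch leaves out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A base table, index, or sorted projection, all modeled the same way."""
    name: str
    columns: frozenset
    sort_key: tuple   # columns the data is sorted on, leading column first
    row_count: int


def scan_cost(rel, needed_columns, filter_column, selectivity):
    """Toy cost: estimated rows read, or None if rel cannot answer the query."""
    if not needed_columns <= rel.columns:
        return None  # not covering; a real planner would consider a join back
    if rel.sort_key and rel.sort_key[0] == filter_column:
        return rel.row_count * selectivity  # range scan on the sort key
    return float(rel.row_count)  # full scan


def choose_cheapest(relations, needed_columns, filter_column, selectivity):
    """Return the relation with the lowest toy cost, or None if none qualifies."""
    costed = [(scan_cost(r, needed_columns, filter_column, selectivity), r)
              for r in relations]
    costed = [(cost, r) for cost, r in costed if cost is not None]
    return min(costed, key=lambda pair: pair[0])[1] if costed else None


# Hypothetical catalog: one base table and two index-like relations.
catalog = [
    Relation("emp", frozenset({"id", "name", "dept", "salary"}), ("id",), 1_000_000),
    Relation("emp_by_dept", frozenset({"dept", "id"}), ("dept", "id"), 1_000_000),
    Relation("emp_dept_salary", frozenset({"dept", "salary"}), ("dept",), 1_000_000),
]

# SELECT dept, salary FROM emp WHERE dept = ?  (assume 1% of rows match)
best = choose_cheapest(catalog, frozenset({"dept", "salary"}), "dept", 0.01)
```

The point of the sketch is that adding a new kind of index means adding another Relation to the catalog, not another rule.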
Also, consider the ways that a query can use several indexes. Some types of
indexes, in particular bitmap indexes on the same table, can be intersected and
unioned before generating a stream of rowids into the table scan. A rule-based
approach would have difficulty choosing the best valid combination of indexes.
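As a rough illustration of combining bitmap indexes before touching the table, here is a small Python sketch. It uses plain integers as bitmaps, and the index names and row ids are made up for the example; a real engine would use compressed bitmaps.

```python
def bitmap_from_rowids(rowids):
    """Build a bitmap (as a Python int) with one bit set per row id."""
    bits = 0
    for rid in rowids:
        bits |= 1 << rid
    return bits


def rowids_from_bitmap(bits):
    """Yield the row ids whose bits are set, in ascending order."""
    rid = 0
    while bits:
        if bits & 1:
            yield rid
        bits >>= 1
        rid += 1


# Hypothetical bitmap indexes on the same table, one per column value.
color_red = bitmap_from_rowids([0, 2, 5, 7])
color_blue = bitmap_from_rowids([1, 3])
size_large = bitmap_from_rowids([2, 3, 4, 7])

# WHERE (color = 'red' OR color = 'blue') AND size = 'large'
# Union the two color bitmaps, then intersect with the size bitmap.
matching = (color_red | color_blue) & size_large

# The resulting stream of row ids is what would feed the table scan.
rowids = list(rowids_from_bitmap(matching))
```

Which indexes to combine, and in what order, is itself a choice among alternatives, which is why I would rather leave it to a cost-based search.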
Lastly, consider summary tables, which I am sure Drill will use at some point.
Summary tables are a kind of index (similar to sort-project index with optional
aggregate), but summary tables can have indexes too! If you model summary
tables and indexes as different concepts from each other and from base
relations, your search space has not become larger, but it has become a lot
more complicated.
Pragmatically that means that rules you have written for recognizing indexes on
base tables won't work for indexes on summary tables; and you will have to
write special rules that treat a sorted summary table as a non-covering index.
We should not use Volcano, with all rules enabled simultaneously, to optimize
these queries; the search space will be too large. But by not modeling indexes
as what they are -- relations containing useful denormalized data in a useful
physical layout -- we are turning our back on many of the possibilities that
they offer.
> Support the ability to query database tables using external indices
> ------------------------------------------------------------------------------
>
> Key: DRILL-3929
> URL: https://issues.apache.org/jira/browse/DRILL-3929
> Project: Apache Drill
> Issue Type: New Feature
> Components: Execution - Relational Operators, Query Planning &
> Optimization
> Reporter: Aman Sinha
> Assignee: Aman Sinha
>
> This is a placeholder for adding support in Drill to query database tables
> using external indices. I will add more details about the use case and a
> preliminary design proposal.