[
https://issues.apache.org/jira/browse/DRILL-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959212#comment-14959212
]
Aman Sinha commented on DRILL-3929:
-----------------------------------
> As a side point on this, I also think we need to fix the HBase pushdown so
> it behaves more like the JDBC plugin
Yes, avoiding the checks to determine the multimode patterns is not ideal.
[~jnadeau], do you want to create a JIRA for it?
Regarding the Phoenix approach, there are a few considerations:
(1) Is Phoenix registering an alternative physical plan or alternative SQL? I
think it is the latter (SQL). There are pros and cons:
(a) Covering index (all columns are available in the index): The SQL
approach could work, since the original query 'SELECT * FROM T
WHERE index_col < 10' can be rewritten to use the index only.
(b) The general case of a non-covering index: In such cases we may
retrieve only the rowid/rowkey from the index, so we have to join back
to the original table to retrieve the rest of the columns. This should ideally
be done at the physical planning level rather than by trying to express such
semantics in SQL.
(c) Doing something at the SQL or even at the logical planning level
means that the search space will increase due to treating a materialized
view/index as a separate table, putting it in the same equivalence class
as the original table.
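To make the difference between cases (a) and (b) concrete, here is a minimal
Python sketch of the two access patterns for 'SELECT * FROM T WHERE
index_col < 10'. It is a toy in-memory model, not Drill or Phoenix code: the
table contents and the names build_index, covering_lookup and
non_covering_lookup are hypothetical and exist only for illustration.

```python
# Hypothetical toy data: a base table keyed by rowkey.
TABLE = {
    "r1": {"index_col": 3, "name": "a", "city": "x"},
    "r2": {"index_col": 12, "name": "b", "city": "y"},
    "r3": {"index_col": 7, "name": "c", "city": "z"},
}


def build_index(table, key_col, included_cols=()):
    """Build index entries (key value, rowkey, included columns), sorted by key."""
    entries = [
        (row[key_col], rowkey, {c: row[c] for c in included_cols})
        for rowkey, row in table.items()
    ]
    return sorted(entries, key=lambda e: (e[0], e[1]))


def covering_lookup(index, key_col, upper):
    """Case (a): answer the query from the index alone, never touching the table."""
    return [
        {key_col: key, **included}
        for key, _rowkey, included in index
        if key < upper
    ]


def non_covering_lookup(index, table, upper):
    """Case (b): read only rowkeys from the index, then join back to the table."""
    rowkeys = [rowkey for key, rowkey, _included in index if key < upper]
    return [table[rowkey] for rowkey in rowkeys]


# A covering index carries every column; a non-covering one carries only the key.
covering_index = build_index(TABLE, "index_col", ("name", "city"))
plain_index = build_index(TABLE, "index_col")

rows_a = covering_lookup(covering_index, "index_col", 10)
rows_b = non_covering_lookup(plain_index, TABLE, 10)
```

Both paths return the same rows, but the second one needs a per-rowkey fetch
from the base table, which is the join-back step that is hard to express
purely in SQL.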
(2) Costing and statistics:
(a) Index lookups have a random I/O pattern compared to table scans,
so they must be costed differently. I am not sure how to even model
the cost of external secondary indexes such as Elastic or Lucene.
Phoenix secondary indexes for HBase are more 'native' so they could
have a decent cost model.
(b) In order to generate an index scan plan, I would think Phoenix might
rely on filter selectivity estimates. However, this statistic is not always
available and is non-trivial to compute for complex predicates.
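As a rough illustration of points (2)(a) and (2)(b), here is a minimal Python
sketch of how a planner could compare a table scan against an index scan
using a selectivity estimate. This is not the Drill or Phoenix cost model: the
function names and the per-row cost constants (random I/O assumed to be 10x
sequential) are assumptions chosen only to show the shape of the trade-off.

```python
def table_scan_cost(row_count, seq_cost_per_row=1.0):
    """Sequential scan: every row is read once at sequential I/O cost."""
    return row_count * seq_cost_per_row


def index_scan_cost(row_count, selectivity,
                    index_cost_per_row=1.0, random_cost_per_row=10.0):
    """Index scan: each matching row costs one index read plus one random
    I/O to fetch the remaining columns from the base table."""
    matching_rows = row_count * selectivity
    return matching_rows * (index_cost_per_row + random_cost_per_row)


def choose_access_path(row_count, selectivity):
    """Pick the cheaper path; fall back to a table scan when no
    selectivity estimate is available (selectivity is None)."""
    if selectivity is None:
        return "table_scan"
    if index_scan_cost(row_count, selectivity) < table_scan_cost(row_count):
        return "index_scan"
    return "table_scan"


selective = choose_access_path(1_000_000, 0.01)    # highly selective filter
unselective = choose_access_path(1_000_000, 0.5)   # filter keeps half the rows
unknown = choose_access_path(1_000_000, None)      # no statistics available
```

With these assumed constants the index scan only wins when the filter keeps
fewer than about 1 in 11 rows, which is why a missing or poor selectivity
estimate can easily lead to the wrong plan.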
If you have thoughts about these, let me know. I would like to
understand the Phoenix approach a little better. Perhaps a Google
Hangout would help.
> Support the ability to query database tables using external indices
> ------------------------------------------------------------------------------
>
> Key: DRILL-3929
> URL: https://issues.apache.org/jira/browse/DRILL-3929
> Project: Apache Drill
> Issue Type: New Feature
> Components: Execution - Relational Operators, Query Planning &
> Optimization
> Reporter: Aman Sinha
> Assignee: Aman Sinha
>
> This is a placeholder for adding support in Drill to query database tables
> using external indices. I will add more details about the use case and a
> preliminary design proposal.