[
https://issues.apache.org/jira/browse/DRILL-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060359#comment-15060359
]
Jacques Nadeau commented on DRILL-3929:
---------------------------------------
I feel like there are a couple different things being discussed here.
1. Planning Meta Issues
   a. How to best partition different phases of planning
   b. How to register alternative algebras for a particular expression
   c. Planning time as we fire more rules
2. Secondary Index Application Issues
   a. What are the best initial transformations to improve query speed when
      using a secondary index
For 2.a, I spent a little time generating some example transformations [1], as I
don't fully understand the first goal that we're trying to achieve for this
JIRA and the associated Gist. I believe that goal is C.alt2 in my doc. Note that
I don't try to address the colocated vs. non-colocated index concepts that James
presents in his comments, as I think those are (mostly) more about costing than
algebra.
For 1.a, I think we should talk about sets of rules rather than labels. As
Julian points out above, the meaning of the labels can be different for
different people. It might be helpful for us to have a very simple shared doc
that describes the phases of planning Drill goes through, since I think that
may be part of the disconnect.
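
As a rough illustration of describing phases by their rule sets rather than by labels, here is a minimal sketch. The phase descriptions and rule names are placeholders invented for the example; they are not Drill's actual planning phases or rule sets.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Phase:
    """One planning phase, identified by the rules it fires, not by a label."""
    description: str
    rules: Tuple[str, ...]


# Hypothetical phases and rule names, for illustration only.
PHASES = (
    Phase("simplify the logical plan",
          ("ExampleFilterMergeRule", "ExampleProjectMergeRule")),
    Phase("push work into scans",
          ("ExampleFilterIntoScanRule", "ExamplePartitionPruneRule")),
    Phase("choose physical operators",
          ("ExampleHashJoinRule", "ExampleMergeJoinRule")),
)


def phases_firing(rule: str) -> List[int]:
    """Return the zero-based positions of every phase that fires `rule`."""
    return [i for i, phase in enumerate(PHASES) if rule in phase.rules]


def describe(phases) -> str:
    """Render the phases as a short plain-text listing for a shared doc."""
    lines = []
    for number, phase in enumerate(phases, start=1):
        lines.append(f"Phase {number}: {phase.description}")
        for rule in phase.rules:
            lines.append(f"  - {rule}")
    return "\n".join(lines)
```

Listing the rules per phase makes it possible to ask concrete questions, such as which phases fire a given rule, without first agreeing on what a label like "logical" or "physical" means.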
For 1.b, as I look at my doc, I start to agree that having to register
alternatives through SQL seems a bit weird. Some of these transformations are
pretty complicated, and trying to write SQL that expresses the alternative plan
seems very difficult. In general though, the primary goal from my perspective is
simply to use the planning engine to register and leverage the alternative. As
I understand it, Materialized Views are mostly just a special rule that fires
at the beginning of planning and creates alternatives to tables by themselves.
I do recall [~julianhyde] stating that there were some 'special things' that
had to be done to make things work efficiently. My question would be whether
those 'special things' could be made into generic things, so we avoid having
to use materialized views to present alternative algebras.
For 1.c: I actually think this is the biggest issue here. In general, I think
it is a meta-issue. I think the Drill team has created far too many planning
stages because we're constantly challenged by planning performance. Do we have
a good benchmark for planning time? Is Calcite too slow, or is Drill using
Calcite inefficiently (most likely both)? I worry that
we're jumping through a bunch of hoops in Drill rather than trying to figure
out what the disconnect is (general planning performance and specifically the
issue of trait propagation). Maybe we should start by fixing those things; then
the Drill team wouldn't need to be so cautious about using Calcite features (or
keep working around all the richness in Calcite).
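
To make the benchmark question concrete, here is a minimal sketch of a planning-time harness. It assumes a hypothetical callable `plan_fn` that plans a query without executing it (for example, a wrapper that submits an explain-plan request); neither `plan_fn` nor the query list comes from Drill or Calcite.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def benchmark_planning(plan_fn: Callable[[str], object],
                       queries: Iterable[str],
                       repeats: int = 5) -> Dict[str, float]:
    """Return the median wall-clock planning time in seconds for each query.

    `plan_fn` is a hypothetical hook that must plan the query but not run it.
    """
    results = {}
    for query in queries:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            plan_fn(query)
            samples.append(time.perf_counter() - start)
        # The median is less sensitive than the mean to one-off pauses.
        results[query] = statistics.median(samples)
    return results
```

Running the same query set with different rule sets enabled would show how planning time grows as more rules fire, which is the concern raised in 1.c.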
[1] https://docs.google.com/presentation/d/1kPVDF2hyxAI0aX1GhLLA3c5u577bFQLgg8YcaFPe0VI/edit#slide=id.p9
> Support the ability to query database tables using external indices
> ------------------------------------------------------------------------------
>
> Key: DRILL-3929
> URL: https://issues.apache.org/jira/browse/DRILL-3929
> Project: Apache Drill
> Issue Type: New Feature
> Components: Execution - Relational Operators, Query Planning &
> Optimization
> Reporter: Aman Sinha
> Assignee: Aman Sinha
>
> This is a placeholder for adding support in Drill to query database tables
> using external indices. I will add more details about the use case and a
> preliminary design proposal.