[ 
https://issues.apache.org/jira/browse/DRILL-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060359#comment-15060359
 ] 

Jacques Nadeau commented on DRILL-3929:
---------------------------------------

I feel like there are a couple of different things being discussed here.

1. Planning Meta Issues
  a. How to best partition different phases of planning
  b. How to register alternative algebras for a particular expression
  c. Planning time as we fire more rules

2. Secondary Index Application Issues
  a. What are the best initial transformations to improve query speed when 
using a secondary index

For 2.a, I spent a little time generating some example transformations [1], as 
I don't fully understand the first goal that we're trying to achieve for this 
JIRA and the associated Gist. I believe that goal is C.alt2 from my doc. Note 
that I don't even try to address the colocated vs. non-colocated index concepts 
that James presents in his comments, as I think those are more about costing 
than algebra (mostly).

For 1.a, I think we should talk about sets of rules rather than labels. As 
Julian points out above, the meaning of the labels can be different for 
different people. It might be helpful for us to have a very simple shared doc 
that describes the phases of planning Drill goes through, since I think that 
may be part of the disconnect.

For 1.b: As I look at my doc, I start to agree that having to register 
alternatives through SQL seems a bit weird. Some of these transformations are 
pretty complicated. Trying to create SQL to provide the alternative plan seems 
very difficult. In general though, the primary goal from my perspective is 
simply to use the planning engine to register and leverage the alternative. As 
I understand it, Materialized Views are mostly just a special rule that fires 
at the beginning of planning that creates alternatives to tables by themselves. 
I do recall that [~julianhyde] did state there were some 'special things' that 
had to be done to make things work efficiently. My question would be whether 
those 'special things' could be made generic, so that we avoid having to use 
materialized views to present alternative algebras.
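
To make the "register an alternative" idea concrete, here is a toy sketch in 
Python. It is not Calcite's or Drill's API: ToyPlanner, Plan, the expression 
string, and the cost numbers are all made up for illustration. It only shows 
the shape of the idea, which is that equivalent plans for one expression sit 
in the same set and the planner picks among them by cost.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Plan:
    """One candidate plan for an expression (hypothetical, illustration only)."""
    description: str
    cost: float


@dataclass
class ToyPlanner:
    """Keeps every plan known to be equivalent to a given logical expression."""
    alternatives: dict = field(default_factory=dict)

    def register(self, expr: str, plan: Plan) -> None:
        # Add the plan to the equivalence set for this expression.
        self.alternatives.setdefault(expr, []).append(plan)

    def best(self, expr: str) -> Plan:
        # Choose the cheapest registered alternative.
        return min(self.alternatives[expr], key=lambda p: p.cost)


# Made-up expression and costs, purely to exercise the sketch.
EXPR = "scan(T) where a = 5"

planner = ToyPlanner()
planner.register(EXPR, Plan("full table scan + filter", 1000.0))
planner.register(EXPR, Plan("index lookup on a, then fetch rows from T", 50.0))

chosen = planner.best(EXPR)
```

In this sketch the alternative is registered directly as a plan rather than 
being expressed as SQL first, which is the point of the question above.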

For 1.c: I actually think this is the biggest issue here. In general, I think 
it is a meta-issue. I think the Drill team has created far too many planning 
stages because we're constantly challenged by planning performance. Do we have 
a good benchmark for planning time? Is Calcite too slow or is Drill using 
Calcite in an inefficient manner (most likely both are issues)? I worry that 
we're jumping through a bunch of hoops in Drill rather than trying to figure 
out what the disconnect is (general planning performance and specifically the 
issue of trait propagation). Maybe we should start by fixing those things, and 
then the Drill team wouldn't be so cautious about using Calcite features (and 
wouldn't need to work around all the richness in Calcite).

[1] 
https://docs.google.com/presentation/d/1kPVDF2hyxAI0aX1GhLLA3c5u577bFQLgg8YcaFPe0VI/edit#slide=id.p9





> Support the ability to query database tables using external indices           
> ------------------------------------------------------------------------------
>
>                 Key: DRILL-3929
>                 URL: https://issues.apache.org/jira/browse/DRILL-3929
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Execution - Relational Operators, Query Planning & 
> Optimization
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>
> This is a placeholder for adding support in Drill to query database tables 
> using external indices.  I will add more details about the use case and a 
> preliminary design proposal.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
