[jira] [Comment Edited] (CALCITE-1731) Rewriting of queries using materialized views with joins and aggregates

Julian Hyde (JIRA) Thu, 30 Mar 2017 15:18:51 -0700

    [ 
https://issues.apache.org/jira/browse/CALCITE-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949912#comment-15949912
 ]


Julian Hyde edited comment on CALCITE-1731 at 3/30/17 10:18 PM:
----------------------------------------------------------------

* Lots of blank comments. Can you fill them out? Be sure to use {{<p>}} 
paragraph markers.
* Are you sure about the residue of {{x = 1 OR z = 3}}? (I get confused about 
residues. Is there a standard term for them?)
* I worry about dots and quotes in qualified table names. Is the qualified 
table name in {{RelTableRef}} fully escaped? Use a list of strings instead?
* {{RelTableRef.identifier}} is not a good name for an {{int}} field.
* Are {{RelTableRef}} and {{RexInputTableRef}} likely to occur in all phases of 
planning, or just a particular phase? If {{RexInputTableRef}} can occur 
throughout planning it will be a huge burden. The javadoc needs to spell out 
its purpose and lifecycle much more clearly.
* It seems that you have made several of the {{RexNode}} methods in 
{{SubstitutionVisitor}} smarter. Can they be unit tested? 
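
A minimal sketch of the concern about dots and quotes in qualified table names. The class and method names here ({{QualifiedNameSketch}}, {{joinWithDots}}, {{joinEscaped}}) are hypothetical and are not part of Calcite or of the patch under review. The sketch only shows that two different name paths collide when flattened into a dotted string without escaping, whereas a list of strings, or a fully quoted rendering, keeps them distinct.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Calcite code.
final class QualifiedNameSketch {

  /** Naive rendering: joins the name parts with dots and does no escaping. */
  static String joinWithDots(List<String> parts) {
    return String.join(".", parts);
  }

  /** Quotes every part and doubles embedded quotes, so parts stay separable. */
  static String joinEscaped(List<String> parts) {
    StringBuilder sb = new StringBuilder();
    for (String part : parts) {
      if (sb.length() > 0) {
        sb.append('.');
      }
      sb.append('"').append(part.replace("\"", "\"\"")).append('"');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // A two-part name whose table name contains a dot...
    List<String> a = Arrays.asList("sales", "emp.history");
    // ...and a three-part name built from the same characters.
    List<String> b = Arrays.asList("sales", "emp", "history");

    // The unescaped strings are identical, so the two tables are confused.
    System.out.println(joinWithDots(a).equals(joinWithDots(b)));

    // Compared as lists of strings, the names are different.
    System.out.println(a.equals(b));

    // The escaped renderings are also different.
    System.out.println(joinEscaped(a));
    System.out.println(joinEscaped(b));
  }
}
```

Keeping the name as a list avoids having to define and test an escaping scheme at all, which is the trade-off behind the "list of strings" suggestion above.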


was (Author: julianhyde):
* Lots of blank comments. Can you fill them out. Be sure to use {{<p>}} 
paragraph markers.
* Are RelTableRef and Rex
* Are you sure about the residue of {{x = 1 OR z = 3}}? (I get confused about 
residues. Is there a standard term for them?)
* I worry about dots and quotes in qualified table names. Is the qualified 
table name in RelTableRef fully escaped? Use a list of strings instead?
* RelTableRef.identifier is not a good name for an int field.
* Is RexInputTableRef likely to occur in all phases of planning, or just a 
particular phase? If it can occur throughout planning it will be a huge burden. 
The javadoc needs to spell out its purpose and lifecycle much more clearly.
* It seems that you have made several of the RexNode methods in 
SubstitutionVisitor smarter. Can they be unit tested? 

> Rewriting of queries using materialized views with joins and aggregates
> -----------------------------------------------------------------------
>
>                 Key: CALCITE-1731
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1731
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>             Fix For: 1.13.0
>
>
> The idea is still to build a rewriting approach similar to:
> ftp://ftp.cse.buffalo.edu/users/azhang/disc/SIGMOD/pdf-files/331/202-optimizing.pdf
> I tried to build on the CALCITE-1389 work. However, I ended up creating a 
> new, alternative rule. The main reason is that I wanted to follow the paper 
> more closely and not rely on triggering rules within the MV rewriting to find 
> whether expressions are equivalent. Instead, we extract information from the 
> query plan and the MV plans using the new metadata providers proposed in 
> CALCITE-1682, and then we use that information to validate and execute the 
> rewriting.
> I also implemented new unifying/rewriting logic within the rule, since the 
> existing unifying rules for aggregates assumed that aggregate inputs in 
> the query and the MV needed to be equivalent (same Volcano node). That 
> condition can be relaxed because we verify in the rule, by using the new 
> metadata providers as stated above, that the result for the query is 
> contained within the MV.
> I added multiple tests, but any feedback pointing to new tests that could be 
> added to check correctness/coverage is welcome.
> The algorithm can trigger multiple rewritings for the same query node. In 
> addition, multiple usages of the same table in the query/MVs are supported.
> A few extensions that will follow this issue:
> * Extend logic to filter relevant MVs for a given query node, so that the 
> approach scales as the number of MVs grows.
> * Produce rewritings using Union operators, e.g., a given query could be 
> partially answered from the MV (_year = 2014_) and from the query 
> (_not(year=2014)_). If the MV is stored e.g. in Druid, this rewriting might 
> be beneficial. As with the other rewritings, the decision on whether to 
> use the rewriting should be cost-based.
> * Currently the query and the MV must use the same tables. This logic can be 
> extended:
> - First, if the query uses a table that the MV does not, we can produce a 
> rewriting that joins the MV with that additional table (given that join keys 
> are available in the MV and we can compute all output columns).
> - Second, if the MV uses more tables than the query, we can recognize the 
> cardinality-preserving joins to just project columns out of the MV and use it 
> in the rewriting.
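
The Union-based extension listed above can be sketched in plain Java, outside Calcite. Everything in this example is an assumption made for illustration: the {{Sale}} row type, the sample data, and the method names are hypothetical, and the "MV" is just an in-memory list holding the rows with year = 2014. The sketch only checks that answering a query from the MV branch plus the _not(year=2014)_ branch of the base table gives the same result as answering it from the base table alone. Whether such a rewriting is worth using remains a cost-based decision that this sketch does not model.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not Calcite code.
final class UnionRewriteSketch {

  /** Hypothetical row type: one sale with a year and an amount. */
  static final class Sale {
    final int year;
    final int amount;

    Sale(int year, int amount) {
      this.year = year;
      this.amount = amount;
    }
  }

  /** Stand-in for the MV: the base rows that satisfy year = 2014. */
  static List<Sale> materialize(List<Sale> base) {
    return base.stream().filter(s -> s.year == 2014).collect(Collectors.toList());
  }

  /** Original query: total amount over the whole base table. */
  static long totalDirect(List<Sale> base) {
    return base.stream().mapToLong(s -> s.amount).sum();
  }

  /** Rewritten query: MV rows UNION ALL base rows with not(year = 2014). */
  static long totalViaUnion(List<Sale> mv, List<Sale> base) {
    Stream<Sale> fromMv = mv.stream();
    Stream<Sale> fromBase = base.stream().filter(s -> !(s.year == 2014));
    return Stream.concat(fromMv, fromBase).mapToLong(s -> s.amount).sum();
  }

  public static void main(String[] args) {
    List<Sale> base = Arrays.asList(
        new Sale(2013, 10), new Sale(2014, 20), new Sale(2014, 5), new Sale(2015, 7));
    List<Sale> mv = materialize(base);

    // Both plans should report the same total.
    System.out.println(totalDirect(base));
    System.out.println(totalViaUnion(mv, base));
  }
}
```

The two branches are disjoint by construction (one predicate is the negation of the other), which is why a plain concatenation is enough here and no duplicate elimination is needed.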



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
