Re: Materialized view rewriting

Maryann Xue Fri, 26 Feb 2016 08:39:32 -0800

Thank you for pointing out another way of defining materializations, Rajat!


We had some discussion again about this interface, and Julian opened a JIRA
https://issues.apache.org/jira/browse/CALCITE-1101.

The main problems are:

1. The life cycle of the materializations. In Phoenix (we don't know about
other projects, so more comments are welcome), materializations are used to
model secondary indices, which are in fact another type of Phoenix table.
So materializations for Phoenix should have exactly the same life cycle as
their enclosing PhoenixSchema, which is a snapshot of all current table
definitions as of the timestamp for a specific JDBC statement.

2. Right now, to call the defineMaterialization method, we have to go to a
whole lot of trouble to get the CalciteSchema object, which in fact should
be internal to Calcite code.

3. The right point in time for defining/creating those materialized views.
Whether for Quark or Phoenix, we need to make sure that we call
defineMaterialization at exactly the right point in time, which is after the
schema is loaded and before the planner tries to collect and use them.
Again, this had better be something taken care of by Calcite rather than
carefully maintained by the users.
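
To make the ordering constraint in point 3 concrete, here is a minimal
sketch in plain Java. It does not use Calcite at all: SchemaSnapshot,
finishLoading and collectMaterializations are hypothetical stand-ins, and
only the name defineMaterialization echoes the method discussed above (its
real signature is not reproduced here). The timestamp value is arbitrary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these types are Calcite or Phoenix classes. */
public class MaterializationLifecycleSketch {

    /** Stand-in for a schema snapshot that owns its materializations. */
    public static final class SchemaSnapshot {
        private final String name;
        private final long timestamp; // snapshot time; arbitrary in this sketch
        private final Map<String, String> materializations = new LinkedHashMap<>();
        private boolean loaded;

        public SchemaSnapshot(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }

        /** Marks the end of the schema/table resolving phase. */
        public void finishLoading() {
            loaded = true;
        }

        /** Registers a view; rejected if the schema is not loaded yet. */
        public void defineMaterialization(String viewName, String viewSql) {
            if (!loaded) {
                throw new IllegalStateException(
                    "schema " + name + "@" + timestamp + " is not loaded yet");
            }
            materializations.put(viewName, viewSql);
        }

        /** Names of the views registered on this snapshot, in insertion order. */
        public List<String> materializationNames() {
            return new ArrayList<>(materializations.keySet());
        }
    }

    /** Stand-in for the planner asking for the available materializations. */
    public static List<String> collectMaterializations(SchemaSnapshot schema) {
        return schema.materializationNames();
    }

    public static void main(String[] args) {
        SchemaSnapshot schema = new SchemaSnapshot("PHOENIX", 1L);
        // Step 1: the schema tree is fully loaded.
        schema.finishLoading();
        // Step 2: materializations are defined on the loaded snapshot.
        schema.defineMaterialization(
            "ORDERS_BY_CUSTOMER_IDX",
            "SELECT customer_id, order_id FROM ORDERS");
        // Step 3: only now does the "planner" collect them.
        System.out.println(collectMaterializations(schema));
    }
}
```

In this sketch, calling defineMaterialization before finishLoading() throws,
which is the kind of ordering mistake that a framework-managed life cycle
would rule out. Because the views live inside the snapshot object, they
also go away with it, which matches the life cycle asked for in point 1.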


Let's follow up on that JIRA though.



Thanks,
Maryann

On Fri, Feb 26, 2016 at 5:50 AM, Rajat Venkatesh <[email protected]>
wrote:

> In Quark, we don't use hooks to define materializations. We use a table
> factory [1] to defer until the schema is loaded [2].
>
> [1]
> https://github.com/qubole/quark/blob/master/optimizer/src/main/java/com/qubole/quark/planner/MetadataSchema.java#L85
> [2]
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/materialize/MaterializationService.java#L132
>
> Hopefully, I've understood your problem correctly.
>
>
> On Thu, Feb 25, 2016 at 10:16 PM Maryann Xue <[email protected]>
> wrote:
>
> > Hi Michael,
> >
> > We had a little difficulty defining our secondary indices as materialized
> > views with our Schema SPI implementation too, and this made the code
> > pretty hacky. In order to call that defineMaterialization method, we hold
> > the parent SchemaPlus object within our own Schema impl object so that we
> > can get its own corresponding SchemaPlus object by calling
> > "parentSchema.getSubSchema(this.name)" later on. We do this after the
> > schema/table resolving phase (that is when the entire schema tree incl.
> > your own schema objects have been initiated) and call
> > defineMaterialization for each secondary index under each subSchema. We
> > add a hook in "Hook.TRIMMED" for this, which sounds pretty weird, but this
> > is exactly a point after you have the whole schema tree ready and before
> > the materializations are asked for by the planner.
> >
> > Anyway, I do hope the interface can be modified to avoid all this trouble.
> >
> >
> > Thanks,
> > Maryann
> >
> > On Thu, Feb 25, 2016 at 9:24 AM, Michael Mior <[email protected]>
> wrote:
> >
> > > Any suggestions on the best place to hook in and add the materialized
> > > views? It seems like doing so requires the SchemaPlus object
> > > corresponding to the current schema. The current best approach I see is
> > > to save a reference to the parent schema and then pull out the
> > > appropriate SchemaPlus object in getTableMap. This seems like a bit of a
> > > hack though.
> > >
> > > --
> > > Michael Mior
> > > [email protected]
> > >
> > > 2016-02-24 17:22 GMT-05:00 Julian Hyde <[email protected]>:
> > >
> > > > By the way, interesting that you are interested in Cassandra and
> > > > materialized views. Cassandra announced materialized view support
> > > > recently[1] but they solved only half of a problem (not an
> > > > insignificant half, I hasten to add), namely materialized view
> > > > > maintenance. They don't transparently substitute them into the query -
> > > > > you have to reference the materialized view explicitly in your query
> > > > > - so in my book they've not delivered materialized view support. If
> > > > > you're planning to deliver REAL materialized view support to Cassandra
> > > > > that would be awesome.
> > > >
> > > > Julian
> > > >
> > > > [1]
> > > >
> > http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
> > > >
> > > >
> > > > On Wed, Feb 24, 2016 at 2:17 PM, Julian Hyde <[email protected]>
> wrote:
> > > > > > As is typical for complex pieces of code like this, the documentation
> > > > > > is in the code (and the unit test). It's probably not what you wanted
> > > > > > to hear, but the code mutates quite fast and so if we'd written a
> > > > > > design doc a few months ago it would be partially inaccurate.
> > > > >
> > > > > I, Maryann Xue and Amogh Margoor are the main authors of this code.
> > > > >
> > > > > > Suggest you find a relevant test case in MaterializationTest (or write
> > > > > > a new one) and run it with trace enabled and/or in a debugger. You
> > > > > > will see the process of matching an expression to a MV bottom up if
> > > > > > you watch each call to UnifyRule.unify.
> > > > >
> > > > > Julian
> > > > >
> > > > >
> > > > > On Wed, Feb 24, 2016 at 1:40 PM, Michael Mior <[email protected]>
> > > > wrote:
> > > > > >> Is there any documentation anywhere on how the current implementation
> > > > > >> of query rewriting for materialized views works? Mostly I'm referring
> > > > > >> to MaterializedViewSubstitutionVisitor. There's a lot of code to
> > > > > >> digest with not a lot of documentation and it would be helpful to
> > > > > >> have a reference to refer to. Thanks!
> > > > >>
> > > > >> Cheers,
> > > > >> --
> > > > >> Michael Mior
> > > > >> [email protected]
> > > >
> > >
> >
>
