Re: Materialized view rewriting

Julian Hyde Fri, 26 Feb 2016 10:14:34 -0800

Rajat,

I’ll second what Maryann says. Please chime in on
https://issues.apache.org/jira/browse/CALCITE-1101. If it doesn’t make things
easier for Quark we probably shouldn’t be doing it.


Julian

> On Feb 26, 2016, at 8:39 AM, Maryann Xue <maryann....@gmail.com> wrote:
> 
> Thank you for pointing out another way of defining materializations, Rajat!
> 
> We had some discussion again about this interface, and Julian opened a JIRA
> https://issues.apache.org/jira/browse/CALCITE-1101.
> 
> The main problems are:
> 
> 1. The life cycle of the materializations. In Phoenix (we don't know about
> other projects, so more comments are welcome), materializations are used to
> model secondary indices, which are in fact another type of Phoenix table.
> So materializations for Phoenix should have exactly the same life cycle as
> their enclosing PhoenixSchema, which is a snapshot of all current table
> definitions as of the timestamp for a specific JDBC statement.
> 
> 2. Right now, to call the defineMaterialization method, we need to go to a
> whole lot of trouble to get the CalciteSchema object, which in fact should
> be internal to Calcite code.
> 
> 3. The right point of time for defining/creating those materialized views.
> Whether for Quark or Phoenix, we need to make sure that we call
> defineMaterialization at the exact right point of time, which is after the
> schema is loaded and before the planner tries to collect and use them.
> Again this had better be something taken care of by Calcite instead of
> carefully maintained by the users.
> 
> 
> Let's follow up on that JIRA though.
> 
> 
> 
> Thanks,
> Maryann
> 
> On Fri, Feb 26, 2016 at 5:50 AM, Rajat Venkatesh <rvenkat...@qubole.com>
> wrote:
> 
>> In Quark, we don't use hooks to define materializations. We use a table
>> factory [1] to defer until the schema is loaded [2].
>> 
>> 1. https://github.com/qubole/quark/blob/master/optimizer/src/main/java/com/qubole/quark/planner/MetadataSchema.java#L85
>> 2. https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/materialize/MaterializationService.java#L132
>> 
>> Hopefully, I've understood your problem correctly.
>> 
>> 
>> On Thu, Feb 25, 2016 at 10:16 PM Maryann Xue <maryann....@gmail.com>
>> wrote:
>> 
>>> Hi Michael,
>>> 
>>> We had a little difficulty defining our secondary indices as materialized
>>> views with our Schema SPI implementation too, and this made the code
>>> pretty hacky. In order to call that defineMaterialization method, we hold
>>> the parent SchemaPlus object within our own Schema impl object so that we
>>> can get its own corresponding SchemaPlus object by calling
>>> "parentSchema.getSubSchema(this.name)" later on. We do this after the
>>> schema/table resolving phase (that is, when the entire schema tree, incl.
>>> your own schema objects, has been initiated) and call
>>> defineMaterialization for each secondary index under each subSchema. We
>>> add a hook in "Hook.TRIMMED" for this, which sounds pretty weird, but this
>>> is exactly a point after you have the whole schema tree ready and before
>>> the materializations are asked for by the planner.
>>> 
>>> Anyway, I do hope the interface can be modified to avoid all this trouble.
>>> 
>>> 
>>> Thanks,
>>> Maryann
>>> 
>>> On Thu, Feb 25, 2016 at 9:24 AM, Michael Mior <mm...@uwaterloo.ca>
>> wrote:
>>> 
>>>> Any suggestions on the best place to hook in and add the materialized
>>>> views? It seems like doing so requires the SchemaPlus object
>>>> corresponding to the current schema. The current best approach I see is
>>>> to save a reference to the parent schema and then pull out the
>>>> appropriate SchemaPlus object in getTableMap. This seems like a bit of a
>>>> hack though.
>>>> 
>>>> --
>>>> Michael Mior
>>>> mm...@uwaterloo.ca
>>>> 
>>>> 2016-02-24 17:22 GMT-05:00 Julian Hyde <jh...@apache.org>:
>>>> 
>>>>> By the way, interesting that you are interested in Cassandra and
>>>>> materialized views. Cassandra announced materialized view support
>>>>> recently[1] but they solved only half of a problem (not an
>>>>> insignificant half, I hasten to add), namely materialized view
>>>>> maintenance. They don't transparently substitute them into the query -
>>>>> you have to reference the materialized view explicitly in your query
>>>>> - so in my book they've not delivered materialized view support. If
>>>>> you're planning to deliver REAL materialized view support to Cassandra
>>>>> that would be awesome.
>>>>> 
>>>>> Julian
>>>>> 
>>>>> [1] http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
>>>>> 
>>>>> 
>>>>> On Wed, Feb 24, 2016 at 2:17 PM, Julian Hyde <jh...@apache.org>
>> wrote:
>>>>>> As is typical for complex pieces of code like this, the documentation
>>>>>> is in the code (and the unit test). It's probably not what you wanted
>>>>>> to hear, but the code mutates quite fast and so if we'd written a
>>>>>> design doc a few months ago it would be partially inaccurate.
>>>>>> 
>>>>>> I, Maryann Xue and Amogh Margoor are the main authors of this code.
>>>>>> 
>>>>>> Suggest you find a relevant test case in MaterializationTest (or write
>>>>>> a new one) and run it with trace enabled and/or in a debugger. You
>>>>>> will see the process of matching an expression to a MV bottom up if
>>>>>> you watch each call to UnifyRule.unify.
>>>>>> 
>>>>>> Julian
>>>>>> 
>>>>>> 
>>>>>> On Wed, Feb 24, 2016 at 1:40 PM, Michael Mior <mm...@uwaterloo.ca>
>>>>> wrote:
>>>>>>> Is there any documentation anywhere on how the current implementation
>>>>>>> of query rewriting for materialized views works? Mostly I'm referring
>>>>>>> to MaterializedViewSubstitutionVisitor. There's a lot of code to digest
>>>>>>> with not a lot of documentation and it would be helpful to have a
>>>>>>> reference to refer to. Thanks!
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> --
>>>>>>> Michael Mior
>>>>>>> mm...@uwaterloo.ca
>>>>> 
>>>> 
>>> 
>>
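
The timing constraint Maryann describes in the thread above (define
materializations after the schema is loaded, but before the planner collects
them) can be illustrated with a small toy model. This is only a sketch under
stated assumptions: ToySchema, whenLoaded, finishLoading and
materializationsForPlanner are hypothetical names invented for illustration,
and none of them are Calcite, Phoenix or Quark APIs. The toy
defineMaterialization merely shares its name with the method discussed in the
thread; it does not reproduce the real signature.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Toy model only: no type here is a Calcite, Phoenix or Quark class. */
final class DeferredMaterializationDemo {

  /** Hypothetical schema that owns its materializations and their life cycle. */
  static final class ToySchema {
    private final String name;
    private final List<Consumer<ToySchema>> onLoaded = new ArrayList<>();
    private final List<String> materializations = new ArrayList<>();
    private boolean loaded;

    ToySchema(String name) {
      this.name = name;
    }

    /** Defers work until the schema is loaded; runs it at once if it already is. */
    void whenLoaded(Consumer<ToySchema> callback) {
      if (loaded) {
        callback.accept(this);
      } else {
        onLoaded.add(callback);
      }
    }

    /** Rejects calls made too early, which is the mistake the thread worries about. */
    void defineMaterialization(String viewName) {
      if (!loaded) {
        throw new IllegalStateException("schema " + name + " is not loaded yet");
      }
      materializations.add(name + "." + viewName);
    }

    /** Marks the schema loaded, then runs deferred callbacks in registration order. */
    void finishLoading() {
      loaded = true;
      List<Consumer<ToySchema>> pending = new ArrayList<>(onLoaded);
      onLoaded.clear();
      for (Consumer<ToySchema> callback : pending) {
        callback.accept(this);
      }
    }

    /** What a planner would call; loading is forced first, so ordering is guaranteed. */
    List<String> materializationsForPlanner() {
      if (!loaded) {
        finishLoading();
      }
      return Collections.unmodifiableList(new ArrayList<>(materializations));
    }
  }

  public static void main(String[] args) {
    ToySchema schema = new ToySchema("phoenix");
    // The user only registers intent; the framework picks the right moment.
    schema.whenLoaded(s -> s.defineMaterialization("idx_by_name"));
    schema.whenLoaded(s -> s.defineMaterialization("idx_by_date"));
    System.out.println(schema.materializationsForPlanner());
  }
}
```

In this sketch the materializations live inside the schema object, so they
share its life cycle, and the caller never needs a handle on an internal
framework object. Whether CALCITE-1101 ends up with a similar shape is not
something this thread settles.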
