Hi all,

The meeting yesterday was fun and productive. As discussed, this is the call to schedule our second meeting.
We encourage everyone to add their time preferences during 05/01 - 05/15 here:
https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing

Thanks,
Botong

On Wed, Apr 21, 2021 at 5:19 PM Botong Huang <[email protected]> wrote:

> Hi all,
> We've created a zoom meeting below for our meeting next Monday (9pm-10:30pm PST on 04/26).
> Talk to you all soon!
>
> Join Zoom Meeting
> https://uci.zoom.us/j/91279732686
>
> Meeting ID: 912 7973 2686
> One tap mobile
> +16699006833,,91279732686# US (San Jose)
> +12532158782,,91279732686# US (Tacoma)
>
> Dial by your location
> +1 669 900 6833 US (San Jose)
> +1 253 215 8782 US (Tacoma)
> +1 346 248 7799 US (Houston)
> +1 301 715 8592 US (Washington DC)
> +1 312 626 6799 US (Chicago)
> +1 646 558 8656 US (New York)
> Meeting ID: 912 7973 2686
> Find your local number: https://uci.zoom.us/u/aykHTkJBh
>
> Join by Skype for Business
> https://uci.zoom.us/skype/91279732686
>
> Thanks,
> Botong
>
> On Tue, Apr 13, 2021 at 10:16 PM Botong Huang <[email protected]> wrote:
>
>> Hi all,
>>
>> According to the preferences collected, we are tentatively scheduling our meeting at 9pm-10:30pm PST on 04/26 Monday.
>>
>> We will give a presentation about Tempura, followed by a free discussion.
>>
>> Please let us know if there are any other requests. A few days before the meeting, I will send out a zoom meeting link.
>>
>> Thanks,
>> Botong
>>
>> On Wed, Apr 7, 2021 at 2:46 PM Botong Huang <[email protected]> wrote:
>>
>>> Hi Julian and all,
>>>
>>> We've posted the Tempura code base below.
>>> Feel free to take a quick peek at the last five commits.
>>> https://github.com/alibaba/cost-based-incremental-optimizer/commits/main
>>>
>>> I've also opened a Jira (CALCITE-4568 <https://issues.apache.org/jira/browse/CALCITE-4568>), which will serve as the umbrella Jira for the feature.
>>>
>>> In the meantime, we encourage everyone to enter the time preferences for our first meeting here:
>>>
>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
>>>
>>> Thanks,
>>> Botong
>>>
>>> On Mon, Apr 5, 2021 at 3:59 PM Julian Hyde <[email protected]> wrote:
>>>
>>>> I have added my time preferences to the doc.
>>>>
>>>> Before we meet, could you publish a PR for us to review?
>>>>
>>>> Initial discussions will need to be about architecture and high-level design. So I would ask Calcite reviewers not to review the PR line-by-line (or to leave comments in GitHub) but try to understand the design holistically, and prepare questions/comments before the meeting.
>>>>
>>>> Botong, can you please create a Calcite JIRA case for this task? JIRA is how we track long-running tasks such as this.
>>>>
>>>> Julian
>>>>
>>>> > On Apr 3, 2021, at 5:15 PM, Botong Huang <[email protected]> wrote:
>>>> >
>>>> > Hi all,
>>>> >
>>>> > Apologies for the delay. It took us some time to clean up our code base and publicly release it (which will be out soon) for a quick peek.
>>>> >
>>>> > We are ready to present our work. Let's schedule a time for a Zoom meeting and discuss how to integrate Tempura into Calcite.
>>>> >
>>>> > Since some of our team members are in China, we prefer the time slot of 7:00pm-11:30pm PST any day. I've added our time preference in the shared doc below.
>>>> >
>>>> > https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
>>>> >
>>>> > We encourage everyone to add their time preferences (during 04/15-04/30) in this doc. In a week or so, we will try to settle a time that works for most.
>>>> >
>>>> > Thanks,
>>>> > Botong
>>>> >
>>>> > On Sat, Jan 30, 2021 at 9:19 PM Botong Huang <[email protected]> wrote:
>>>> >
>>>> >> Hi Julian and Rui,
>>>> >>
>>>> >> Sounds good to us. Please give us some time to prepare some slides for the meeting.
>>>> >>
>>>> >> I've created a doc below for discussion. Please feel free to add more in here:
>>>> >>
>>>> >> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
>>>> >>
>>>> >> Thanks,
>>>> >> Botong
>>>> >>
>>>> >> On Thu, Jan 28, 2021 at 11:18 AM Julian Hyde <[email protected]> wrote:
>>>> >>
>>>> >>> PS The “editable doc” that Rui refers to is also a good idea. I think we should create it to continue discussion after the first meeting.
>>>> >>>
>>>> >>> Julian
>>>> >>>
>>>> >>>> On Jan 28, 2021, at 11:16 AM, Julian Hyde <[email protected]> wrote:
>>>> >>>>
>>>> >>>> I think good next steps would be a PR and a meeting. The PR will allow us to read the code, but I think we should do the first round of questions at the meeting. The meeting could perhaps start with a presentation of the paper (do you have some slides you are planning to present at VLDB, Botong?) and then move on to questions about the concepts, which alternatives were considered, and how the concepts map onto other current and future concepts in calcite.
>>>> >>>>
>>>> >>>> I don’t think we should start “reviewing” the PR line-by-line at this point. We need to understand the high-level concepts and design choices.
>>>> >>>> If we start reviewing the PR we will get lost in the details.
>>>> >>>>
>>>> >>>> I know that integrating a major change is hard; I doubt that we will be able to integrate everything, but we can build understanding about where calcite needs to go, and I hope integrate a good amount of code to help us get there.
>>>> >>>>
>>>> >>>> As I said before, after the integration I would like people to be able to experiment with it and use it in their production systems. That way, it will not be an experiment that withers, but a feature set that integrates with other calcite features and gets stronger over time.
>>>> >>>>
>>>> >>>> Julian
>>>> >>>>
>>>> >>>>> On Jan 28, 2021, at 10:54 AM, Rui Wang <[email protected]> wrote:
>>>> >>>>>
>>>> >>>>> For me to participate in the discussion for the above questions, I will need to read a lot more to know relevant context and likely ask lots of questions :-). An editable doc is probably good for questions and back-and-forth discussion.
>>>> >>>>>
>>>> >>>>> -Rui
>>>> >>>>>
>>>> >>>>>> On Thu, Jan 28, 2021 at 10:50 AM Rui Wang <[email protected]> wrote:
>>>> >>>>>>
>>>> >>>>>> I am also happy to help push this work into Calcite (review code and doc, etc.).
>>>> >>>>>>
>>>> >>>>>> While you can share your code so people can have more idea how it is implemented, I think it would be also nice to have a doc to discuss the open questions above. Some points that I copy here:
>>>> >>>>>>
>>>> >>>>>> 1. Can this solution be compatible with existing solutions in Calcite streaming, materialized view maintenance, and multi-query optimization (Sigma and Delta relational operators, lattice, and Spool operator)?
>>>> >>>>>> 2. Did you find that you needed two separate cost models - one for “view maintenance” and another for “user queries” - since the objectives of each activity are so different?
>>>> >>>>>> 3. Whether this work will hasten the arrival of multi-objective parametric query optimization [1] in Calcite.
>>>> >>>>>> 4. Probably SQL shell support.
>>>> >>>>>>
>>>> >>>>>> [1]: https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
>>>> >>>>>>
>>>> >>>>>> -Rui
>>>> >>>>>>
>>>> >>>>>>> On Wed, Jan 27, 2021 at 6:52 PM Albert <[email protected]> wrote:
>>>> >>>>>>>
>>>> >>>>>>> it would be very nice to see a POC of your work.
>>>> >>>>>>>
>>>> >>>>>>>> On Thu, Jan 28, 2021 at 10:21 AM Botong Huang <[email protected]> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>> Hi Julian,
>>>> >>>>>>>>
>>>> >>>>>>>> Just wondering if there are any updates? We are wondering if it would help to post our code for a quick preview.
>>>> >>>>>>>>
>>>> >>>>>>>> Thanks,
>>>> >>>>>>>> Botong
>>>> >>>>>>>>
>>>> >>>>>>>> On Fri, Jan 1, 2021 at 11:04 AM Botong Huang <[email protected]> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>>> Hi Julian,
>>>> >>>>>>>>>
>>>> >>>>>>>>> Thanks for your interest! Sure, let's figure out a plan that best benefits the community. Here are some clarifications that hopefully answer your questions.
>>>> >>>>>>>>>
>>>> >>>>>>>>> In our work (Tempura), users specify the set of time points to consider running and a cost function that expresses users' preference over time, and Tempura generates the best incremental plan that minimizes the overall cost function.
>>>> >>>>>>>>>
>>>> >>>>>>>>> In this incremental plan, the sub-plans at different time points can be different from each other, as opposed to identical plans in all delta runs as in streaming or IVM. As mentioned in Section 2.1 of the Tempura paper, we can mimic the current streaming implementation by specifying two (logical) time points in Tempura, representing the initial run and later delta runs respectively. In general, note that Tempura supports various forms of incremental computing, not only the small-delta append-only data model in streaming systems. That's why we believe Tempura subsumes the current streaming support, as well as any IVM implementations.
>>>> >>>>>>>>>
>>>> >>>>>>>>> About the cost model, we did not come up with a separate cost model, but rather extended the existing one. Similar to multi-objective optimization, costs incurred at different time points are considered different dimensions. Tempura lets users supply a function that converts this cost vector into a final cost. So under this function, any two incremental plans are still comparable and there is an overall optimum. I guess we can go down the route of multi-objective parametric query optimization instead if there is a need.
>>>> >>>>>>>>>
>>>> >>>>>>>>> Next on materialized views and multi-query optimization, since our multi-time-point plan naturally involves materializing intermediate results for later time points, we need to solve the problem of choosing materializations and include the cost of saving and reusing the materializations when costing and comparing plans. We borrowed the multi-query optimization techniques to solve this problem even though we are looking at a single query. As a result, we think our work is orthogonal to Calcite's facilities around utilizing existing views, lattice etc. We do feel that the multi-query optimization component can be adopted to wider use, but probably need more suggestions from the community.
>>>> >>>>>>>>>
>>>> >>>>>>>>> Lastly, our current implementation is set up in Java code; it should be straightforward to hook it up with a SQL shell.
>>>> >>>>>>>>>
>>>> >>>>>>>>> Thanks,
>>>> >>>>>>>>> Botong
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Mon, Dec 28, 2020 at 6:44 PM Julian Hyde <[email protected]> wrote:
>>>> >>>>>>>>>
>>>> >>>>>>>>>> Botong,
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> This is very exciting; congratulations on this research, and thank you for contributing it back to Calcite.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> The research touches several areas in Calcite: streaming, materialized view maintenance, and multi-query optimization.
>>>> >>>>>>>>>> As we have already some solutions in those areas (Sigma and Delta relational operators, lattice, and Spool operator), it will be interesting to see whether we can make them compatible, or whether one concept can subsume others.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Your work differs from streaming queries in that your relations are used by “external” user queries, whereas in pure streaming queries, the only activity is the change propagation. Did you find that you needed two separate cost models - one for “view maintenance” and another for “user queries” - since the objectives of each activity are so different?
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> I wonder whether this work will hasten the arrival of multi-objective parametric query optimization [1] in Calcite.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> I will make time over the next few days to read and digest your paper. Then I expect that we will have a back-and-forth process to create something that will be useful for the broader community.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> One thing will be particularly useful: making this functionality available from a SQL shell, so that people can experiment with this functionality without writing Java code or setting up complex databases and metadata. I have in mind something like the simple DDL operations that are available in Calcite’s ‘server’ module. I wonder whether we could devise some kind of SQL syntax for a “multi-query”.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Julian
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> [1] https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>> On Dec 23, 2020, at 8:55 PM, Botong Huang <[email protected]> wrote:
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> Thanks Aron for pointing this out. To see the figure, please refer to Fig 3(a) in our paper:
>>>> >>>>>>>>>>> https://kai-zeng.github.io/papers/tempura-vldb2021.pdf
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> Best,
>>>> >>>>>>>>>>> Botong
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> On Wed, Dec 23, 2020 at 7:20 PM JiaTao Tao <[email protected]> wrote:
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>> Seems interesting. The picture cannot be seen in the mail; could you open a JIRA for this, so that people who are interested can subscribe to it?
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Regards!
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Aron Tao
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> On Thu, Dec 24, 2020 at 3:18 AM Botong Huang <[email protected]> wrote:
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Hi all,
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> This is a proposal to extend the Calcite optimizer into a general incremental query optimizer, based on our research paper published in VLDB 2021:
>>>> >>>>>>>>>>>>> Tempura: a general cost-based optimizer framework for incremental data processing
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> We also have a demo in SIGMOD 2020 illustrating how Alibaba’s data warehouse is planning to use this incremental query optimizer to alleviate cluster-wise resource skewness:
>>>> >>>>>>>>>>>>> Grosbeak: A Data Warehouse Supporting Resource-Aware Incremental Computing
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> To our best knowledge, this is the first general cost-based incremental optimizer that can find the best plan across multiple families of incremental computing methods, including IVM, Streaming, DBToaster, etc. Experiments (in the paper) show that the generated best plan is consistently much better than the plans from each individual method alone.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> In general, incremental query planning is central to database view maintenance and stream processing systems, and is being adopted in active databases, resumable query execution, approximate query processing, etc.
>>>> >>>>>>>>>>>>> We are hoping that this feature can help widen the spectrum of Calcite and solicit more use cases and adoption of Calcite.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Below is a brief description of the technical details. Please refer to the Tempura paper for more details. We are also working on a journal version of the paper with more implementation details.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Currently the query plan generated by Calcite is meant to be executed altogether at once. In the proposal, Calcite’s memo will be extended with temporal information so that it is capable of generating incremental plans that include multiple sub-plans to execute at different time points.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> The main idea is to view each table as one that changes over time (Time Varying Relations (TVR)). To achieve that we introduced TvrMetaSet into Calcite’s memo besides RelSet and RelSubset to track related RelSets of a changing table (e.g. snapshot of the table at a certain time, delta of the table between two time points, etc.).
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> [image: image.png]
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> For example in the above figure, each vertical line is a TvrMetaSet representing a TVR (S, R, S left outer join R, etc.). Horizontal lines represent time. Each black dot in the grid is a RelSet.
>>>> >>>>>>>>>>>>> Users can write TVR Rewrite Rules to describe valid transformations between these dots. For example, the blue lines are inter-TVR rules that describe how to compute a certain RelSet of a TVR from RelSets of other TVRs. The red lines are intra-TVR rules that describe transformations within a TVR. All TVR rewrite rules are logical rules. All existing Calcite rules still work in the new volcano system without modification.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> All changes in this feature will consist of four parts:
>>>> >>>>>>>>>>>>> 1. Memo extension with TvrMetaSet
>>>> >>>>>>>>>>>>> 2. Rule engine upgrade, capable of matching TvrMetaSet and RelNodes, as well as links in between the nodes.
>>>> >>>>>>>>>>>>> 3. A basic set of TvrRules, written using the upgraded rule engine API.
>>>> >>>>>>>>>>>>> 4. Multi-query optimization, used to find the best incremental plan involving multiple time points.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Note that this feature is an extension in nature and thus, when disabled, does not change any existing Calcite behavior.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Other than scenarios in the paper, we also applied this Calcite-extended incremental query optimizer to a type of periodic query called the “range query” in Alibaba’s data warehouse.
>>>> >>>>>>>>>>>>> It achieved cost savings of 80% on total CPU and memory consumption, and 60% on end-to-end execution time.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> All comments and suggestions are welcome. Thanks and happy holidays!
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Best,
>>>> >>>>>>>>>>>>> Botong
>>>> >>>>>>>
>>>> >>>>>>> --
>>>> >>>>>>> ~~~~~~~~~~~~~~~
>>>> >>>>>>> no mistakes
>>>> >>>>>>> ~~~~~~~~~~~~~~~~~~
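
For readers skimming the thread, the cost-model idea Botong describes (costs at different time points treated as separate dimensions, with a user-supplied function turning that cost vector into one final cost) can be illustrated with a small sketch. This is only an illustration under assumptions: the names `weighted_sum` and `best_plan`, the plan labels, and all the numbers are hypothetical and do not come from Tempura or Calcite.

```python
from typing import Callable, Dict, Sequence

# Hypothetical illustration only: none of these names come from Tempura or Calcite.
CostVector = Sequence[float]                  # one cost entry per time point
CostFunction = Callable[[CostVector], float]  # user-supplied: cost vector -> final cost


def weighted_sum(weights: Sequence[float]) -> CostFunction:
    """Build a user cost function that weights the cost at each time point."""
    def final_cost(costs: CostVector) -> float:
        if len(costs) != len(weights):
            raise ValueError("cost vector and weights must cover the same time points")
        return sum(w * c for w, c in zip(weights, costs))
    return final_cost


def best_plan(plans: Dict[str, CostVector], final_cost: CostFunction) -> str:
    """Return the name of the plan with the lowest scalar final cost."""
    return min(plans, key=lambda name: final_cost(plans[name]))


# Made-up cost vectors for two time points (an early run and a late run).
candidate_plans = {
    "all_at_end": [0.0, 100.0],         # no early work, full computation late
    "eager_incremental": [60.0, 30.0],  # most of the work in the early run
    "lazy_incremental": [20.0, 50.0],   # small early run, remainder later
}

early_is_expensive = weighted_sum([3.0, 1.0])  # early resources weigh 3x
late_is_expensive = weighted_sum([1.0, 3.0])   # late resources weigh 3x

chosen_when_early_is_expensive = best_plan(candidate_plans, early_is_expensive)
chosen_when_late_is_expensive = best_plan(candidate_plans, late_is_expensive)
print(chosen_when_early_is_expensive, chosen_when_late_is_expensive)
```

Because the user function collapses every cost vector to a single number, any two candidate plans stay comparable, which is the property the thread relies on; a different weighting simply changes which plan wins.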
