Hi all,

This is a reminder that we are going to have our second discussion meeting
tomorrow, 10pm-11pm PST. Please find the link below; everyone is welcome to
join!

Join Zoom Meeting
https://uci.zoom.us/j/91986206610

Meeting ID: 919 8620 6610
One tap mobile
+16699006833,,91986206610# US (San Jose)
+12532158782,,91986206610# US (Tacoma)

Dial by your location
        +1 669 900 6833 US (San Jose)
        +1 253 215 8782 US (Tacoma)
        +1 346 248 7799 US (Houston)
        +1 301 715 8592 US (Washington DC)
        +1 312 626 6799 US (Chicago)
        +1 646 558 8656 US (New York)
Meeting ID: 919 8620 6610
Find your local number: https://uci.zoom.us/u/acyXcc43Cd

Join by Skype for Business
https://uci.zoom.us/skype/91986206610

Thanks,
Botong

On Wed, May 5, 2021 at 9:55 AM Botong Huang <pku...@gmail.com> wrote:

> Hi Stamatis and all,
>
> Thanks for the interest! Let's tentatively schedule the next meeting for
> next Wednesday, May 12, 10pm-11pm PST then. Please let us know if any new
> needs come up.
>
> Best,
> Botong
>
> On Sun, May 2, 2021 at 2:59 PM Stamatis Zampetakis <zabe...@gmail.com>
> wrote:
>
>> Hello,
>>
>> I really regret missing the first meeting, sorry about that. I added my
>> preferences in the document.
>> I will make sure to attend the next one and help as much as I can.
>>
>> I didn't have the chance yet to go over the paper but will try to do it
>> before the next meeting.
>>
>> For me the following dates are more convenient than others so it would be
>> nice if we could arrange it then.
>>
>> Thu, May 6, 10pm PST
>> Tue, May 12, 10pm PST
>>
>> Best,
>> Stamatis
>>
>> On Sat, May 1, 2021 at 9:42 PM Julian Hyde <jh...@apache.org> wrote:
>>
>> > I have added my time preferences to the doc [1]. I am generally
>> > available any evening Mon - Thu. How about we meet Monday 10th May?
>> >
>> > Stamatis, Jesus, Given the complexity of this work, I would very much
>> > appreciate your insight, as experts in optimizer theory. Could one of
>> > you join the next meeting? Of course we should choose a time that
>> > works for everyone's schedule.
>> >
>> > Julian
>> >
>> > [1]
>> >
>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
>> >
>> > On Wed, Apr 28, 2021 at 9:32 AM Botong Huang <pku...@gmail.com> wrote:
>> > >
>> > > We didn't record it, but we will try to record the following meetings.
>> > > Please add your time preferences in the doc, so that we can find a
>> > > meeting time that works for more people.
>> > >
>> > > Thanks,
>> > > Botong
>> > >
>> > > On Wed, Apr 28, 2021 at 12:23 AM Viliam Durina <vil...@hazelcast.com>
>> > wrote:
>> > >
>> > > > Is there a recording available?
>> > > > Viliam
>> > > >
>> > > > On Wed, 28 Apr 2021 at 00:15, Botong Huang <pku...@gmail.com>
>> wrote:
>> > > >
>> > > > > Hi all,
>> > > > >
>> > > > > The meeting yesterday was fun and productive. As discussed, this is
>> > > > > the call to schedule our second meeting.
>> > > > >
>> > > > > We encourage everyone to add their time preferences during 05/01 -
>> > > > > 05/15 here:
>> > > > >
>> > > > >
>> > > >
>> >
>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
>> > > > >
>> > > > > Thanks,
>> > > > > Botong
>> > > > >
>> > > > > On Wed, Apr 21, 2021 at 5:19 PM Botong Huang <pku...@gmail.com>
>> > wrote:
>> > > > >
>> > > > > > Hi all,
>> > > > > > We've created a zoom meeting below for our meeting next Monday
>> > > > > > (9pm-10:30pm PST on 04/26).
>> > > > > > Talk to you all soon!
>> > > > > >
>> > > > > > Join Zoom Meeting
>> > > > > > https://uci.zoom.us/j/91279732686
>> > > > > > <
>> > > > >
>> > > >
>> >
>> https://www.google.com/url?q=https%3A%2F%2Fuci.zoom.us%2Fj%2F91279732686&sa=D&source=calendar&usd=2&usg=AOvVaw2C5LoOmCaSLWSi-YvMmsOE
>> > > > > >
>> > > > > >
>> > > > > > Meeting ID: 912 7973 2686
>> > > > > > One tap mobile
>> > > > > > +16699006833,,91279732686# US (San Jose)
>> > > > > > +12532158782,,91279732686# US (Tacoma)
>> > > > > >
>> > > > > > Dial by your location
>> > > > > > +1 669 900 6833 US (San Jose)
>> > > > > > +1 253 215 8782 US (Tacoma)
>> > > > > > +1 346 248 7799 US (Houston)
>> > > > > > +1 301 715 8592 US (Washington DC)
>> > > > > > +1 312 626 6799 US (Chicago)
>> > > > > > +1 646 558 8656 US (New York)
>> > > > > > Meeting ID: 912 7973 2686
>> > > > > > Find your local number: https://uci.zoom.us/u/aykHTkJBh
>> > > > > > <
>> > > > >
>> > > >
>> >
>> https://www.google.com/url?q=https%3A%2F%2Fuci.zoom.us%2Fu%2FaykHTkJBh&sa=D&source=calendar&usd=2&usg=AOvVaw0y_V5CisCHRyt9wsXLa9UM
>> > > > > >
>> > > > > >
>> > > > > > Join by Skype for Business
>> > > > > > https://uci.zoom.us/skype/91279732686
>> > > > > > <
>> > > > >
>> > > >
>> >
>> https://www.google.com/url?q=https%3A%2F%2Fuci.zoom.us%2Fskype%2F91279732686&sa=D&source=calendar&usd=2&usg=AOvVaw3iQwsDViu3K7-Rb_Iy6Zsy
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Botong
>> > > > > >
>> > > > > > On Tue, Apr 13, 2021 at 10:16 PM Botong Huang <pku...@gmail.com
>> >
>> > > > wrote:
>> > > > > >
>> > > > > >> Hi all,
>> > > > > >>
>> > > > > > >> According to the preferences collected, we are tentatively
>> > > > > > >> scheduling our meeting at 9pm-10:30pm PST on Monday, 04/26.
>> > > > > > >>
>> > > > > > >> We will give a presentation about Tempura, followed by a free
>> > > > > > >> discussion.
>> > > > > > >>
>> > > > > > >> Please let us know if there are any other requests. A few days
>> > > > > > >> before the meeting, I will send out a Zoom meeting link.
>> > > > > >>
>> > > > > >> Thanks,
>> > > > > >> Botong
>> > > > > >>
>> > > > > >> On Wed, Apr 7, 2021 at 2:46 PM Botong Huang <pku...@gmail.com>
>> > wrote:
>> > > > > >>
>> > > > > >>> Hi Julian and all,
>> > > > > >>>
>> > > > > >>> We've posted the Tempura code base below. Feel free to take a
>> > quick
>> > > > > peek
>> > > > > >>> at the last five commits.
>> > > > > >>>
>> > > > >
>> >
>> https://github.com/alibaba/cost-based-incremental-optimizer/commits/main
>> > > > > >>>
>> > > > > >>> I've also opened a Jira (CALCITE-4568
>> > > > > >>> <https://issues.apache.org/jira/browse/CALCITE-4568>), which
>> > will
>> > > > > serve
>> > > > > >>> as the umbrella Jira for the feature.
>> > > > > >>>
>> > > > > >>> In the meantime, we encourage everyone to enter the time
>> > preferences
>> > > > > for
>> > > > > >>> our first meeting here:
>> > > > > >>>
>> > > > > >>>
>> > > > >
>> > > >
>> >
>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
>> > > > > >>>
>> > > > > >>> Thanks,
>> > > > > >>> Botong
>> > > > > >>>
>> > > > > >>> On Mon, Apr 5, 2021 at 3:59 PM Julian Hyde <
>> > jhyde.apa...@gmail.com>
>> > > > > >>> wrote:
>> > > > > >>>
>> > > > > >>>> I have added my time preferences to the doc.
>> > > > > >>>>
>> > > > > >>>> Before we meet, could you publish a PR for us to review?
>> > > > > >>>>
>> > > > > >>>> Initial discussions will need to be about architecture and
>> > > > high-level
>> > > > > >>>> design. So I would ask Calcite reviewers not to review the PR
>> > > > > line-by-line
>> > > > > >>>> (or to leave comments in GitHub) but try to understand the
>> > design
>> > > > > >>>> holistically, and prepare questions/comments before the
>> meeting.
>> > > > > >>>>
>> > > > > > >>>> Botong, can you please create a Calcite JIRA case for this task?
>> > > > > > >>>> JIRA is how we track long-running tasks such as this.
>> > > > > >>>>
>> > > > > >>>> Julian
>> > > > > >>>>
>> > > > > >>>>
>> > > > > >>>> > On Apr 3, 2021, at 5:15 PM, Botong Huang <pku...@gmail.com
>> >
>> > > > wrote:
>> > > > > >>>> >
>> > > > > >>>> > Hi all,
>> > > > > >>>> >
>> > > > > > >>>> > Apologies for the delay. It took us some time to clean up our
>> > > > > > >>>> > code base and publicly release it (which will be out soon) for a
>> > > > > > >>>> > quick peek.
>> > > > > >>>> >
>> > > > > >>>> > We are ready to present our work. Let's schedule a time
>> for a
>> > Zoom
>> > > > > >>>> > meeting and discuss how to integrate Tempura into Calcite.
>> > > > > >>>> >
>> > > > > >>>> > Since some of our team members are in China, we prefer the
>> > time
>> > > > slot
>> > > > > >>>> of
>> > > > > >>>> > 7:00pm-11:30pm PST any day. I've added our time preference
>> in
>> > the
>> > > > > >>>> shared
>> > > > > >>>> > doc below.
>> > > > > >>>> >
>> > > > > >>>>
>> > > > >
>> > > >
>> >
>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
>> > > > > >>>> >
>> > > > > >>>> > We encourage everyone to add their time preferences (during
>> > > > > >>>> 04/15-04/30) in
>> > > > > >>>> > this doc. In a week or so, we will try to settle a time
>> that
>> > works
>> > > > > for
>> > > > > >>>> > most.
>> > > > > >>>> >
>> > > > > >>>> > Thanks,
>> > > > > >>>> > Botong
>> > > > > >>>> >
>> > > > > >>>> > On Sat, Jan 30, 2021 at 9:19 PM Botong Huang <
>> > pku...@gmail.com>
>> > > > > >>>> wrote:
>> > > > > >>>> >
>> > > > > >>>> >> Hi Julian and Rui,
>> > > > > >>>> >>
>> > > > > >>>> >> Sounds good to us. Please give us some time to prepare
>> some
>> > > > slides
>> > > > > >>>> for the
>> > > > > >>>> >> meeting.
>> > > > > >>>> >>
>> > > > > >>>> >> I've created a doc below for discussion. Please feel free
>> to
>> > add
>> > > > > >>>> more in
>> > > > > >>>> >> here:
>> > > > > >>>> >>
>> > > > > >>>> >>
>> > > > > >>>>
>> > > > >
>> > > >
>> >
>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
>> > > > > >>>> >>
>> > > > > >>>> >> Thanks,
>> > > > > >>>> >> Botong
>> > > > > >>>> >>
>> > > > > >>>> >> On Thu, Jan 28, 2021 at 11:18 AM Julian Hyde <
>> > > > > jhyde.apa...@gmail.com
>> > > > > >>>> >
>> > > > > >>>> >> wrote:
>> > > > > >>>> >>
>> > > > > >>>> >>> PS The “editable doc” that Rui refers to is also a good
>> > idea. I
>> > > > > >>>> think we
>> > > > > >>>> >>> should create it to continue discussion after the first
>> > meeting.
>> > > > > >>>> >>>
>> > > > > >>>> >>> Julian
>> > > > > >>>> >>>
>> > > > > >>>> >>>> On Jan 28, 2021, at 11:16 AM, Julian Hyde <
>> > > > > jhyde.apa...@gmail.com>
>> > > > > >>>> >>> wrote:
>> > > > > >>>> >>>>
>> > > > > >>>> >>>> I think good next steps would be a PR and a meeting.
>> The
>> > PR
>> > > > will
>> > > > > >>>> allow
>> > > > > >>>> >>> us to read the code, but I think we should do the first
>> > round of
>> > > > > >>>> questions
>> > > > > >>>> >>> at the meeting.  The meeting could perhaps start with a
>> > > > > >>>> presentation of the
>> > > > > >>>> >>> paper (do you have some slides you are planning to
>> present
>> > at
>> > > > > VLDB,
>> > > > > >>>> >>> Botong?) and then move on to questions about the
>> concepts,
>> > which
>> > > > > >>>> >>> alternatives were considered, and how the concepts map
>> onto
>> > > > other
>> > > > > >>>> current
>> > > > > >>>> >>> and future concepts in calcite.
>> > > > > >>>> >>>>
>> > > > > >>>> >>>> I don’t think we should start “reviewing” the PR
>> > line-by-line
>> > > > at
>> > > > > >>>> this
>> > > > > >>>> >>> point. We need to understand the high-level concepts and
>> > design
>> > > > > >>>> choices. If
>> > > > > >>>> >>> we start reviewing the PR we will get lost in the
>> details.
>> > > > > >>>> >>>>
>> > > > > >>>> >>>> I know that integrating a major change is hard; I doubt
>> > that we
>> > > > > >>>> will be
>> > > > > >>>> >>> able to integrate everything, but we can build
>> understanding
>> > > > about
>> > > > > >>>> where
>> > > > > >>>> >>> calcite needs to go, and I hope integrate a good amount
>> of
>> > code
>> > > > to
>> > > > > >>>> help us
>> > > > > >>>> >>> get there.
>> > > > > >>>> >>>>
>> > > > > > >>>> >>>> As I said before, after the integration I would like people to
>> > > > > > >>>> >>>> be able to experiment with it and use it in their production
>> > > > > > >>>> >>>> systems. That way, it will not be an experiment that withers, but
>> > > > > > >>>> >>>> a feature set that integrates with other Calcite features and
>> > > > > > >>>> >>>> gets stronger over time.
>> > > > > >>>> >>>>
>> > > > > >>>> >>>> Julian
>> > > > > >>>> >>>>
>> > > > > >>>> >>>>> On Jan 28, 2021, at 10:54 AM, Rui Wang <
>> > amaliu...@apache.org>
>> > > > > >>>> wrote:
>> > > > > >>>> >>>>>
>> > > > > > >>>> >>>>> For me to participate in the discussion of the above questions,
>> > > > > > >>>> >>>>> I will need to read a lot more to learn the relevant context and
>> > > > > > >>>> >>>>> will likely ask lots of questions :-). An editable doc is
>> > > > > > >>>> >>>>> probably good for questions and back-and-forth discussion.
>> > > > > >>>> >>>>>
>> > > > > >>>> >>>>>
>> > > > > >>>> >>>>> -Rui
>> > > > > >>>> >>>>>
>> > > > > >>>> >>>>>>> On Thu, Jan 28, 2021 at 10:50 AM Rui Wang <
>> > > > > amaliu...@apache.org
>> > > > > >>>> >
>> > > > > >>>> >>> wrote:
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>>>> I am also happy to help push this work into Calcite
>> > (review
>> > > > > code
>> > > > > >>>> and
>> > > > > >>>> >>> doc,
>> > > > > >>>> >>>>>> etc.).
>> > > > > >>>> >>>>>>
>> > > > > > >>>> >>>>>> While you can share your code so people can get a better idea
>> > > > > > >>>> >>>>>> of how it is implemented, I think it would also be nice to have
>> > > > > > >>>> >>>>>> a doc to discuss the open questions above. Some points that I
>> > > > > > >>>> >>>>>> am copying here:
>> > > > > >>>> >>>>>>
>> > > > > > >>>> >>>>>> 1. Can this solution be compatible with existing solutions in
>> > > > > > >>>> >>>>>> Calcite for streaming, materialized view maintenance, and
>> > > > > > >>>> >>>>>> multi-query optimization (Sigma and Delta relational operators,
>> > > > > > >>>> >>>>>> lattice, and Spool operator)?
>> > > > > > >>>> >>>>>> 2. Did you find that you needed two separate cost models - one
>> > > > > > >>>> >>>>>> for “view maintenance” and another for “user queries” - since
>> > > > > > >>>> >>>>>> the objectives of each activity are so different?
>> > > > > > >>>> >>>>>> 3. Whether this work will hasten the arrival of multi-objective
>> > > > > > >>>> >>>>>> parametric query optimization [1] in Calcite.
>> > > > > > >>>> >>>>>> 4. Probably SQL shell support.
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>>>> [1]:
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>
>> > > > > >>>>
>> > > > >
>> > > >
>> >
>> https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>>>> -Rui
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>>>>> On Wed, Jan 27, 2021 at 6:52 PM Albert <
>> > zinki...@gmail.com>
>> > > > > >>>> wrote:
>> > > > > >>>> >>>>>>>
>> > > > > >>>> >>>>>>> it would be very nice to see a POC of your work.
>> > > > > >>>> >>>>>>>
>> > > > > >>>> >>>>>>>
>> > > > > >>>> >>>>>>>> On Thu, Jan 28, 2021 at 10:21 AM Botong Huang <
>> > > > > >>>> pku...@gmail.com>
>> > > > > >>>> >>> wrote:
>> > > > > >>>> >>>>>>>
>> > > > > >>>> >>>>>>>> Hi Julian,
>> > > > > >>>> >>>>>>>>
>> > > > > >>>> >>>>>>>> Just wondering if there are any updates? We are
>> > wondering
>> > > > if
>> > > > > it
>> > > > > >>>> >>> would
>> > > > > >>>> >>>>>>> help
>> > > > > >>>> >>>>>>>> to post our code for a quick preview.
>> > > > > >>>> >>>>>>>>
>> > > > > >>>> >>>>>>>> Thanks,
>> > > > > >>>> >>>>>>>> Botong
>> > > > > >>>> >>>>>>>>
>> > > > > >>>> >>>>>>>> On Fri, Jan 1, 2021 at 11:04 AM Botong Huang <
>> > > > > pku...@gmail.com
>> > > > > >>>> >
>> > > > > >>>> >>> wrote:
>> > > > > >>>> >>>>>>>>
>> > > > > >>>> >>>>>>>>> Hi Julian,
>> > > > > >>>> >>>>>>>>>
>> > > > > >>>> >>>>>>>>> Thanks for your interest! Sure let's figure out a
>> plan
>> > > > that
>> > > > > >>>> best
>> > > > > >>>> >>>>>>> benefits
>> > > > > >>>> >>>>>>>>> the community. Here are some clarifications that
>> > hopefully
>> > > > > >>>> answer
>> > > > > >>>> >>> your
>> > > > > >>>> >>>>>>>>> questions.
>> > > > > >>>> >>>>>>>>>
>> > > > > > >>>> >>>>>>>>> In our work (Tempura), users specify the set of time points to
>> > > > > > >>>> >>>>>>>>> consider running and a cost function that expresses their
>> > > > > > >>>> >>>>>>>>> preference over time. Tempura will generate the best
>> > > > > > >>>> >>>>>>>>> incremental plan that minimizes the overall cost function.
>> > > > > >>>> >>>>>>>>>
>> > > > > > >>>> >>>>>>>>> In this incremental plan, the sub-plans at different time
>> > > > > > >>>> >>>>>>>>> points can be different from each other, as opposed to
>> > > > > > >>>> >>>>>>>>> identical plans in all delta runs as in streaming or IVM. As
>> > > > > > >>>> >>>>>>>>> mentioned in Section 2.1 of the Tempura paper, we can mimic the
>> > > > > > >>>> >>>>>>>>> current streaming implementation by specifying two (logical)
>> > > > > > >>>> >>>>>>>>> time points in Tempura, representing the initial run and later
>> > > > > > >>>> >>>>>>>>> delta runs respectively. In general, note that Tempura supports
>> > > > > > >>>> >>>>>>>>> various forms of incremental computing, not only the
>> > > > > > >>>> >>>>>>>>> small-delta append-only data model in streaming systems. That's
>> > > > > > >>>> >>>>>>>>> why we believe Tempura subsumes the current streaming support,
>> > > > > > >>>> >>>>>>>>> as well as any IVM implementations.
>> > > > > >>>> >>>>>>>>>
>> > > > > > >>>> >>>>>>>>> About the cost model, we did not come up with a separate cost
>> > > > > > >>>> >>>>>>>>> model, but rather extended the existing one. Similar to
>> > > > > > >>>> >>>>>>>>> multi-objective optimization, costs incurred at different time
>> > > > > > >>>> >>>>>>>>> points are considered different dimensions. Tempura lets users
>> > > > > > >>>> >>>>>>>>> supply a function that converts this cost vector into a final
>> > > > > > >>>> >>>>>>>>> cost. So under this function, any two incremental plans are
>> > > > > > >>>> >>>>>>>>> still comparable and there is an overall optimum. I guess we
>> > > > > > >>>> >>>>>>>>> can go down the route of multi-objective parametric query
>> > > > > > >>>> >>>>>>>>> optimization instead if there is a need.
>> > > > > >>>> >>>>>>>>>
>> > > > > > >>>> >>>>>>>>> Next, on materialized views and multi-query optimization: since
>> > > > > > >>>> >>>>>>>>> our multi-time-point plan naturally involves materializing
>> > > > > > >>>> >>>>>>>>> intermediate results for later time points, we need to solve
>> > > > > > >>>> >>>>>>>>> the problem of choosing materializations and include the cost
>> > > > > > >>>> >>>>>>>>> of saving and reusing the materializations when costing and
>> > > > > > >>>> >>>>>>>>> comparing plans. We borrowed the multi-query optimization
>> > > > > > >>>> >>>>>>>>> techniques to solve this problem even though we are looking at
>> > > > > > >>>> >>>>>>>>> a single query. As a result, we think our work is orthogonal to
>> > > > > > >>>> >>>>>>>>> Calcite's facilities around utilizing existing views, lattice,
>> > > > > > >>>> >>>>>>>>> etc. We do feel that the multi-query optimization component can
>> > > > > > >>>> >>>>>>>>> be adapted for wider use, but we probably need more suggestions
>> > > > > > >>>> >>>>>>>>> from the community.
>> > > > > >>>> >>>>>>>>>
>> > > > > > >>>> >>>>>>>>> Lastly, our current implementation is set up in Java code, so
>> > > > > > >>>> >>>>>>>>> it should be straightforward to hook it up with a SQL shell.
>> > > > > >>>> >>>>>>>>>
>> > > > > >>>> >>>>>>>>> Thanks,
>> > > > > >>>> >>>>>>>>> Botong
>> > > > > >>>> >>>>>>>>>
>> > > > > >>>> >>>>>>>>> On Mon, Dec 28, 2020 at 6:44 PM Julian Hyde <
>> > > > > >>>> >>> jhyde.apa...@gmail.com>
>> > > > > >>>> >>>>>>>>> wrote:
>> > > > > >>>> >>>>>>>>>
>> > > > > >>>> >>>>>>>>>> Botong,
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>> This is very exciting; congratulations on this
>> > research,
>> > > > > and
>> > > > > >>>> thank
>> > > > > >>>> >>>>>>> you
>> > > > > >>>> >>>>>>>>>> for contributing it back to Calcite.
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>> The research touches several areas in Calcite:
>> > streaming,
>> > > > > >>>> >>>>>>> materialized
>> > > > > >>>> >>>>>>>>>> view maintenance, and multi-query optimization.
>> As we
>> > > > have
>> > > > > >>>> already
>> > > > > >>>> >>>>>>> some
>> > > > > >>>> >>>>>>>>>> solutions in those areas (Sigma and Delta
>> relational
>> > > > > >>>> operators,
>> > > > > >>>> >>>>>>> lattice,
>> > > > > >>>> >>>>>>>>>> and Spool operator), it will be interesting to see
>> > > > whether
>> > > > > >>>> we can
>> > > > > >>>> >>>>>>> make
>> > > > > >>>> >>>>>>>> them
>> > > > > >>>> >>>>>>>>>> compatible, or whether one concept can subsume
>> > others.
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>> Your work differs from streaming queries in that
>> your
>> > > > > >>>> relations
>> > > > > >>>> >>> are
>> > > > > >>>> >>>>>>> used
>> > > > > >>>> >>>>>>>>>> by “external” user queries, whereas in pure
>> streaming
>> > > > > >>>> queries, the
>> > > > > >>>> >>>>>>> only
>> > > > > >>>> >>>>>>>>>> activity is the change propagation. Did you find
>> > that you
>> > > > > >>>> needed
>> > > > > >>>> >>> two
>> > > > > >>>> >>>>>>>>>> separate cost models - one for “view maintenance”
>> and
>> > > > > >>>> another for
>> > > > > >>>> >>>>>>> “user
>> > > > > >>>> >>>>>>>>>> queries” - since the objectives of each activity
>> are
>> > so
>> > > > > >>>> different?
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>> I wonder whether this work will hasten the
>> arrival of
>> > > > > >>>> >>> multi-objective
>> > > > > >>>> >>>>>>>>>> parametric query optimization [1] in Calcite.
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>> I will make time over the next few days to read
>> and
>> > > > digest
>> > > > > >>>> your
>> > > > > >>>> >>>>>>> paper.
>> > > > > >>>> >>>>>>>>>> Then I expect that we will have a back-and-forth
>> > process
>> > > > to
>> > > > > >>>> create
>> > > > > >>>> >>>>>>>>>> something that will be useful for the broader
>> > community.
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>> One thing will be particularly useful: making this
>> > > > > >>>> functionality
>> > > > > >>>> >>>>>>>>>> available from a SQL shell, so that people can
>> > experiment
>> > > > > >>>> with
>> > > > > >>>> >>> this
>> > > > > >>>> >>>>>>>>>> functionality without writing Java code or
>> setting up
>> > > > > complex
>> > > > > >>>> >>>>>>> databases
>> > > > > >>>> >>>>>>>> and
>> > > > > >>>> >>>>>>>>>> metadata. I have in mind something like the simple
>> > DDL
>> > > > > >>>> operations
>> > > > > >>>> >>>>>>> that
>> > > > > >>>> >>>>>>>> are
>> > > > > >>>> >>>>>>>>>> available in Calcite’s ’server’ module. I wonder
>> > whether
>> > > > we
>> > > > > >>>> could
>> > > > > >>>> >>>>>>> devise
>> > > > > >>>> >>>>>>>>>> some kind of SQL syntax for a “multi-query”.
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>> Julian
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>> [1]
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>
>> > > > > >>>> >>>>>>>
>> > > > > >>>> >>>
>> > > > > >>>>
>> > > > >
>> > > >
>> >
>> https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>> On Dec 23, 2020, at 8:55 PM, Botong Huang <
>> > > > > pku...@gmail.com
>> > > > > >>>> >
>> > > > > >>>> >>>>>>> wrote:
>> > > > > >>>> >>>>>>>>>>>
>> > > > > > >>>> >>>>>>>>>>> Thanks Aron for pointing this out. To see the figure, please
>> > > > > > >>>> >>>>>>>>>>> refer to Fig 3(a) in our paper:
>> > > > > > >>>> >>>>>>>>>>> https://kai-zeng.github.io/papers/tempura-vldb2021.pdf
>> > > > > >>>> >>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>> Best,
>> > > > > >>>> >>>>>>>>>>> Botong
>> > > > > >>>> >>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>> On Wed, Dec 23, 2020 at 7:20 PM JiaTao Tao <
>> > > > > >>>> taojia...@gmail.com>
>> > > > > >>>> >>>>>>>> wrote:
>> > > > > >>>> >>>>>>>>>>>
>> > > > > > >>>> >>>>>>>>>>>> Seems interesting. The picture cannot be seen in the mail.
>> > > > > > >>>> >>>>>>>>>>>> Could you open a JIRA for this, so that people who are
>> > > > > > >>>> >>>>>>>>>>>> interested can subscribe to it?
>> > > > > >>>> >>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>> Regards!
>> > > > > >>>> >>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>> Aron Tao
>> > > > > >>>> >>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>
>> > > > > > >>>> >>>>>>>>>>>> On Thu, Dec 24, 2020 at 3:18 AM Botong Huang <bot...@apache.org> wrote:
>> > > > > >>>> >>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> Hi all,
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> This is a proposal to extend the Calcite
>> optimizer
>> > > > into
>> > > > > a
>> > > > > >>>> >>> general
>> > > > > >>>> >>>>>>>>>>>>> incremental query optimizer, based on our
>> research
>> > > > paper
>> > > > > >>>> >>>>>>> published
>> > > > > >>>> >>>>>>>> in
>> > > > > >>>> >>>>>>>>>>>> VLDB
>> > > > > >>>> >>>>>>>>>>>>> 2021:
>> > > > > >>>> >>>>>>>>>>>>> Tempura: a general cost-based optimizer
>> framework
>> > for
>> > > > > >>>> >>> incremental
>> > > > > >>>> >>>>>>>> data
>> > > > > >>>> >>>>>>>>>>>>> processing
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> We also have a demo in SIGMOD 2020 illustrating
>> > how
>> > > > > >>>> Alibaba’s
>> > > > > >>>> >>>>>>> data
>> > > > > >>>> >>>>>>>>>>>>> warehouse is planning to use this incremental
>> > query
>> > > > > >>>> optimizer
>> > > > > >>>> >>> to
>> > > > > >>>> >>>>>>>>>>>> alleviate
>> > > > > >>>> >>>>>>>>>>>>> cluster-wise resource skewness:
>> > > > > >>>> >>>>>>>>>>>>> Grosbeak: A Data Warehouse Supporting
>> > Resource-Aware
>> > > > > >>>> >>> Incremental
>> > > > > >>>> >>>>>>>>>>>> Computing
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > > >>>> >>>>>>>>>>>>> To the best of our knowledge, this is the first general
>> > > > > > >>>> >>>>>>>>>>>>> cost-based incremental optimizer that can find the best
>> > > > > > >>>> >>>>>>>>>>>>> plan across multiple families of incremental computing
>> > > > > > >>>> >>>>>>>>>>>>> methods, including IVM, Streaming, DBToaster, etc.
>> > > > > > >>>> >>>>>>>>>>>>> Experiments (in the paper) show that the generated best
>> > > > > > >>>> >>>>>>>>>>>>> plan is consistently much better than the plans from each
>> > > > > > >>>> >>>>>>>>>>>>> individual method alone.
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > > >>>> >>>>>>>>>>>>> In general, incremental query planning is central to
>> > > > > > >>>> >>>>>>>>>>>>> database view maintenance and stream processing systems,
>> > > > > > >>>> >>>>>>>>>>>>> and is being adopted in active databases, resumable query
>> > > > > > >>>> >>>>>>>>>>>>> execution, approximate query processing, etc. We are hoping
>> > > > > > >>>> >>>>>>>>>>>>> that this feature can help widen the spectrum of Calcite
>> > > > > > >>>> >>>>>>>>>>>>> and solicit more use cases and adoption of Calcite.
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> Below is a brief description of the technical
>> > details.
>> > > > > >>>> Please
>> > > > > >>>> >>>>>>> refer
>> > > > > >>>> >>>>>>>> to
>> > > > > >>>> >>>>>>>>>>>> the
>> > > > > >>>> >>>>>>>>>>>>> Tempura paper for more details. We are also
>> > working
>> > > > on a
>> > > > > >>>> >>> journal
>> > > > > >>>> >>>>>>>>>> version
>> > > > > >>>> >>>>>>>>>>>> of
>> > > > > >>>> >>>>>>>>>>>>> the paper with more implementation details.
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> Currently the query plan generated by Calcite
>> is
>> > meant
>> > > > > to
>> > > > > >>>> be
>> > > > > >>>> >>>>>>>> executed
>> > > > > >>>> >>>>>>>>>>>>> altogether at once. In the proposal, Calcite’s
>> > memo
>> > > > will
>> > > > > >>>> be
>> > > > > >>>> >>>>>>> extended
>> > > > > >>>> >>>>>>>>>> with
>> > > > > >>>> >>>>>>>>>>>>> temporal information so that it is capable of
>> > > > generating
>> > > > > >>>> >>>>>>> incremental
>> > > > > >>>> >>>>>>>>>>>> plans
>> > > > > >>>> >>>>>>>>>>>>> that include multiple sub-plans to execute at
>> > > > different
>> > > > > >>>> time
>> > > > > >>>> >>>>>>> points.
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> The main idea is to view each table as one that
>> > > > changes
>> > > > > >>>> over
>> > > > > >>>> >>> time
>> > > > > >>>> >>>>>>>>>> (Time
>> > > > > >>>> >>>>>>>>>>>>> Varying Relations (TVR)). To achieve that we
>> > > > introduced
>> > > > > >>>> >>>>>>> TvrMetaSet
>> > > > > >>>> >>>>>>>>>> into
>> > > > > >>>> >>>>>>>>>>>>> Calcite’s memo besides RelSet and RelSubset to
>> > track
>> > > > > >>>> related
>> > > > > >>>> >>>>>>> RelSets
>> > > > > >>>> >>>>>>>>>> of a
>> > > > > >>>> >>>>>>>>>>>>> changing table (e.g. snapshot of the table at
>> > certain
>> > > > > >>>> time,
>> > > > > >>>> >>>>>>> delta of
>> > > > > >>>> >>>>>>>>>> the
>> > > > > >>>> >>>>>>>>>>>>> table between two time points, etc.).
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> [image: image.png]
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > > >>>> >>>>>>>>>>>>> For example, in the above figure, each vertical line is a
>> > > > > > >>>> >>>>>>>>>>>>> TvrMetaSet representing a TVR (S, R, S left outer join R,
>> > > > > > >>>> >>>>>>>>>>>>> etc.). Horizontal lines represent time. Each black dot in
>> > > > > > >>>> >>>>>>>>>>>>> the grid is a RelSet. Users can write TVR Rewrite Rules to
>> > > > > > >>>> >>>>>>>>>>>>> describe valid transformations between these dots. For
>> > > > > > >>>> >>>>>>>>>>>>> example, the blue lines are inter-TVR rules that describe
>> > > > > > >>>> >>>>>>>>>>>>> how to compute a certain RelSet of a TVR from RelSets of
>> > > > > > >>>> >>>>>>>>>>>>> other TVRs. The red lines are intra-TVR rules that describe
>> > > > > > >>>> >>>>>>>>>>>>> transformations within a TVR. All TVR rewrite rules are
>> > > > > > >>>> >>>>>>>>>>>>> logical rules. All existing Calcite rules still work in the
>> > > > > > >>>> >>>>>>>>>>>>> new volcano system without modification.
>> > > > > >>>> >>>>>>>>>>>>>
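[Editor's illustration] The two kinds of rewrite rules can be illustrated on plain row sets. The sketch below is not the proposal's rule engine: it assumes insert-only tables with set semantics, and it uses an inner join rather than the left outer join mentioned above because the delta formula is simpler. All function names are hypothetical.

```python
def join(left, right):
    """Inner join of two sets of (key, value) rows on the key."""
    return {(k, a, b) for (k, a) in left for (k2, b) in right if k == k2}


def intra_tvr_snapshot(prev_snapshot, delta):
    """Intra-TVR example: a later snapshot of one TVR is its earlier
    snapshot merged with the delta in between (insert-only assumption)."""
    return prev_snapshot | delta


def inter_tvr_join_delta(s_old, s_delta, r_old, r_delta):
    """Inter-TVR example: the delta of the TVR `S join R` computed from the
    snapshots and deltas of the TVRs S and R (insert-only assumption)."""
    s_new = intra_tvr_snapshot(s_old, s_delta)
    return join(s_delta, r_old) | join(s_new, r_delta)


# Tiny made-up data set: state at time 1 plus the changes up to time 2.
s_old, s_delta = {(1, "a")}, {(2, "b")}
r_old, r_delta = {(1, "x")}, {(2, "y"), (1, "z")}

join_old = join(s_old, r_old)
join_delta = inter_tvr_join_delta(s_old, s_delta, r_old, r_delta)
join_new = intra_tvr_snapshot(join_old, join_delta)
```

Here `join_new` equals the join recomputed from scratch over the time-2 snapshots, which is the property an incremental plan relies on: the work at time 2 touches only the deltas plus the snapshots they must be matched against.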
>> > > > > >>>> >>>>>>>>>>>>> The changes for this feature consist of four parts:
>> > > > > >>>> >>>>>>>>>>>>> 1. Memo extension with TvrMetaSet.
>> > > > > >>>> >>>>>>>>>>>>> 2. Rule engine upgrade, capable of matching TvrMetaSets
>> > > > > >>>> >>>>>>>>>>>>> and RelNodes, as well as the links between the nodes.
>> > > > > >>>> >>>>>>>>>>>>> 3. A basic set of TvrRules, written using the upgraded
>> > > > > >>>> >>>>>>>>>>>>> rule engine API.
>> > > > > >>>> >>>>>>>>>>>>> 4. Multi-query optimization, used to find the best
>> > > > > >>>> >>>>>>>>>>>>> incremental plan involving multiple time points.
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> Note that this feature is an extension in nature and
>> > > > > >>>> >>>>>>>>>>>>> thus, when disabled, does not change any existing
>> > > > > >>>> >>>>>>>>>>>>> Calcite behavior.
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> Beyond the scenarios in the paper, we also applied this
>> > > > > >>>> >>>>>>>>>>>>> Calcite-extended incremental query optimizer to a type
>> > > > > >>>> >>>>>>>>>>>>> of periodic query called the “range query” in Alibaba’s
>> > > > > >>>> >>>>>>>>>>>>> data warehouse. It achieved cost savings of 80% on total
>> > > > > >>>> >>>>>>>>>>>>> CPU and memory consumption, and 60% on end-to-end
>> > > > > >>>> >>>>>>>>>>>>> execution time.
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> All comments and suggestions are welcome. Thanks and
>> > > > > >>>> >>>>>>>>>>>>> happy holidays!
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>> Best,
>> > > > > >>>> >>>>>>>>>>>>> Botong
>> > > > > >>>> >>>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>>>
>> > > > > >>>> >>>>>>>>
>> > > > > >>>> >>>>>>>
>> > > > > >>>> >>>>>>>
>> > > > > >>>> >>>>>>>
>> > > > > >>>> >>>>>>
>> > > > > >>>> >>>
>> > > > > >>>> >>
>> > > > > >>>>
>> > > > > >>>>
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > Viliam Durina
>> > > > Jet Developer
>> > > >       hazelcast®
>> > > >
>> >
>>
>
