Re: Will the number of traits in the traitSet affect the time of Volcano optimization?

Hequn Cheng Mon, 21 Jan 2019 01:48:19 -0800

Hi Albert,

Thank you for the input! It seems `Context` is not what I want. The
`Context` is used to store data within the planner session and access it
within rules. My requirements are:
- I don't need the configuration during optimization. The configurations
are set by users through the API but are only used after optimization.
- A job may contain multiple aggregates, each with a different
configuration, so unwrapping by class name does not help.
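
To make the second point concrete, here is a minimal sketch of a lookup
keyed by class. `MiniContext` and `AggResourceConfig` are hypothetical
stand-ins written for this example, not Calcite classes; the sketch only
assumes that unwrapping returns a single object for a given class, so two
aggregates with different settings cannot be told apart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContextUnwrapSketch {
    /** Hypothetical per-aggregate setting (assumption, not a Calcite class). */
    public static final class AggResourceConfig {
        public final String aggName;
        public final int memoryMb;

        public AggResourceConfig(String aggName, int memoryMb) {
            this.aggName = aggName;
            this.memoryMb = memoryMb;
        }
    }

    /** Hypothetical stand-in for a session-wide context looked up by class. */
    public static final class MiniContext {
        private final List<Object> items;

        public MiniContext(Object... objects) {
            this.items = new ArrayList<>(Arrays.asList(objects));
        }

        /** Returns the first stored object of the requested class, or null. */
        public <T> T unwrap(Class<T> clazz) {
            for (Object o : items) {
                if (clazz.isInstance(o)) {
                    return clazz.cast(o);
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        // Two aggregates in the same job, each with its own setting.
        MiniContext ctx = new MiniContext(
            new AggResourceConfig("agg1", 512),
            new AggResourceConfig("agg2", 2048));

        // The class is the only key, so only one of the two configs is reachable.
        AggResourceConfig found = ctx.unwrap(AggResourceConfig.class);
        System.out.println(found.aggName + " " + found.memoryMb);
    }
}
```

The setting for the second aggregate is stored but never returned, which
is why a per-node mechanism is needed rather than a per-session one.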


Thank you all the same for the suggestions!

Best, Hequn

On Fri, Jan 18, 2019 at 3:44 PM Albert <[email protected]> wrote:

> I guess a trait really is different from your config. I guess `Context`
> is what you are looking for, see below.
>
> /**
>  * Does nothing.
>  *
>  * @deprecated Previously, this method installed the cancellation-checking
>  * flag for this planner, but is now deprecated. Now, you should add a
>  * {@link CancelFlag} to the {@link Context} passed to the constructor.
>  *
>  * @param cancelFlag flag which the planner should periodically check
>  */
> @Deprecated // to be removed before 2.0
> void setCancelFlag(CancelFlag cancelFlag);
>
>
>
>
> On Fri, Jan 18, 2019 at 9:35 AM Hequn Cheng <[email protected]> wrote:
>
> > Hi Stamatis,
> >
> > Thanks a lot for your reply. Yes, it seems the traits currently in Calcite
> > are used by the optimizer. I wonder whether we can extend them to other
> > use cases. For example, I want to give users a way to set memory or CPU
> > settings for an aggregate node from the user API. These settings will
> > only be used after optimization.
> >
> > I haven't found other ways to achieve this, so maybe using a trait is a
> > neat way?
> >
> > Best, Hequn
> >
> >
> > On Fri, Jan 18, 2019 at 12:00 AM Stamatis Zampetakis <[email protected]>
> > wrote:
> >
> > > Hi Hequn,
> > >
> > > I would describe traits as properties associated with RelNodes that
> > > provide useful information to the optimizer (rules etc.) in order to
> > > generate a plan.
> > >
> > > If the configuration you are referring to is meant to guide the
> > > optimizer in generating a plan then it seems ok to use traits. If not
> > > then you probably need something different.
> > >
> > > Can you elaborate a bit more on your use case?
> > >
> > > Best,
> > > Stamatis
> > >
> > >
> > > On Tue, Jan 15, 2019, 10:57 AM Hequn Cheng <[email protected]> wrote:
> > >
> > > > Hi Julian,
> > > >
> > > > Thanks a lot for your reply and the detailed explanation. It resolves
> > > > my doubts. My custom trait only contains one value, so I think
> > > > there will not be a problem.
> > > >
> > > > May I follow up with another question:
> > > > Is it ok or right to use a trait to pass configurations through
> > > > RelNodes? For example, a configuration set from the API for the
> > > > aggregate and used after the optimization.
> > > > If not, are there any standard ways to achieve this?
> > > >
> > > > I haven't found any clear definition of a trait. I only found a comment
> > > > in the code: *RelTrait represents the manifestation of a relational
> > > > expression trait within a trait definition.*
> > > >
> > > > Thank you!
> > > >
> > > > On Tue, Jan 15, 2019 at 3:12 AM Julian Hyde <[email protected]> wrote:
> > > >
> > > > > In most cases increasing the number of traits from one to two will
> > > > > increase the planning time by a negligible amount.
> > > > >
> > > > > But it can increase the size of the search space. Suppose a
> > > > > particular relational expression has 5 possible sort orders (order
> > > > > by x; order by x, y; order by (); order by z; order by x, z), and
> > > > > initially you have only the collation trait enabled. A particular
> > > > > equivalence set might have 5 subsets, one for each sort order. Now
> > > > > let’s suppose you add the distribution trait to the mix, and there
> > > > > are 3 distributions (partition by (), partition by x, partition by
> > > > > z). Now that equivalence set will have 15 subsets, for the cartesian
> > > > > product of the traits.
> > > > >
> > > > > A larger search space could increase the planning time (and memory
> > > > > usage) significantly.
> > > > >
> > > > > But if each trait has only one or two values I doubt that there will
> > > > > be a problem.
> > > > >
> > > > > Julian
> > > > >
> > > > >
> > > > > > > On Jan 14, 2019, at 1:38 AM, Hequn Cheng <[email protected]> wrote:
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > > I want to pass properties through RelNodes via a trait and I
> > > > > > > wonder if the number of traits in the traitSet will affect the
> > > > > > > time of Volcano optimization. For example, increasing one
> > > > > > > traitDef to two in VolcanoPlanner.
> > > > > > >
> > > > > > > I guess the answer is No. Is it correct?
> > > > > > > As long as the search space is not increased, the Volcano
> > > > > > > optimization time will not increase. And simply increasing the
> > > > > > > number of traits alone does not add complexity.
> > > > > > >
> > > > > > > Furthermore, besides time, are there any other side effects if I
> > > > > > > increase the number of traits?
> > > > > >
> > > > > > Thank you very much!
> > > > > >
> > > > > > Best,
> > > > > > Hequn
> > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
> ~~~~~~~~~~~~~~~
> no mistakes
> ~~~~~~~~~~~~~~~~~~
>
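
Julian's example in the quoted message above (5 sort orders, 3
distributions) can be checked with a small stand-alone sketch. It only
counts combinations of trait values and does not use or model the Volcano
planner; the class and method names are made up for this illustration, and
the single-valued "custom" trait is an assumption standing in for the
custom trait discussed in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TraitSearchSpaceSketch {
    /** Returns every combination that picks one value per trait dimension. */
    public static List<List<String>> combinations(List<List<String>> traitValues) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>()); // start with one empty combination
        for (List<String> values : traitValues) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> partial : result) {
                for (String value : values) {
                    List<String> extended = new ArrayList<>(partial);
                    extended.add(value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> collations = Arrays.asList("()", "x", "x, y", "z", "x, z");
        List<String> distributions = Arrays.asList("()", "x", "z");
        List<String> singleValued = Arrays.asList("custom"); // hypothetical one-value trait

        // Collation only: one subset per sort order.
        System.out.println(combinations(Arrays.asList(collations)).size()); // 5
        // Collation and distribution: the cartesian product.
        System.out.println(combinations(Arrays.asList(collations, distributions)).size()); // 15
        // Adding a trait with a single value leaves the count unchanged.
        System.out.println(combinations(Arrays.asList(collations, singleValued)).size()); // 5
    }
}
```

The count is the product of the number of values per trait, so a trait
that only ever takes one value multiplies the number of subsets by one.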
