Thanks Danny.

Those are good points. I think it depends on what we consider hint as.
IMHO, if we consider hint as a kind of metadata,
it is not a good idea to store the hints in the RelNode instance.



Best,
Chunwei

On Wed, Apr 24, 2019 at 11:09 AM Yuzhao Chen <[email protected]> wrote:
>
> Thx, Julian
>
> I think the hint path is a good way for searching RelNode’s parents, broadly, 
> there may be these modules/things need to be modified:
>
> 1. Supports hints grammar for parser.jj
> 2. Cache the hints in the RelNode instance, and add method like 
> RelNode#getHints() to fetch all the hints inherited for this node.
> 3. Modify #copy method for every kind of RelNode so that the hints can be 
> copied when creating new equivalent nodes.
> 4. Add a visitor in after sql-to-rel phrase, to set up full hints list for 
> every children RelNode if there exists any.
> 5. Add hints metadata handler and handles the hints fetching and overriding 
> for specific kind of RelNode
>
> The 2 and 3 are the modifications that i really want to confirm, that is, 
> shall we store the hints in the RelNode instance ?
>
> These are initial thoughts and if we make agreement, I would output a detail 
> design doc which contains:
>
> 1. The hints grammar supported for the major sql engines
> 2. The hints grammar supported for Apache Calcite
> 3. The interface and design ideas of the proposed modifications
>
>
> Best,
> Danny Chan
> 在 2019年4月24日 +0800 AM3:04,Julian Hyde <[email protected]>,写道:
> > I see that if you have a hint on, say, the root node then it would be nice 
> > for its child or grand-child to be able to see that hint.
> >
> > How about giving each hint an inherit path? Thus given
> >
> > Filter Hint1
> > +- Join
> > +- Scan
> > +- Project Hint2
> > +- Scan
> >
> >
> > Filter would have hints {Hint1[]}
> > Join would have hints {Hint1[0]}
> > Scan would have hints {Hint1[0, 0]}
> > Project would have hints {Hint1[0,1], Hint2}
> > Scan2 would have hints {[Hint1[0, 0, 1, 0], Hint2[0]}
> >
> > You could populate the hints and inherit paths with a single visitor pass 
> > after sql-to-rel conversion.
> >
> > By the way, I still like the idea of having kinds as a kind of RelMetadata, 
> > but I realize that a given RelNode might have more than one hint. So I 
> > think that the getHints(RelNode) method would return a List<Hint>, with 
> > Hint as follows:
> >
> > class Hint {
> > public final List<Integer> inheritPath; // immutable, not null
> > public final String type; // not null
> > public final Object operand; // immutable, may be null, must be JSON data
> > }
> >
> > operand must be JSON-style data (null, boolean, number, String, immutable 
> > List of JSON data, or immutable order-preserving Map from String to JSON 
> > data).
> >
> > > On Apr 23, 2019, at 1:25 AM, Yuzhao Chen <[email protected]> wrote:
> > >
> > > Thx, Andrew
> > >
> > > I don’t want to have a custom RelNode class, I hope all the work about 
> > > hints would be contributed to the community. I want to find an acceptable 
> > > way to keep and propagate the hints if we use the MetadataHandler to 
> > > cache and query the hints.
> > >
> > > I don’t think the hints should be mixed into the cost model, that would 
> > > make the cost computation very complex and hard to maintain, we only need 
> > > the hints in our planning phrase to give suggestions, hints is more like 
> > > another guideline for me and transparent to the planner.
> > >
> > > Best,
> > > Danny Chan
> > > 在 2019年4月23日 +0800 PM2:24,Андрей Цвелодуб <[email protected]>,写道:
> > > > Hi Danny,
> > > >
> > > > I would also agree with Julian on his position. I've tried to get around
> > > > this limitation in several different ways, but none of it ended well :)
> > > >
> > > > For your idea with hints, if you have custom RelNode classes, you can 
> > > > add
> > > > hint as an additional field of the class and you can write a simple rule
> > > > that propagates the hint downwards, step by step. And also include the 
> > > > hint
> > > > in your cost estimation, so that nodes with hints would be more 
> > > > attractive
> > > > to the planner. I'm not sure this would be the most correct way to use 
> > > > the
> > > > cost mechanism, but at least it is straightforward and it works.
> > > >
> > > > Best Regards,
> > > > Andrew Tsvelodub
> > > >
> > > > On Tue, 23 Apr 2019 at 08:44, Yuzhao Chen <[email protected]> wrote:
> > > >
> > > > > Julian,
> > > > >
> > > > > I want to add hint support for Calcite, the initial idea was to tag a
> > > > > RelNode(transformed from a SqlNode with hint) with a hit attribute(or
> > > > > trait), then I hope that the children (inputs) of it can see this 
> > > > > hint, so
> > > > > to make some decisions if it should consume or propagate the hint.
> > > > >
> > > > > The problem I got here is the trait propagate from inputs from, which 
> > > > > is
> > > > > the opposite as what I need, can you give some suggestions ? If I use
> > > > > MetadataHandler to cache and propagate the hints, how to propagate 
> > > > > from
> > > > > parents to children ?
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > > 在 2019年4月23日 +0800 AM3:14,Julian Hyde <[email protected]>,写道:
> > > > > > TL;DR: RelNodes don’t really have parents. Be careful if you are 
> > > > > > relying
> > > > > on the parent concept too much. Rely on rules instead.
> > > > > >
> > > > > > In the Volcano model, a RelNode doesn’t really have a parent. It 
> > > > > > might
> > > > > be used in several places. (RelSet has a field ‘List<RelNode> 
> > > > > parents’ that
> > > > > is kept up to date as planing progresses. But it’s really for 
> > > > > Volcano’s
> > > > > internal use.)
> > > > > >
> > > > > > Even if you are not using Volcano, there are reasons to want the 
> > > > > > RelNode
> > > > > graph to be a dag, so again, a RelNode doesn’t have a unique parent.
> > > > > >
> > > > > > RelShuttleImpl has a stack. You can use that to find the parent. 
> > > > > > But the
> > > > > “parent” is just “where we came from as we traversed the RelNode 
> > > > > graph”.
> > > > > There may be other “parents” that you do not know about.
> > > > > >
> > > > > > If you have a Project and want to find all parents that are Filters,
> > > > > don’t even think about “iterating over the parents” of the Project. 
> > > > > Just
> > > > > write a rule that matches a Filter on a Project, and trust Volcano to 
> > > > > do
> > > > > its job.
> > > > > >
> > > > > > Julian
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Apr 22, 2019, at 6:15 AM, Yuzhao Chen <[email protected]> 
> > > > > > > wrote:
> > > > > > >
> > > > > > > Thx, Stamatis, that somehow make sense, if i pass around the 
> > > > > > > parent
> > > > > node every time I visit a RelNode and keep the parents in the cache, 
> > > > > but it
> > > > > is still not that intuitive. Actually I what a to add a new RelTrait 
> > > > > which
> > > > > bind to a specific scope, for example:
> > > > > > >
> > > > > > > join-rel(trait1)
> > > > > > > / \
> > > > > > > join2 join3
> > > > > > >
> > > > > > > Join-rel has a trait trait1, and I want all the children of 
> > > > > > > join-rel
> > > > > can see this trait, with Calcite’s default metadata handler, I can 
> > > > > only see
> > > > > the trait from children nodes(traits propagate from the inputs), and 
> > > > > I have
> > > > > no idea how to propagate a trait reversely?
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Danny Chan
> > > > > > > 在 2019年4月22日 +0800 PM8:44,Stamatis Zampetakis 
> > > > > > > <[email protected]>,写道:
> > > > > > > > Hi Danny,
> > > > > > > >
> > > > > > > > Apart from RelShuttle there is also RelVisitor which has a visit
> > > > > method
> > > > > > > > that provides the parent [1]. Not sure, if it suits your needs.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Stamatis
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > https://github.com/apache/calcite/blob/ee83efd360793ef4201f4cdfc2af8d837b76ca69/core/src/main/java/org/apache/calcite/rel/RelVisitor.java#L43
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Apr 22, 2019 at 2:14 PM Yuzhao Chen 
> > > > > > > > <[email protected]>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Now for RelNode, we have method getInput()[1] to fetch the 
> > > > > > > > > input
> > > > > > > > > RelNodes, but how we fetch the parent ?
> > > > > > > > >
> > > > > > > > > For example, we have plan:
> > > > > > > > >
> > > > > > > > > join-rel
> > > > > > > > > / \
> > > > > > > > > scan1 scan2
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > We can get scan1 and scan2 in join-rel directly with method
> > > > > getInput, but
> > > > > > > > > how can we get the join rel in scan1 and scan 2 ?
> > > > > > > > >
> > > > > > > > > I know that there is a RelShuttle that can visit every 
> > > > > > > > > RelNode and
> > > > > if I
> > > > > > > > > make a cache for the inputs mapping, finally I can get the
> > > > > ‘parents’ from
> > > > > > > > > the cache, but this is boring code and not that intuitive.
> > > > > > > > >
> > > > > > > > > Do you guys have any good ideas ?
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > >
> > > > > https://github.com/apache/calcite/blob/ee83efd360793ef4201f4cdfc2af8d837b76ca69/core/src/main/java/org/apache/calcite/rel/RelNode.java#L132
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Danny Chan
> > > > > > > > >
> > > > > >
> > > > >
> >

Reply via email to