Thx, Julian

Let me restate my thoughts on the details. To implement hints, I think 
these things are needed:

The main difference is that we would maintain a global hints cache.
1. Support hints grammar in parser.jj.
2. During/after sql-to-rel, pass a hints cache to the SqlToRelConverter. 
A visitor initializes the cache with the RelNodes’ hints, once. The cache 
is global in scope and stays active for the whole of query planning. It 
keeps hints only for the few top nodes that really need them.
3. In the planner, add set/get methods for the hints cache, so that 
planning rules can see the cache. We could also ban some rules from 
matching in the planner.
4. Hook the RelOptRuleCall#transformTo method to handle hint propagation 
(invoking the same hints logic as in the sql-to-rel phase); this would 
also update the global hints cache.

It seems that, given the global hints cache, we no longer need the 
MetadataHandler; this is the thing I most want to confirm.
I look forward to your suggestions.
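
To make steps 2 to 4 above a bit more concrete, here is a minimal sketch of 
what such a cache could look like. It is plain Java with no Calcite 
dependency; HintsCache, register, hintsOf and onTransform are hypothetical 
names, and keying the cache by an integer node id is only an assumption for 
illustration, not an existing Calcite API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical planner-wide cache: hint names keyed by an assumed node id. */
final class HintsCache {
  private final Map<Integer, List<String>> hintsByNodeId = new HashMap<>();

  /** Step 2: called by the visitor after sql-to-rel for each hinted top node. */
  void register(int nodeId, String hint) {
    hintsByNodeId.computeIfAbsent(nodeId, k -> new ArrayList<>()).add(hint);
  }

  /** Step 3: rules read the hints of a node; unknown nodes have none. */
  List<String> hintsOf(int nodeId) {
    List<String> hints = hintsByNodeId.get(nodeId);
    return hints == null
        ? Collections.<String>emptyList()
        : Collections.unmodifiableList(hints);
  }

  /** Step 4: when a rule replaces oldId with newId, carry the hints over. */
  void onTransform(int oldId, int newId) {
    if (oldId == newId) {
      return; // nothing to propagate
    }
    List<String> hints = hintsByNodeId.get(oldId);
    if (hints != null) {
      hintsByNodeId.computeIfAbsent(newId, k -> new ArrayList<>()).addAll(hints);
    }
  }
}

final class HintsCacheDemo {
  public static void main(String[] args) {
    HintsCache cache = new HintsCache();
    cache.register(1, "nohashjoin"); // node 1 came out of sql-to-rel with a hint
    cache.onTransform(1, 2);         // a rule produced node 2 from node 1
    System.out.println(cache.hintsOf(2));
  }
}
```

In this sketch only the nodes that were registered, or produced from a 
registered node, ever appear in the cache; all other nodes simply have no 
hints.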

Best,
Danny Chan
On Apr 25, 2019, at 3:07 AM +0800, Julian Hyde <[email protected]> wrote:
> I think it’s OK to attach hints to the (few) RelNodes that come out of the 
> SqlToRelConverter.
>
> But it would be a mistake to try to propagate those hints to all of the 
> RelNodes that are created during query planning. Even if we changed all of 
> the copy methods (a huge task) there are many other ways that RelNodes get 
> created. We would end up with a RelNode graph with lots of hints, and most of 
> those hints would be inaccurate or not applicable.
>
> For a particular hint, say "/*+ nohashjoin */", some piece of code would need 
> to look at the initial RelNode tree and take its own action: say, build a 
> data structure to be used by planner rules, or enable or disable planner 
> rules.
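
A minimal sketch of the "enable or disable planner rules" option, in plain 
Java without Calcite classes. HintRuleFilter, the rule names, and the mapping 
from "nohashjoin" to a rule called "HashJoinRule" are all hypothetical, 
chosen only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class HintRuleFilter {
  /** Hypothetical mapping from a hint name to the rule name it disables. */
  private static final Map<String, String> DISABLED_BY_HINT = new HashMap<>();
  static {
    DISABLED_BY_HINT.put("nohashjoin", "HashJoinRule");
  }

  /** Returns the rules that stay enabled, given the hints found in the initial tree. */
  static List<String> enabledRules(List<String> allRules, Set<String> hints) {
    List<String> enabled = new ArrayList<>(allRules);
    for (String hint : hints) {
      String rule = DISABLED_BY_HINT.get(hint);
      if (rule != null) {
        enabled.remove(rule); // unknown hints are ignored
      }
    }
    return enabled;
  }
}
```

The filter runs once, on the hints collected from the initial tree, so no 
hint has to be copied onto the RelNodes created later during planning.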
>
>
> > On Apr 23, 2019, at 9:31 PM, Chunwei Lei <[email protected]> wrote:
> >
> > Thanks Danny.
> >
> > Those are good points. I think it depends on what we consider a hint to be.
> > IMHO, if we consider hints a kind of metadata,
> > it is not a good idea to store them in the RelNode instance.
> >
> >
> >
> > Best,
> > Chunwei
> >
> > On Wed, Apr 24, 2019 at 11:09 AM Yuzhao Chen <[email protected]> wrote:
> > >
> > > Thx, Julian
> > >
> > > I think the hint path is a good way to find a RelNode’s parents. 
> > > Broadly, these modules may need to be modified:
> > >
> > > 1. Support hints grammar in parser.jj.
> > > 2. Cache the hints in the RelNode instance, and add a method like 
> > > RelNode#getHints() to fetch all the hints inherited by this node.
> > > 3. Modify the #copy method of every kind of RelNode so that the hints are 
> > > copied when creating new equivalent nodes.
> > > 4. Add a visitor after the sql-to-rel phase to set up the full hints list 
> > > for every child RelNode, if any hints exist.
> > > 5. Add a hints metadata handler that handles fetching and overriding 
> > > hints for specific kinds of RelNode.
> > >
> > > Items 2 and 3 are the modifications I really want to confirm; that is, 
> > > shall we store the hints in the RelNode instance?
> > >
> > > These are initial thoughts. If we reach agreement, I will write a 
> > > detailed design doc which contains:
> > >
> > > 1. The hints grammar supported by the major SQL engines
> > > 2. The hints grammar to support in Apache Calcite
> > > 3. The interface and design ideas of the proposed modifications
> > >
> > >
> > > Best,
> > > Danny Chan
> > > On Apr 24, 2019, at 3:04 AM +0800, Julian Hyde <[email protected]> wrote:
> > > > I see that if you have a hint on, say, the root node then it would be 
> > > > nice for its child or grand-child to be able to see that hint.
> > > >
> > > > How about giving each hint an inherit path? Thus given
> > > >
> > > > Filter Hint1
> > > > +- Join
> > > >    +- Scan
> > > >    +- Project Hint2
> > > >       +- Scan
> > > >
> > > >
> > > > Filter would have hints {Hint1[]}
> > > > Join would have hints {Hint1[0]}
> > > > Scan would have hints {Hint1[0, 0]}
> > > > Project would have hints {Hint1[0, 1], Hint2[]}
> > > > Scan2 would have hints {Hint1[0, 1, 0], Hint2[0]}
> > > >
> > > > You could populate the hints and inherit paths with a single visitor 
> > > > pass after sql-to-rel conversion.
> > > >
> > > > By the way, I still like the idea of having hints as a kind of 
> > > > RelMetadata, but I realize that a given RelNode might have more than 
> > > > one hint. So I think that the getHints(RelNode) method would return a 
> > > > List<Hint>, with Hint as follows:
> > > >
> > > > class Hint {
> > > >   public final List<Integer> inheritPath; // immutable, not null
> > > >   public final String type; // not null
> > > >   public final Object operand; // immutable, may be null, must be JSON data
> > > > }
> > > >
> > > > operand must be JSON-style data (null, boolean, number, String, 
> > > > immutable List of JSON data, or immutable order-preserving Map from 
> > > > String to JSON data).
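
A minimal sketch of the single visitor pass that populates hints and inherit 
paths, in plain Java without Calcite classes. PlanNode, InheritedHint and 
HintPropagator are hypothetical stand-ins (PlanNode plays the role of a 
RelNode, and the operand field is left out for brevity); the tree built in 
main is the Filter/Join/Scan/Project example from this message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a RelNode: a name, an optional hint type, and inputs. */
final class PlanNode {
  final String name;
  final String hintType; // null if the node declares no hint of its own
  final List<PlanNode> inputs;

  PlanNode(String name, String hintType, PlanNode... inputs) {
    this.name = name;
    this.hintType = hintType;
    this.inputs = Collections.unmodifiableList(Arrays.asList(inputs));
  }
}

/** A hint type plus the path from the declaring node down to the node that sees it. */
final class InheritedHint {
  final String type;
  final List<Integer> inheritPath;

  InheritedHint(String type, List<Integer> inheritPath) {
    this.type = type;
    this.inheritPath = Collections.unmodifiableList(new ArrayList<>(inheritPath));
  }

  /** Returns the same hint as seen by the input at the given ordinal. */
  InheritedHint forInput(int ordinal) {
    List<Integer> path = new ArrayList<>(inheritPath);
    path.add(ordinal);
    return new InheritedHint(type, path);
  }

  @Override public String toString() {
    return type + inheritPath;
  }
}

final class HintPropagator {
  /** Single top-down pass: returns the hints visible at each node of a tree. */
  static Map<PlanNode, List<InheritedHint>> propagate(PlanNode root) {
    Map<PlanNode, List<InheritedHint>> result = new IdentityHashMap<>();
    visit(root, new ArrayList<InheritedHint>(), result);
    return result;
  }

  private static void visit(PlanNode node, List<InheritedHint> inherited,
      Map<PlanNode, List<InheritedHint>> result) {
    List<InheritedHint> hints = new ArrayList<>(inherited);
    if (node.hintType != null) {
      // A node's own hint starts with an empty inherit path.
      hints.add(new InheritedHint(node.hintType, Collections.<Integer>emptyList()));
    }
    result.put(node, hints);
    for (int i = 0; i < node.inputs.size(); i++) {
      List<InheritedHint> forChild = new ArrayList<>();
      for (InheritedHint h : hints) {
        forChild.add(h.forInput(i)); // extend every path by the input ordinal
      }
      visit(node.inputs.get(i), forChild, result);
    }
  }
}

final class InheritPathDemo {
  public static void main(String[] args) {
    PlanNode scan1 = new PlanNode("Scan", null);
    PlanNode scan2 = new PlanNode("Scan2", null);
    PlanNode project = new PlanNode("Project", "Hint2", scan2);
    PlanNode join = new PlanNode("Join", null, scan1, project);
    PlanNode filter = new PlanNode("Filter", "Hint1", join);
    Map<PlanNode, List<InheritedHint>> hints = HintPropagator.propagate(filter);
    System.out.println(hints.get(scan2)); // prints [Hint1[0, 1, 0], Hint2[0]]
  }
}
```

The sketch assumes a tree. If a node were shared between two parents, the 
last visit would overwrite the earlier entry, so a DAG would need a policy 
for merging paths.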
> > > >
> > > > > On Apr 23, 2019, at 1:25 AM, Yuzhao Chen <[email protected]> wrote:
> > > > >
> > > > > Thx, Andrew
> > > > >
> > > > > I don’t want to have a custom RelNode class; I hope all the work 
> > > > > on hints can be contributed to the community. I want to find an 
> > > > > acceptable way to keep and propagate the hints if we use the 
> > > > > MetadataHandler to cache and query them.
> > > > >
> > > > > I don’t think the hints should be mixed into the cost model. That 
> > > > > would make the cost computation very complex and hard to maintain. We 
> > > > > only need the hints in the planning phase to give suggestions; to me, 
> > > > > hints are more like an additional guideline, transparent to the planner.
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > > On Apr 23, 2019, at 2:24 PM +0800, Андрей Цвелодуб <[email protected]> wrote:
> > > > > > Hi Danny,
> > > > > >
> > > > > > I would also agree with Julian on his position. I've tried to get 
> > > > > > around this limitation in several different ways, but none of them 
> > > > > > ended well :)
> > > > > >
> > > > > > For your idea with hints, if you have custom RelNode classes, you can 
> > > > > > add the hint as an additional field of the class and write a simple 
> > > > > > rule that propagates the hint downwards, step by step. You can also 
> > > > > > include the hint in your cost estimation, so that nodes with hints 
> > > > > > are more attractive to the planner. I'm not sure this is the most 
> > > > > > correct way to use the cost mechanism, but at least it is 
> > > > > > straightforward and it works.
> > > > > >
> > > > > > Best Regards,
> > > > > > Andrew Tsvelodub
> > > > > >
> > > > > > On Tue, 23 Apr 2019 at 08:44, Yuzhao Chen <[email protected]> 
> > > > > > wrote:
> > > > > >
> > > > > > > Julian,
> > > > > > >
> > > > > > > I want to add hint support to Calcite. The initial idea was to tag a 
> > > > > > > RelNode (transformed from a SqlNode with a hint) with a hint attribute 
> > > > > > > (or trait). I then want its children (inputs) to be able to see this 
> > > > > > > hint, so that they can decide whether to consume or propagate it.
> > > > > > >
> > > > > > > The problem is that traits propagate upward from the inputs, which is 
> > > > > > > the opposite of what I need. Can you give some suggestions? If I use 
> > > > > > > MetadataHandler to cache and propagate the hints, how do I propagate 
> > > > > > > them from parents to children?
> > > > > > >
> > > > > > > Best,
> > > > > > > Danny Chan
> > > > > > > On Apr 23, 2019, at 3:14 AM +0800, Julian Hyde <[email protected]> wrote:
> > > > > > > > TL;DR: RelNodes don’t really have parents. Be careful if you are 
> > > > > > > > relying on the parent concept too much. Rely on rules instead.
> > > > > > > >
> > > > > > > > In the Volcano model, a RelNode doesn’t really have a parent. It 
> > > > > > > > might be used in several places. (RelSet has a field ‘List<RelNode> 
> > > > > > > > parents’ that is kept up to date as planning progresses. But it’s 
> > > > > > > > really for Volcano’s internal use.)
> > > > > > > >
> > > > > > > > Even if you are not using Volcano, there are reasons to want the 
> > > > > > > > RelNode graph to be a DAG, so again, a RelNode doesn’t have a unique 
> > > > > > > > parent.
> > > > > > > >
> > > > > > > > RelShuttleImpl has a stack. You can use that to find the parent. But 
> > > > > > > > the “parent” is just “where we came from as we traversed the RelNode 
> > > > > > > > graph”. There may be other “parents” that you do not know about.
> > > > > > > >
> > > > > > > > If you have a Project and want to find all parents that are Filters, 
> > > > > > > > don’t even think about “iterating over the parents” of the Project. 
> > > > > > > > Just write a rule that matches a Filter on a Project, and trust 
> > > > > > > > Volcano to do its job.
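
To illustrate why a single "parent" is not well defined in a DAG, here is a 
small sketch in plain Java. Rel and ParentIndex are hypothetical stand-ins, 
not Calcite classes. The index is built by walking inputs from the root, and 
a scan that is shared by two operators ends up with two parents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a RelNode: a name and a list of inputs. */
final class Rel {
  final String name;
  final List<Rel> inputs;

  Rel(String name, Rel... inputs) {
    this.name = name;
    this.inputs = Collections.unmodifiableList(Arrays.asList(inputs));
  }
}

final class ParentIndex {
  /** Maps every node reachable from root to all the nodes that use it as an input. */
  static Map<Rel, List<Rel>> parentsOf(Rel root) {
    Map<Rel, List<Rel>> parents = new IdentityHashMap<>();
    parents.put(root, new ArrayList<Rel>()); // the root has no parent
    collect(root, parents);
    return parents;
  }

  private static void collect(Rel node, Map<Rel, List<Rel>> parents) {
    for (Rel input : node.inputs) {
      List<Rel> known = parents.get(input);
      boolean firstVisit = known == null;
      if (firstVisit) {
        known = new ArrayList<>();
        parents.put(input, known);
      }
      known.add(node);
      if (firstVisit) {
        collect(input, parents); // descend only once per distinct node
      }
    }
  }
}

final class ParentIndexDemo {
  public static void main(String[] args) {
    Rel scan = new Rel("Scan");
    Rel filter = new Rel("Filter", scan);
    Rel join = new Rel("Join", filter, scan); // scan is shared
    Map<Rel, List<Rel>> parents = ParentIndex.parentsOf(join);
    for (Rel parent : parents.get(scan)) {
      System.out.println(parent.name); // prints Filter, then Join
    }
  }
}
```

Such an index is also only a snapshot: it says nothing about equivalent 
nodes that the planner creates later, which is another reason to express the 
Filter-on-Project relationship as a rule.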
> > > > > > > >
> > > > > > > > Julian
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Apr 22, 2019, at 6:15 AM, Yuzhao Chen 
> > > > > > > > > <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > Thx, Stamatis, that somewhat makes sense: I could pass the parent 
> > > > > > > > > node around every time I visit a RelNode and keep the parents in a 
> > > > > > > > > cache, but it is still not that intuitive. Actually, what I want is 
> > > > > > > > > to add a new RelTrait that is bound to a specific scope, for example:
> > > > > > > > >
> > > > > > > > >    join-rel(trait1)
> > > > > > > > >     /          \
> > > > > > > > >  join2        join3
> > > > > > > > >
> > > > > > > > > join-rel has a trait trait1, and I want all the children of join-rel 
> > > > > > > > > to be able to see this trait. With Calcite’s default metadata 
> > > > > > > > > handler, I can only see traits from the children nodes (traits 
> > > > > > > > > propagate from the inputs), and I have no idea how to propagate a 
> > > > > > > > > trait in the reverse direction.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Danny Chan
> > > > > > > > > On Apr 22, 2019, at 8:44 PM +0800, Stamatis Zampetakis 
> > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > Hi Danny,
> > > > > > > > > >
> > > > > > > > > > Apart from RelShuttle, there is also RelVisitor, which has a visit 
> > > > > > > > > > method that provides the parent [1]. Not sure if it suits your 
> > > > > > > > > > needs.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Stamatis
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > > > > >
> > > > > > > https://github.com/apache/calcite/blob/ee83efd360793ef4201f4cdfc2af8d837b76ca69/core/src/main/java/org/apache/calcite/rel/RelVisitor.java#L43
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Mon, Apr 22, 2019 at 2:14 PM Yuzhao Chen 
> > > > > > > > > > <[email protected]>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > For RelNode, we have the method getInput() [1] to fetch the input 
> > > > > > > > > > > RelNodes, but how do we fetch the parent?
> > > > > > > > > > >
> > > > > > > > > > > For example, we have this plan:
> > > > > > > > > > >
> > > > > > > > > > >     join-rel
> > > > > > > > > > >      /    \
> > > > > > > > > > >  scan1    scan2
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > We can get scan1 and scan2 from join-rel directly with the method 
> > > > > > > > > > > getInput, but how can we get join-rel from scan1 and scan2?
> > > > > > > > > > >
> > > > > > > > > > > I know that a RelShuttle can visit every RelNode, and if I build a 
> > > > > > > > > > > cache of the input mappings I can eventually get the ‘parents’ from 
> > > > > > > > > > > the cache, but this is tedious code and not that intuitive.
> > > > > > > > > > >
> > > > > > > > > > > Do you guys have any good ideas?
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > > > > > > > > >
> > > > > > > https://github.com/apache/calcite/blob/ee83efd360793ef4201f4cdfc2af8d837b76ca69/core/src/main/java/org/apache/calcite/rel/RelNode.java#L132
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Danny Chan
> > > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > >
>
