Thx Julian Mostly got your idea, but one thing needs to confirm:
Now the MetadataHandler is kind of query lazy the cache is code-gen ed in the handler class, the metadata also propagate from inputs, when I got an RelNode’s hint, how can I cache it in the metadata handler for querying ? Best, Danny Chan 在 2019年4月27日 +0800 AM6:57,Julian Hyde <[email protected]>,写道: > The RelMetadata system is designed for these kinds of annotations - if there > is a “global hints cache” there’s no benefit to doing it outside the > RelMetadata system. > > That said, I don’t know (and I don’t think anyone knows) how we want hints to > be propagated as we generate RelNodes from RelNodes. I think we should focus > on really simple cases first (e.g. hints about the whole query, or about > particular table scans), and not try to automatically propagate them. > > We can make the hints propagation mechanism more sophisticated when we have > an actual use case to drive us. > > Julian > > > > On Apr 26, 2019, at 3:41 PM, Yuzhao Chen <[email protected]> wrote: > > > > Thx, Julian > > > > Let me repeat my thoughts about the details again, in order to implement > > the hints, maybe these things are needed: > > > > The main diff is that we will maintain a global hints cache > > 1. Supports hints grammar for parser.jj > > 2. During/after sql-to-rel, we may pass a hints cache to the > > SqlToRelConverter, there is a visitor to setup/init the RelNodes’hints to > > the cache once at a time, this cache scope is global and would be active > > the whole query planning time. The cache only keep hints for few top nodes > > that really needs > > 3. In the Planner, add set/get hints cache method, so that in the planning > > rules, we can see the hints cache, > > And we can also ban some rule matching in the planner > > 4. Hook the RelOptCall#transformTo method to handle logic of hints > > propagating(invoke the hints logic again same as sql-to-rel phrase), this > > will also update the global hints cache > > It seems that given the global hints cache, we do not need the > > MetaDataHandler any more, this is the thing I most want to make sure. > > Hope for your suggestions. > > > > Best, > > Danny Chan > > 在 2019年4月25日 +0800 AM3:07,Julian Hyde <[email protected]>,写道: > > > I think it’s OK to attach hints to the (few) RelNodes that come out of > > > the SqlToRelConverter. > > > > > > But it would be a mistake to try to propagate those hints to all of the > > > RelNodes that are created during query planning. Even if we changed all > > > of the copy methods (a huge task) there are many other ways that RelNodes > > > get created. We would end up with a RelNode graph with lots of hints, and > > > most of those hints would be inaccurate or not applicable. > > > > > > For a particular hint, say "/*+ nohashjoin */“, some piece of code would > > > need to look at the initial RelNode tree and take its own action: say, > > > build a data structure to be used by planner rules, or enable or disable > > > planner rules. > > > > > > > > > > On Apr 23, 2019, at 9:31 PM, Chunwei Lei <[email protected]> wrote: > > > > > > > > Thanks Danny. > > > > > > > > Those are good points. I think it depends on what we consider hint as. > > > > IMHO, if we consider hint as a kind of metadata, > > > > it is not a good idea to store the hints in the RelNode instance. > > > > > > > > > > > > > > > > Best, > > > > Chunwei > > > > > > > > On Wed, Apr 24, 2019 at 11:09 AM Yuzhao Chen <[email protected]> > > > > wrote: > > > > > > > > > > Thx, Julian > > > > > > > > > > I think the hint path is a good way for searching RelNode’s parents, > > > > > broadly, there may be these modules/things need to be modified: > > > > > > > > > > 1. Supports hints grammar for parser.jj > > > > > 2. Cache the hints in the RelNode instance, and add method like > > > > > RelNode#getHints() to fetch all the hints inherited for this node. > > > > > 3. Modify #copy method for every kind of RelNode so that the hints > > > > > can be copied when creating new equivalent nodes. > > > > > 4. Add a visitor in after sql-to-rel phrase, to set up full hints > > > > > list for every children RelNode if there exists any. > > > > > 5. Add hints metadata handler and handles the hints fetching and > > > > > overriding for specific kind of RelNode > > > > > > > > > > The 2 and 3 are the modifications that i really want to confirm, that > > > > > is, shall we store the hints in the RelNode instance ? > > > > > > > > > > These are initial thoughts and if we make agreement, I would output a > > > > > detail design doc which contains: > > > > > > > > > > 1. The hints grammar supported for the major sql engines > > > > > 2. The hints grammar supported for Apache Calcite > > > > > 3. The interface and design ideas of the proposed modifications > > > > > > > > > > > > > > > Best, > > > > > Danny Chan > > > > > 在 2019年4月24日 +0800 AM3:04,Julian Hyde <[email protected]>,写道: > > > > > > I see that if you have a hint on, say, the root node then it would > > > > > > be nice for its child or grand-child to be able to see that hint. > > > > > > > > > > > > How about giving each hint an inherit path? Thus given > > > > > > > > > > > > Filter Hint1 > > > > > > +- Join > > > > > > +- Scan > > > > > > +- Project Hint2 > > > > > > +- Scan > > > > > > > > > > > > > > > > > > Filter would have hints {Hint1[]} > > > > > > Join would have hints {Hint1[0]} > > > > > > Scan would have hints {Hint1[0, 0]} > > > > > > Project would have hints {Hint1[0,1], Hint2} > > > > > > Scan2 would have hints {[Hint1[0, 0, 1, 0], Hint2[0]} > > > > > > > > > > > > You could populate the hints and inherit paths with a single > > > > > > visitor pass after sql-to-rel conversion. > > > > > > > > > > > > By the way, I still like the idea of having kinds as a kind of > > > > > > RelMetadata, but I realize that a given RelNode might have more > > > > > > than one hint. So I think that the getHints(RelNode) method would > > > > > > return a List<Hint>, with Hint as follows: > > > > > > > > > > > > class Hint { > > > > > > public final List<Integer> inheritPath; // immutable, not null > > > > > > public final String type; // not null > > > > > > public final Object operand; // immutable, may be null, must be > > > > > > JSON data > > > > > > } > > > > > > > > > > > > operand must be JSON-style data (null, boolean, number, String, > > > > > > immutable List of JSON data, or immutable order-preserving Map from > > > > > > String to JSON data). > > > > > > > > > > > > > On Apr 23, 2019, at 1:25 AM, Yuzhao Chen <[email protected]> > > > > > > > wrote: > > > > > > > > > > > > > > Thx, Andrew > > > > > > > > > > > > > > I don’t want to have a custom RelNode class, I hope all the work > > > > > > > about hints would be contributed to the community. I want to find > > > > > > > an acceptable way to keep and propagate the hints if we use the > > > > > > > MetadataHandler to cache and query the hints. > > > > > > > > > > > > > > I don’t think the hints should be mixed into the cost model, that > > > > > > > would make the cost computation very complex and hard to > > > > > > > maintain, we only need the hints in our planning phrase to give > > > > > > > suggestions, hints is more like another guideline for me and > > > > > > > transparent to the planner. > > > > > > > > > > > > > > Best, > > > > > > > Danny Chan > > > > > > > 在 2019年4月23日 +0800 PM2:24,Андрей Цвелодуб > > > > > > > <[email protected]>,写道: > > > > > > > > Hi Danny, > > > > > > > > > > > > > > > > I would also agree with Julian on his position. I've tried to > > > > > > > > get around > > > > > > > > this limitation in several different ways, but none of it ended > > > > > > > > well :) > > > > > > > > > > > > > > > > For your idea with hints, if you have custom RelNode classes, > > > > > > > > you can add > > > > > > > > hint as an additional field of the class and you can write a > > > > > > > > simple rule > > > > > > > > that propagates the hint downwards, step by step. And also > > > > > > > > include the hint > > > > > > > > in your cost estimation, so that nodes with hints would be more > > > > > > > > attractive > > > > > > > > to the planner. I'm not sure this would be the most correct way > > > > > > > > to use the > > > > > > > > cost mechanism, but at least it is straightforward and it works. > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > Andrew Tsvelodub > > > > > > > > > > > > > > > > On Tue, 23 Apr 2019 at 08:44, Yuzhao Chen > > > > > > > > <[email protected]> wrote: > > > > > > > > > > > > > > > > > Julian, > > > > > > > > > > > > > > > > > > I want to add hint support for Calcite, the initial idea was > > > > > > > > > to tag a > > > > > > > > > RelNode(transformed from a SqlNode with hint) with a hit > > > > > > > > > attribute(or > > > > > > > > > trait), then I hope that the children (inputs) of it can see > > > > > > > > > this hint, so > > > > > > > > > to make some decisions if it should consume or propagate the > > > > > > > > > hint. > > > > > > > > > > > > > > > > > > The problem I got here is the trait propagate from inputs > > > > > > > > > from, which is > > > > > > > > > the opposite as what I need, can you give some suggestions ? > > > > > > > > > If I use > > > > > > > > > MetadataHandler to cache and propagate the hints, how to > > > > > > > > > propagate from > > > > > > > > > parents to children ? > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Danny Chan > > > > > > > > > 在 2019年4月23日 +0800 AM3:14,Julian Hyde <[email protected]>,写道: > > > > > > > > > > TL;DR: RelNodes don’t really have parents. Be careful if > > > > > > > > > > you are relying > > > > > > > > > on the parent concept too much. Rely on rules instead. > > > > > > > > > > > > > > > > > > > > In the Volcano model, a RelNode doesn’t really have a > > > > > > > > > > parent. It might > > > > > > > > > be used in several places. (RelSet has a field ‘List<RelNode> > > > > > > > > > parents’ that > > > > > > > > > is kept up to date as planing progresses. But it’s really for > > > > > > > > > Volcano’s > > > > > > > > > internal use.) > > > > > > > > > > > > > > > > > > > > Even if you are not using Volcano, there are reasons to > > > > > > > > > > want the RelNode > > > > > > > > > graph to be a dag, so again, a RelNode doesn’t have a unique > > > > > > > > > parent. > > > > > > > > > > > > > > > > > > > > RelShuttleImpl has a stack. You can use that to find the > > > > > > > > > > parent. But the > > > > > > > > > “parent” is just “where we came from as we traversed the > > > > > > > > > RelNode graph”. > > > > > > > > > There may be other “parents” that you do not know about. > > > > > > > > > > > > > > > > > > > > If you have a Project and want to find all parents that are > > > > > > > > > > Filters, > > > > > > > > > don’t even think about “iterating over the parents” of the > > > > > > > > > Project. Just > > > > > > > > > write a rule that matches a Filter on a Project, and trust > > > > > > > > > Volcano to do > > > > > > > > > its job. > > > > > > > > > > > > > > > > > > > > Julian > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Apr 22, 2019, at 6:15 AM, Yuzhao Chen > > > > > > > > > > > <[email protected]> wrote: > > > > > > > > > > > > > > > > > > > > > > Thx, Stamatis, that somehow make sense, if i pass around > > > > > > > > > > > the parent > > > > > > > > > node every time I visit a RelNode and keep the parents in the > > > > > > > > > cache, but it > > > > > > > > > is still not that intuitive. Actually I what a to add a new > > > > > > > > > RelTrait which > > > > > > > > > bind to a specific scope, for example: > > > > > > > > > > > > > > > > > > > > > > join-rel(trait1) > > > > > > > > > > > / \ > > > > > > > > > > > join2 join3 > > > > > > > > > > > > > > > > > > > > > > Join-rel has a trait trait1, and I want all the children > > > > > > > > > > > of join-rel > > > > > > > > > can see this trait, with Calcite’s default metadata handler, > > > > > > > > > I can only see > > > > > > > > > the trait from children nodes(traits propagate from the > > > > > > > > > inputs), and I have > > > > > > > > > no idea how to propagate a trait reversely? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > Danny Chan > > > > > > > > > > > 在 2019年4月22日 +0800 PM8:44,Stamatis Zampetakis > > > > > > > > > > > <[email protected]>,写道: > > > > > > > > > > > > Hi Danny, > > > > > > > > > > > > > > > > > > > > > > > > Apart from RelShuttle there is also RelVisitor which > > > > > > > > > > > > has a visit > > > > > > > > > method > > > > > > > > > > > > that provides the parent [1]. Not sure, if it suits > > > > > > > > > > > > your needs. > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > Stamatis > > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/calcite/blob/ee83efd360793ef4201f4cdfc2af8d837b76ca69/core/src/main/java/org/apache/calcite/rel/RelVisitor.java#L43 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Apr 22, 2019 at 2:14 PM Yuzhao Chen > > > > > > > > > > > > <[email protected]> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Now for RelNode, we have method getInput()[1] to > > > > > > > > > > > > > fetch the input > > > > > > > > > > > > > RelNodes, but how we fetch the parent ? > > > > > > > > > > > > > > > > > > > > > > > > > > For example, we have plan: > > > > > > > > > > > > > > > > > > > > > > > > > > join-rel > > > > > > > > > > > > > / \ > > > > > > > > > > > > > scan1 scan2 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We can get scan1 and scan2 in join-rel directly with > > > > > > > > > > > > > method > > > > > > > > > getInput, but > > > > > > > > > > > > > how can we get the join rel in scan1 and scan 2 ? > > > > > > > > > > > > > > > > > > > > > > > > > > I know that there is a RelShuttle that can visit > > > > > > > > > > > > > every RelNode and > > > > > > > > > if I > > > > > > > > > > > > > make a cache for the inputs mapping, finally I can > > > > > > > > > > > > > get the > > > > > > > > > ‘parents’ from > > > > > > > > > > > > > the cache, but this is boring code and not that > > > > > > > > > > > > > intuitive. > > > > > > > > > > > > > > > > > > > > > > > > > > Do you guys have any good ideas ? > > > > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/calcite/blob/ee83efd360793ef4201f4cdfc2af8d837b76ca69/core/src/main/java/org/apache/calcite/rel/RelNode.java#L132 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > > Danny Chan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
