Yanjing, That is a valid point. The API could perhaps allow people to define rules with nodes that are optional. Can you please log a JIRA for this? Include the code that you would like to write in order to define a rule and handle a match.
Julian On Thu, Apr 14, 2022 at 2:27 AM Yanjing Wang <[email protected]> wrote: > Hi, all, > > I think the rule configuration is not convenient for optional operand, such > as a match operand tree > Project > | > Union > / \ > Filter Filter > If I want to compose a rule that the Project operand is optional, I must > create two configuration and new two rule instance, if we can make the > operand optional, I think I just need one rule and configuration. > > > Viliam Durina <[email protected]> 于2022年4月14日周四 14:55写道: > > > I can say for myself that though the migration was a nuisance, after we > > figured out what to do, it was straightforward and the current approach > > with Immutables is more readable and easier to write than the previous > with > > `operand` and `operandJ`. > > > > Viliam > > > > On Wed, 13 Apr 2022 at 19:23, Julian Hyde <[email protected]> > wrote: > > > > > 1. I agree with Jacques. Thanks, Vladimir, for asking the question ‘is > > > progress really progress?’. We may not back out these changes (probably > > > won’t) but it will inform future discussions. It’s valid to say ‘when > > did X > > > as part of the rule refactoring, that didn’t deliver on promises’. > > > > > > 2. Thanks, Thomas, for sending the link to the original discussion. It > is > > > still my position that many more people will use existing rules than > > write > > > their own. > > > > > > 3. We have effectively created a DSL [1] (especially for operand > patterns > > > [2]) embedded in Java. Java 8 is not a great language for writing DSLs: > > > Kotlin, Scala, Lisp and even later versions of Java are superior. But a > > > design option we should consider is a DSL with its own syntax and > parser. > > > CockroachDB’s OptGen [3] is an example of this, and there is much to > > like. > > > > > > Compared to DSLs embedded in other languages, languages with their own > > > parser are more intuitive for users, and give better error messages. It > > is > > > also possible to support multiple versions simultaneously. > > > > > > Julian > > > > > > [1] https://en.wikipedia.org/wiki/Domain-specific_language < > > > https://en.wikipedia.org/wiki/Domain-specific_language> > > > > > > [2] https://issues.apache.org/jira/browse/CALCITE-3923 < > > > https://issues.apache.org/jira/browse/CALCITE-3923> > > > > > > [3] > > > > > > https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optgen/lang/doc.go > > > < > > > > > > https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optgen/lang/doc.go > > > > > > > > > > > > > > > > On Apr 13, 2022, at 9:56 AM, Jacques Nadeau <[email protected]> > > wrote: > > > > > > > > Vladimir, > > > > > > > > It's always good to ask these kinds of questions. e.g. "Is progress > > > really > > > > progress?" > > > > > > > > I think moving to a configuration object was a substantial > improvement. > > > > Until Java has default parameters and named parameters, I don't know > of > > > > another good way to make rules sufficiently configurable for a wide > > range > > > > of users. > > > > > > > > To me, one of the main benefits of the new configuration objects is a > > > much > > > > less frequent need to subclass a rule. Back when we built original > > Drill > > > > and Dremio code, we were constantly pained by the need to control > more > > > > pieces of a rule than were exposed by public constructors. When > trying > > to > > > > introduce those options, we had to both: (1) update calcite and wait > > for > > > a > > > > release cycle and (2) create another constructor overload, sometimes > > > > leading to a large number of overloads. The config object almost > > entirely > > > > eliminated this challenge. > > > > > > > > I do agree that having config objects is more complex for the casual > > user > > > > so in some ways we're adding advanced functionality at the potential > > cost > > > > of usability for new users. That's not something great. I'm not sure > > it's > > > > really bad but it is something to be aware of. Having default configs > > for > > > > rules seems to make this a rather small additional challenge. > > > > > > > > For your comments on downstream projects and the examples you give: I > > > think > > > > that sounds more like advanced usage. My experience is somewhat > > different > > > > from yours: these changes have been net positive on the stuff I've > been > > > > working on. > > > > > > > > Two last thoughts wrt downstream projects: > > > > > > > > - I think it is important to separate how to create config objects > > from > > > > the way that Calcite does it. We did change from proxies to > > immutables > > > > within Caclite but neither has ever been the required way to > > implement > > > > config objects. Config objects are always defined as plain java > > > interfaces. > > > > If a particular subproject wants to expose configuration in a > > > completely > > > > different way, I don't know of any constraint the rules interface > > > forces. > > > > - A nice thing that was done with the initial implementation is now > > of > > > > the configuration concepts are exposed on the RelOptRule abstract > > > class. A > > > > user must opt in to RelRule for them to have to worry about any of > > > these > > > > concerns. > > > > > > > > > > > > In working with the config concepts and extensibility, I do think a > > > couple > > > > of things are relatively painful. I don't > > > > > > > > 1. The definition of configs as interfaces inside the rules causes > > some > > > > issues with static initialization order. I think Julian even > > mentioned > > > some > > > > of this in one of the original docs config docs. > > > > 2. The generics of RelRule specifically can be challenging when > > > extended > > > > rules in some cases due to how generics and inheritance work. > > > > 3. The class definition of the config interfaces are not > > > > self-referencing (one of my shelved reworks actually modified to > > > > support this) meaning that there are a lot of ugly (my subjective > > > opinion) > > > > uses of Config.as() in the code. This last one could be resolved by > > > > exposing the concrete builders that Immutables generates but for at > > > least > > > > the initial changes, I wanted to avoid exposing these as new public > > > > interfaces (and thus all the immutables classes are marked package > > > private). > > > > 4. Merging construction and use feels like an anti-pattern (my > > > > subjective opinion). The most common patterns I've seen treat a > > builder > > > > separate from an immutable constructed object (and have something > > akin > > > to a > > > > toBuilder() method to take an immutable config back to a builder). > In > > > the > > > > Calcite config, these two concepts are merged. In some ways it > makes > > > things > > > > simpler for trivial cases. However, it is less formal and causes > > pain > > > when > > > > you have required properties that have no defaults since there is > no > > > formal > > > > model for "I'm done building, now check everything is complete". > This > > > means > > > > in several places we have to put default values that are actually > > > invalid > > > > rather than just rely on a builder's build validation step. > > > > > > > > One other note, I think if someone is working in Java 14+ (records) > or > > > > Kotlin, there are also several easy ways to produce config impls that > > are > > > > easy to use (without proxies and/or immutables). > > > > > > > > > > > > > > > > > > > > On Tue, Apr 12, 2022 at 9:29 AM Thomas Rebele > > <[email protected] > > > > > > > > wrote: > > > > > > > >> Hello, > > > >> > > > >> The reasons for the planner rules configuration can be found here: > > > >> CALCITE-3923 <https://issues.apache.org/jira/browse/CALCITE-3923>. > > See > > > >> also > > > >> the email thread [DISCUSS] Refactor how planner rules are > > parameterized > > > >> < > > > >> > > > > > > https://lists.apache.org/thread.html/rfdf6f9b7821988bdd92b0377e3d293443a6376f4773c4c658c891cf9%40%3Cdev.calcite.apache.org%3E > > > >>> > > > >> . > > > >> > > > >> Cordialement / Best Regards, > > > >> *Thomas Rebele, PhD* | R&D Developer | Germany | www.tibco.com > > > >> > > > >> > > > >> On Tue, Apr 12, 2022 at 6:10 PM Gavin Ray <[email protected]> > > > wrote: > > > >> > > > >>> I don't have any weight behind my opinion or experience, > > > >>> but anything that lowers the barrier to entry to Calcite for > > newcomers > > > >> is a > > > >>> huge win in my mind. > > > >>> > > > >>> I assume the reason for the changes was because codegen improved > > > >>> performance? > > > >>> > > > >>> Could it make sense to allow both options, the easy/less-performant > > way > > > >> for > > > >>> people who want to experiment and learn the ropes, > > > >>> and the codegen path for productionizing the final rules you come > up > > > >> with? > > > >>> > > > >>> Or does this make matters worse, trying to support two API's > > > >>> > > > >>> On Tue, Apr 12, 2022 at 6:25 AM Vladimir Ozerov < > [email protected]> > > > >>> wrote: > > > >>> > > > >>>> Hi folks, > > > >>>> > > > >>>> Rules are an essential part of the Calcite-based query > optimizers. A > > > >>>> typical optimizer may require dozens of custom rules that are > > created > > > >> by > > > >>>> extending some Apache Calcite interfaces. > > > >>>> > > > >>>> During the last two years, there were two major revisions of how > > rules > > > >>> are > > > >>>> created: > > > >>>> > > > >>>> 1. In early 1.2x versions, the typical approach was to use > > > >>>> RelOptRuleOperand with a set of helper methods in a builder-like > > > >>>> pattern. > > > >>>> 2. Then, we switched to the runtime code generation. > > > >>>> 3. Finally, we switched to the compile-time code generation with > > the > > > >>>> Immutables framework. > > > >>>> > > > >>>> Every such change requires the downstream projects to rewrite all > > > their > > > >>>> rules. Not only does this require time to understand the new > > approach, > > > >>> but > > > >>>> it may also compromise the correctness of the downstream optimizer > > > >>> because > > > >>>> the regression tracking in query optimizers is not trivial. > > > >>>> > > > >>>> I had the privilege to try all three approaches, and I cannot get > > rid > > > >> of > > > >>>> the feeling that every new approach is more complicated than the > > > >> previous > > > >>>> one. I understand that this is a highly subjective statement, but > > when > > > >> I > > > >>>> just started using Apache Calcite and knew very little about it, I > > was > > > >>> able > > > >>>> to write rule patterns by simply looking at the IDE JavaDoc > pop-ups > > > and > > > >>>> code completion. When the RuleConfig was introduced, every new > rule > > > >>> always > > > >>>> required me to look at some other rule as an example, yet it was > > > >> doable. > > > >>>> Now we also need to configure the project build system to write a > > > >> single > > > >>>> custom rule. > > > >>>> > > > >>>> At the same time, a significant fraction of the rules are pretty > > > >> simple. > > > >>>> E.g., "operator A on top of operator B". If some additional > > > >> configuration > > > >>>> is required, it could be added via plain rules fields, because at > > the > > > >> end > > > >>>> of the day the rule instance is not more than a plain Java object. > > > >>>> > > > >>>> A good example is the FilterProjectTransposeRule. What now takes > > tens > > > >> of > > > >>>> lines of code in the Config subclass [1] (that you hardly could > > write > > > >>>> without a reference example), and ~500 LOC in the generated code > > that > > > >> you > > > >>>> get through additional plugin configuration [2] in your build > > system, > > > >>> could > > > >>>> have been expressed in a dozen lines of code [3] in Apache Calcite > > > >>> 1.22.0. > > > >>>> > > > >>>> My question is - are we sure we are going in the right direction > in > > > >> terms > > > >>>> of complexity and the entry bar for the newcomers? Wouldn't it be > > > >> better > > > >>> to > > > >>>> follow the 80/20 rule, when simple rules could be easily created > > > >>>> programmatically with no external dependencies, while more > advanced > > > >>>> facilities like Immutables are used only for the complex rules? > > > >>>> > > > >>>> Regards, > > > >>>> Vladimir. > > > >>>> > > > >>>> [1] > > > >>>> > > > >>>> > > > >>> > > > >> > > > > > > https://github.com/apache/calcite/blob/calcite-1.30.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L208-L260 > > > >>>> [2] > > > >>>> > > > >>>> > > > >>> > > > >> > > > > > > https://github.com/apache/calcite/blob/calcite-1.30.0/core/build.gradle.kts#L215-L224 > > > >>>> [3] > > > >>>> > > > >>>> > > > >>> > > > >> > > > > > > https://github.com/apache/calcite/blob/calcite-1.22.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L99-L110 > > > >>>> > > > >>> > > > >> > > > > > > > > > > -- > > This message contains confidential information and is intended only for > > the > > individuals named. If you are not the named addressee you should not > > disseminate, distribute or copy this e-mail. Please notify the sender > > immediately by e-mail if you have received this e-mail by mistake and > > delete this e-mail from your system. E-mail transmission cannot be > > guaranteed to be secure or error-free as information could be > intercepted, > > corrupted, lost, destroyed, arrive late or incomplete, or contain > viruses. > > The sender therefore does not accept liability for any errors or > omissions > > in the contents of this message, which arise as a result of e-mail > > transmission. If verification is required, please request a hard-copy > > version. -Hazelcast > > >
