Vladimir, It's always good to ask these kinds of questions. e.g. "Is progress really progress?"
I think moving to a configuration object was a substantial improvement. Until Java has default parameters and named parameters, I don't know of another good way to make rules sufficiently configurable for a wide range of users. To me, one of the main benefits of the new configuration objects is a much less frequent need to subclass a rule. Back when we built original Drill and Dremio code, we were constantly pained by the need to control more pieces of a rule than were exposed by public constructors. When trying to introduce those options, we had to both: (1) update calcite and wait for a release cycle and (2) create another constructor overload, sometimes leading to a large number of overloads. The config object almost entirely eliminated this challenge. I do agree that having config objects is more complex for the casual user so in some ways we're adding advanced functionality at the potential cost of usability for new users. That's not something great. I'm not sure it's really bad but it is something to be aware of. Having default configs for rules seems to make this a rather small additional challenge. For your comments on downstream projects and the examples you give: I think that sounds more like advanced usage. My experience is somewhat different from yours: these changes have been net positive on the stuff I've been working on. Two last thoughts wrt downstream projects: - I think it is important to separate how to create config objects from the way that Calcite does it. We did change from proxies to immutables within Caclite but neither has ever been the required way to implement config objects. Config objects are always defined as plain java interfaces. If a particular subproject wants to expose configuration in a completely different way, I don't know of any constraint the rules interface forces. - A nice thing that was done with the initial implementation is now of the configuration concepts are exposed on the RelOptRule abstract class. A user must opt in to RelRule for them to have to worry about any of these concerns. In working with the config concepts and extensibility, I do think a couple of things are relatively painful. I don't 1. The definition of configs as interfaces inside the rules causes some issues with static initialization order. I think Julian even mentioned some of this in one of the original docs config docs. 2. The generics of RelRule specifically can be challenging when extended rules in some cases due to how generics and inheritance work. 3. The class definition of the config interfaces are not self-referencing (one of my shelved reworks actually modified to support this) meaning that there are a lot of ugly (my subjective opinion) uses of Config.as() in the code. This last one could be resolved by exposing the concrete builders that Immutables generates but for at least the initial changes, I wanted to avoid exposing these as new public interfaces (and thus all the immutables classes are marked package private). 4. Merging construction and use feels like an anti-pattern (my subjective opinion). The most common patterns I've seen treat a builder separate from an immutable constructed object (and have something akin to a toBuilder() method to take an immutable config back to a builder). In the Calcite config, these two concepts are merged. In some ways it makes things simpler for trivial cases. However, it is less formal and causes pain when you have required properties that have no defaults since there is no formal model for "I'm done building, now check everything is complete". This means in several places we have to put default values that are actually invalid rather than just rely on a builder's build validation step. One other note, I think if someone is working in Java 14+ (records) or Kotlin, there are also several easy ways to produce config impls that are easy to use (without proxies and/or immutables). On Tue, Apr 12, 2022 at 9:29 AM Thomas Rebele <[email protected]> wrote: > Hello, > > The reasons for the planner rules configuration can be found here: > CALCITE-3923 <https://issues.apache.org/jira/browse/CALCITE-3923>. See > also > the email thread [DISCUSS] Refactor how planner rules are parameterized > < > https://lists.apache.org/thread.html/rfdf6f9b7821988bdd92b0377e3d293443a6376f4773c4c658c891cf9%40%3Cdev.calcite.apache.org%3E > > > . > > Cordialement / Best Regards, > *Thomas Rebele, PhD* | R&D Developer | Germany | www.tibco.com > > > On Tue, Apr 12, 2022 at 6:10 PM Gavin Ray <[email protected]> wrote: > > > I don't have any weight behind my opinion or experience, > > but anything that lowers the barrier to entry to Calcite for newcomers > is a > > huge win in my mind. > > > > I assume the reason for the changes was because codegen improved > > performance? > > > > Could it make sense to allow both options, the easy/less-performant way > for > > people who want to experiment and learn the ropes, > > and the codegen path for productionizing the final rules you come up > with? > > > > Or does this make matters worse, trying to support two API's > > > > On Tue, Apr 12, 2022 at 6:25 AM Vladimir Ozerov <[email protected]> > > wrote: > > > > > Hi folks, > > > > > > Rules are an essential part of the Calcite-based query optimizers. A > > > typical optimizer may require dozens of custom rules that are created > by > > > extending some Apache Calcite interfaces. > > > > > > During the last two years, there were two major revisions of how rules > > are > > > created: > > > > > > 1. In early 1.2x versions, the typical approach was to use > > > RelOptRuleOperand with a set of helper methods in a builder-like > > > pattern. > > > 2. Then, we switched to the runtime code generation. > > > 3. Finally, we switched to the compile-time code generation with the > > > Immutables framework. > > > > > > Every such change requires the downstream projects to rewrite all their > > > rules. Not only does this require time to understand the new approach, > > but > > > it may also compromise the correctness of the downstream optimizer > > because > > > the regression tracking in query optimizers is not trivial. > > > > > > I had the privilege to try all three approaches, and I cannot get rid > of > > > the feeling that every new approach is more complicated than the > previous > > > one. I understand that this is a highly subjective statement, but when > I > > > just started using Apache Calcite and knew very little about it, I was > > able > > > to write rule patterns by simply looking at the IDE JavaDoc pop-ups and > > > code completion. When the RuleConfig was introduced, every new rule > > always > > > required me to look at some other rule as an example, yet it was > doable. > > > Now we also need to configure the project build system to write a > single > > > custom rule. > > > > > > At the same time, a significant fraction of the rules are pretty > simple. > > > E.g., "operator A on top of operator B". If some additional > configuration > > > is required, it could be added via plain rules fields, because at the > end > > > of the day the rule instance is not more than a plain Java object. > > > > > > A good example is the FilterProjectTransposeRule. What now takes tens > of > > > lines of code in the Config subclass [1] (that you hardly could write > > > without a reference example), and ~500 LOC in the generated code that > you > > > get through additional plugin configuration [2] in your build system, > > could > > > have been expressed in a dozen lines of code [3] in Apache Calcite > > 1.22.0. > > > > > > My question is - are we sure we are going in the right direction in > terms > > > of complexity and the entry bar for the newcomers? Wouldn't it be > better > > to > > > follow the 80/20 rule, when simple rules could be easily created > > > programmatically with no external dependencies, while more advanced > > > facilities like Immutables are used only for the complex rules? > > > > > > Regards, > > > Vladimir. > > > > > > [1] > > > > > > > > > https://github.com/apache/calcite/blob/calcite-1.30.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L208-L260 > > > [2] > > > > > > > > > https://github.com/apache/calcite/blob/calcite-1.30.0/core/build.gradle.kts#L215-L224 > > > [3] > > > > > > > > > https://github.com/apache/calcite/blob/calcite-1.22.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L99-L110 > > > > > >
