My opinion doesn't carry much weight, and I don't have much experience here,
but anything that lowers the barrier to entry to Calcite for newcomers is a
huge win in my mind.

I assume the changes were made because codegen improved performance?

Could it make sense to allow both options: the easy, less-performant way for
people who want to experiment and learn the ropes, and the codegen path for
productionizing the final rules you come up with?

Or would that make matters worse by having to support two APIs?

On Tue, Apr 12, 2022 at 6:25 AM Vladimir Ozerov <[email protected]> wrote:

> Hi folks,
>
> Rules are an essential part of the Calcite-based query optimizers. A
> typical optimizer may require dozens of custom rules that are created by
> extending some Apache Calcite interfaces.
>
> During the last two years, there were two major revisions of how rules are
> created:
>
>    1. In early 1.2x versions, the typical approach was to use
>    RelOptRuleOperand with a set of helper methods in a builder-like
>    pattern.
>    2. Then, we switched to the runtime code generation.
>    3. Finally, we switched to the compile-time code generation with the
>    Immutables framework.
>
> Every such change requires the downstream projects to rewrite all their
> rules. Not only does this require time to understand the new approach, but
> it may also compromise the correctness of the downstream optimizer because
> the regression tracking in query optimizers is not trivial.
>
> I had the privilege to try all three approaches, and I cannot get rid of
> the feeling that every new approach is more complicated than the previous
> one. I understand that this is a highly subjective statement, but when I
> just started using Apache Calcite and knew very little about it, I was able
> to write rule patterns by simply looking at the IDE JavaDoc pop-ups and
> code completion. When the RuleConfig was introduced, every new rule always
> required me to look at some other rule as an example, yet it was doable.
> Now we also need to configure the project build system to write a single
> custom rule.
>
> At the same time, a significant fraction of the rules are pretty simple.
> E.g., "operator A on top of operator B". If some additional configuration
> is required, it could be added via plain rule fields, because at the end
> of the day the rule instance is no more than a plain Java object.
>
> A good example is the FilterProjectTransposeRule. What now takes tens of
> lines of code in the Config subclass [1] (that you could hardly write
> without a reference example), and ~500 LOC in the generated code that you
> get through additional plugin configuration [2] in your build system, could
> have been expressed in a dozen lines of code [3] in Apache Calcite 1.22.0.
>
> My question is - are we sure we are going in the right direction in terms
> of complexity and the entry bar for the newcomers? Wouldn't it be better to
> follow the 80/20 rule, when simple rules could be easily created
> programmatically with no external dependencies, while more advanced
> facilities like Immutables are used only for the complex rules?
>
> Regards,
> Vladimir.
>
> [1]
>
> https://github.com/apache/calcite/blob/calcite-1.30.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L208-L260
> [2]
>
> https://github.com/apache/calcite/blob/calcite-1.30.0/core/build.gradle.kts#L215-L224
> [3]
>
> https://github.com/apache/calcite/blob/calcite-1.22.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L99-L110
>
