Hi, all,

I think the rule configuration is not convenient for optional operand, such
as a match operand tree
              Project
                  |
              Union
              /        \
            Filter    Filter
If I want to compose a rule that the Project operand is optional, I must
create two configuration and new two rule instance, if we can make the
operand optional, I think I just need one rule and configuration.


Viliam Durina <[email protected]> 于2022年4月14日周四 14:55写道:

> I can say for myself that though the migration was a nuisance, after we
> figured out what to do, it was straightforward and the current approach
> with Immutables is more readable and easier to write than the previous with
> `operand` and `operandJ`.
>
> Viliam
>
> On Wed, 13 Apr 2022 at 19:23, Julian Hyde <[email protected]> wrote:
>
> > 1. I agree with Jacques. Thanks, Vladimir, for asking the question ‘is
> > progress really progress?’. We may not back out these changes (probably
> > won’t) but it will inform future discussions. It’s valid to say ‘when
> did X
> > as part of the rule refactoring, that didn’t deliver on promises’.
> >
> > 2. Thanks, Thomas, for sending the link to the original discussion. It is
> > still my position that many more people will use existing rules than
> write
> > their own.
> >
> > 3. We have effectively created a DSL [1] (especially for operand patterns
> > [2]) embedded in Java. Java 8 is not a great language for writing DSLs:
> > Kotlin, Scala, Lisp and even later versions of Java are superior. But a
> > design option we should consider is a DSL with its own syntax and parser.
> > CockroachDB’s OptGen [3] is an example of this, and there is much to
> like.
> >
> > Compared to DSLs embedded in other languages, languages with their own
> > parser are more intuitive for users, and give better error messages. It
> is
> > also possible to support multiple versions simultaneously.
> >
> > Julian
> >
> > [1] https://en.wikipedia.org/wiki/Domain-specific_language <
> > https://en.wikipedia.org/wiki/Domain-specific_language>
> >
> > [2] https://issues.apache.org/jira/browse/CALCITE-3923 <
> > https://issues.apache.org/jira/browse/CALCITE-3923>
> >
> > [3]
> >
> https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optgen/lang/doc.go
> > <
> >
> https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optgen/lang/doc.go
> >
> >
> >
> >
> > > On Apr 13, 2022, at 9:56 AM, Jacques Nadeau <[email protected]>
> wrote:
> > >
> > > Vladimir,
> > >
> > > It's always good to ask these kinds of questions. e.g. "Is progress
> > really
> > > progress?"
> > >
> > > I think moving to a configuration object was a substantial improvement.
> > > Until Java has default parameters and named parameters, I don't know of
> > > another good way to make rules sufficiently configurable for a wide
> range
> > > of users.
> > >
> > > To me, one of the main benefits of the new configuration objects is a
> > much
> > > less frequent need to subclass a rule. Back when we built original
> Drill
> > > and Dremio code, we were constantly pained by the need to control more
> > > pieces of a rule than were exposed by public constructors. When trying
> to
> > > introduce those options, we had to both: (1) update calcite and wait
> for
> > a
> > > release cycle and (2) create another constructor overload, sometimes
> > > leading to a large number of overloads. The config object almost
> entirely
> > > eliminated this challenge.
> > >
> > > I do agree that having config objects is more complex for the casual
> user
> > > so in some ways we're adding advanced functionality at the potential
> cost
> > > of usability for new users. That's not something great. I'm not sure
> it's
> > > really bad but it is something to be aware of. Having default configs
> for
> > > rules seems to make this a rather small additional challenge.
> > >
> > > For your comments on downstream projects and the examples you give: I
> > think
> > > that sounds more like advanced usage. My experience is somewhat
> different
> > > from yours: these changes have been net positive on the stuff I've been
> > > working on.
> > >
> > > Two last thoughts wrt downstream projects:
> > >
> > >   - I think it is important to separate how to create config objects
> from
> > >   the way that Calcite does it. We did change from proxies to
> immutables
> > >   within Caclite but neither has ever been the required way to
> implement
> > >   config objects. Config objects are always defined as plain java
> > interfaces.
> > >   If a particular subproject wants to expose configuration in a
> > completely
> > >   different way, I don't know of any constraint the rules interface
> > forces.
> > >   - A nice thing that was done with the initial implementation is now
> of
> > >   the configuration concepts are exposed on the RelOptRule abstract
> > class. A
> > >   user must opt in to RelRule for them to have to worry about any of
> > these
> > >   concerns.
> > >
> > >
> > > In working with the config concepts and extensibility, I do think a
> > couple
> > > of things are relatively painful. I don't
> > >
> > >   1. The definition of configs as interfaces inside the rules causes
> some
> > >   issues with static initialization order. I think Julian even
> mentioned
> > some
> > >   of this in one of the original docs config docs.
> > >   2. The generics of RelRule specifically can be challenging when
> > extended
> > >   rules in some cases due to how generics and inheritance work.
> > >   3. The class definition of the config interfaces are not
> > >   self-referencing (one of my shelved reworks actually modified to
> > >   support this) meaning that there are a lot of ugly (my subjective
> > opinion)
> > >   uses of Config.as() in the code. This last one could be resolved by
> > >   exposing the concrete builders that Immutables generates but for at
> > least
> > >   the initial changes, I wanted to avoid exposing these as new public
> > >   interfaces (and thus all the immutables classes are marked package
> > private).
> > >   4. Merging construction and use feels like an anti-pattern (my
> > >   subjective opinion). The most common patterns I've seen treat a
> builder
> > >   separate from an immutable constructed object (and have something
> akin
> > to a
> > >   toBuilder() method to take an immutable config back to a builder). In
> > the
> > >   Calcite config, these two concepts are merged. In some ways it makes
> > things
> > >   simpler for trivial cases. However,  it is less formal and causes
> pain
> > when
> > >   you have required properties that have no defaults since there is no
> > formal
> > >   model for "I'm done building, now check everything is complete". This
> > means
> > >   in several places we have to put default values that are actually
> > invalid
> > >   rather than just rely on a builder's build validation step.
> > >
> > > One other note, I think if someone is working in Java 14+ (records) or
> > > Kotlin, there are also several easy ways to produce config impls that
> are
> > > easy to use (without proxies and/or immutables).
> > >
> > >
> > >
> > >
> > > On Tue, Apr 12, 2022 at 9:29 AM Thomas Rebele
> <[email protected]
> > >
> > > wrote:
> > >
> > >> Hello,
> > >>
> > >> The reasons for the planner rules configuration can be found here:
> > >> CALCITE-3923 <https://issues.apache.org/jira/browse/CALCITE-3923>.
> See
> > >> also
> > >> the email thread [DISCUSS] Refactor how planner rules are
> parameterized
> > >> <
> > >>
> >
> https://lists.apache.org/thread.html/rfdf6f9b7821988bdd92b0377e3d293443a6376f4773c4c658c891cf9%40%3Cdev.calcite.apache.org%3E
> > >>>
> > >> .
> > >>
> > >> Cordialement / Best Regards,
> > >> *Thomas Rebele, PhD* | R&D Developer | Germany | www.tibco.com
> > >>
> > >>
> > >> On Tue, Apr 12, 2022 at 6:10 PM Gavin Ray <[email protected]>
> > wrote:
> > >>
> > >>> I don't have any weight behind my opinion or experience,
> > >>> but anything that lowers the barrier to entry to Calcite for
> newcomers
> > >> is a
> > >>> huge win in my mind.
> > >>>
> > >>> I assume the reason for the changes was because codegen improved
> > >>> performance?
> > >>>
> > >>> Could it make sense to allow both options, the easy/less-performant
> way
> > >> for
> > >>> people who want to experiment and learn the ropes,
> > >>> and the codegen path for productionizing the final rules you come up
> > >> with?
> > >>>
> > >>> Or does this make matters worse, trying to support two API's
> > >>>
> > >>> On Tue, Apr 12, 2022 at 6:25 AM Vladimir Ozerov <[email protected]>
> > >>> wrote:
> > >>>
> > >>>> Hi folks,
> > >>>>
> > >>>> Rules are an essential part of the Calcite-based query optimizers. A
> > >>>> typical optimizer may require dozens of custom rules that are
> created
> > >> by
> > >>>> extending some Apache Calcite interfaces.
> > >>>>
> > >>>> During the last two years, there were two major revisions of how
> rules
> > >>> are
> > >>>> created:
> > >>>>
> > >>>>   1. In early 1.2x versions, the typical approach was to use
> > >>>>   RelOptRuleOperand with a set of helper methods in a builder-like
> > >>>>   pattern.
> > >>>>   2. Then, we switched to the runtime code generation.
> > >>>>   3. Finally, we switched to the compile-time code generation with
> the
> > >>>>   Immutables framework.
> > >>>>
> > >>>> Every such change requires the downstream projects to rewrite all
> > their
> > >>>> rules. Not only does this require time to understand the new
> approach,
> > >>> but
> > >>>> it may also compromise the correctness of the downstream optimizer
> > >>> because
> > >>>> the regression tracking in query optimizers is not trivial.
> > >>>>
> > >>>> I had the privilege to try all three approaches, and I cannot get
> rid
> > >> of
> > >>>> the feeling that every new approach is more complicated than the
> > >> previous
> > >>>> one. I understand that this is a highly subjective statement, but
> when
> > >> I
> > >>>> just started using Apache Calcite and knew very little about it, I
> was
> > >>> able
> > >>>> to write rule patterns by simply looking at the IDE JavaDoc pop-ups
> > and
> > >>>> code completion. When the RuleConfig was introduced, every new rule
> > >>> always
> > >>>> required me to look at some other rule as an example, yet it was
> > >> doable.
> > >>>> Now we also need to configure the project build system to write a
> > >> single
> > >>>> custom rule.
> > >>>>
> > >>>> At the same time, a significant fraction of the rules are pretty
> > >> simple.
> > >>>> E.g., "operator A on top of operator B". If some additional
> > >> configuration
> > >>>> is required, it could be added via plain rules fields, because at
> the
> > >> end
> > >>>> of the day the rule instance is not more than a plain Java object.
> > >>>>
> > >>>> A good example is the FilterProjectTransposeRule. What now takes
> tens
> > >> of
> > >>>> lines of code in the Config subclass [1] (that you hardly could
> write
> > >>>> without a reference example), and ~500 LOC in the generated code
> that
> > >> you
> > >>>> get through additional plugin configuration [2] in your build
> system,
> > >>> could
> > >>>> have been expressed in a dozen lines of code [3] in Apache Calcite
> > >>> 1.22.0.
> > >>>>
> > >>>> My question is - are we sure we are going in the right direction in
> > >> terms
> > >>>> of complexity and the entry bar for the newcomers? Wouldn't it be
> > >> better
> > >>> to
> > >>>> follow the 80/20 rule, when simple rules could be easily created
> > >>>> programmatically with no external dependencies, while more advanced
> > >>>> facilities like Immutables are used only for the complex rules?
> > >>>>
> > >>>> Regards,
> > >>>> Vladimir.
> > >>>>
> > >>>> [1]
> > >>>>
> > >>>>
> > >>>
> > >>
> >
> https://github.com/apache/calcite/blob/calcite-1.30.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L208-L260
> > >>>> [2]
> > >>>>
> > >>>>
> > >>>
> > >>
> >
> https://github.com/apache/calcite/blob/calcite-1.30.0/core/build.gradle.kts#L215-L224
> > >>>> [3]
> > >>>>
> > >>>>
> > >>>
> > >>
> >
> https://github.com/apache/calcite/blob/calcite-1.22.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L99-L110
> > >>>>
> > >>>
> > >>
> >
> >
>
> --
> This message contains confidential information and is intended only for
> the
> individuals named. If you are not the named addressee you should not
> disseminate, distribute or copy this e-mail. Please notify the sender
> immediately by e-mail if you have received this e-mail by mistake and
> delete this e-mail from your system. E-mail transmission cannot be
> guaranteed to be secure or error-free as information could be intercepted,
> corrupted, lost, destroyed, arrive late or incomplete, or contain viruses.
> The sender therefore does not accept liability for any errors or omissions
> in the contents of this message, which arise as a result of e-mail
> transmission. If verification is required, please request a hard-copy
> version. -Hazelcast
>

Reply via email to