Yanjing,

That is a valid point. The API could perhaps allow people to define rules
with nodes that are optional. Can you please log a JIRA for this? Include
the code that you would like to write in order to define a rule and handle
a match.

Julian

On Thu, Apr 14, 2022 at 2:27 AM Yanjing Wang <[email protected]>
wrote:

> Hi, all,
>
> I think the rule configuration is not convenient for optional operand, such
> as a match operand tree
>               Project
>                   |
>               Union
>               /        \
>             Filter    Filter
> If I want to compose a rule that the Project operand is optional, I must
> create two configuration and new two rule instance, if we can make the
> operand optional, I think I just need one rule and configuration.
>
>
> Viliam Durina <[email protected]> 于2022年4月14日周四 14:55写道:
>
> > I can say for myself that though the migration was a nuisance, after we
> > figured out what to do, it was straightforward and the current approach
> > with Immutables is more readable and easier to write than the previous
> with
> > `operand` and `operandJ`.
> >
> > Viliam
> >
> > On Wed, 13 Apr 2022 at 19:23, Julian Hyde <[email protected]>
> wrote:
> >
> > > 1. I agree with Jacques. Thanks, Vladimir, for asking the question ‘is
> > > progress really progress?’. We may not back out these changes (probably
> > > won’t) but it will inform future discussions. It’s valid to say ‘when
> > did X
> > > as part of the rule refactoring, that didn’t deliver on promises’.
> > >
> > > 2. Thanks, Thomas, for sending the link to the original discussion. It
> is
> > > still my position that many more people will use existing rules than
> > write
> > > their own.
> > >
> > > 3. We have effectively created a DSL [1] (especially for operand
> patterns
> > > [2]) embedded in Java. Java 8 is not a great language for writing DSLs:
> > > Kotlin, Scala, Lisp and even later versions of Java are superior. But a
> > > design option we should consider is a DSL with its own syntax and
> parser.
> > > CockroachDB’s OptGen [3] is an example of this, and there is much to
> > like.
> > >
> > > Compared to DSLs embedded in other languages, languages with their own
> > > parser are more intuitive for users, and give better error messages. It
> > is
> > > also possible to support multiple versions simultaneously.
> > >
> > > Julian
> > >
> > > [1] https://en.wikipedia.org/wiki/Domain-specific_language <
> > > https://en.wikipedia.org/wiki/Domain-specific_language>
> > >
> > > [2] https://issues.apache.org/jira/browse/CALCITE-3923 <
> > > https://issues.apache.org/jira/browse/CALCITE-3923>
> > >
> > > [3]
> > >
> >
> https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optgen/lang/doc.go
> > > <
> > >
> >
> https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optgen/lang/doc.go
> > >
> > >
> > >
> > >
> > > > On Apr 13, 2022, at 9:56 AM, Jacques Nadeau <[email protected]>
> > wrote:
> > > >
> > > > Vladimir,
> > > >
> > > > It's always good to ask these kinds of questions. e.g. "Is progress
> > > really
> > > > progress?"
> > > >
> > > > I think moving to a configuration object was a substantial
> improvement.
> > > > Until Java has default parameters and named parameters, I don't know
> of
> > > > another good way to make rules sufficiently configurable for a wide
> > range
> > > > of users.
> > > >
> > > > To me, one of the main benefits of the new configuration objects is a
> > > much
> > > > less frequent need to subclass a rule. Back when we built original
> > Drill
> > > > and Dremio code, we were constantly pained by the need to control
> more
> > > > pieces of a rule than were exposed by public constructors. When
> trying
> > to
> > > > introduce those options, we had to both: (1) update calcite and wait
> > for
> > > a
> > > > release cycle and (2) create another constructor overload, sometimes
> > > > leading to a large number of overloads. The config object almost
> > entirely
> > > > eliminated this challenge.
> > > >
> > > > I do agree that having config objects is more complex for the casual
> > user
> > > > so in some ways we're adding advanced functionality at the potential
> > cost
> > > > of usability for new users. That's not something great. I'm not sure
> > it's
> > > > really bad but it is something to be aware of. Having default configs
> > for
> > > > rules seems to make this a rather small additional challenge.
> > > >
> > > > For your comments on downstream projects and the examples you give: I
> > > think
> > > > that sounds more like advanced usage. My experience is somewhat
> > different
> > > > from yours: these changes have been net positive on the stuff I've
> been
> > > > working on.
> > > >
> > > > Two last thoughts wrt downstream projects:
> > > >
> > > >   - I think it is important to separate how to create config objects
> > from
> > > >   the way that Calcite does it. We did change from proxies to
> > immutables
> > > >   within Caclite but neither has ever been the required way to
> > implement
> > > >   config objects. Config objects are always defined as plain java
> > > interfaces.
> > > >   If a particular subproject wants to expose configuration in a
> > > completely
> > > >   different way, I don't know of any constraint the rules interface
> > > forces.
> > > >   - A nice thing that was done with the initial implementation is now
> > of
> > > >   the configuration concepts are exposed on the RelOptRule abstract
> > > class. A
> > > >   user must opt in to RelRule for them to have to worry about any of
> > > these
> > > >   concerns.
> > > >
> > > >
> > > > In working with the config concepts and extensibility, I do think a
> > > couple
> > > > of things are relatively painful. I don't
> > > >
> > > >   1. The definition of configs as interfaces inside the rules causes
> > some
> > > >   issues with static initialization order. I think Julian even
> > mentioned
> > > some
> > > >   of this in one of the original docs config docs.
> > > >   2. The generics of RelRule specifically can be challenging when
> > > extended
> > > >   rules in some cases due to how generics and inheritance work.
> > > >   3. The class definition of the config interfaces are not
> > > >   self-referencing (one of my shelved reworks actually modified to
> > > >   support this) meaning that there are a lot of ugly (my subjective
> > > opinion)
> > > >   uses of Config.as() in the code. This last one could be resolved by
> > > >   exposing the concrete builders that Immutables generates but for at
> > > least
> > > >   the initial changes, I wanted to avoid exposing these as new public
> > > >   interfaces (and thus all the immutables classes are marked package
> > > private).
> > > >   4. Merging construction and use feels like an anti-pattern (my
> > > >   subjective opinion). The most common patterns I've seen treat a
> > builder
> > > >   separate from an immutable constructed object (and have something
> > akin
> > > to a
> > > >   toBuilder() method to take an immutable config back to a builder).
> In
> > > the
> > > >   Calcite config, these two concepts are merged. In some ways it
> makes
> > > things
> > > >   simpler for trivial cases. However,  it is less formal and causes
> > pain
> > > when
> > > >   you have required properties that have no defaults since there is
> no
> > > formal
> > > >   model for "I'm done building, now check everything is complete".
> This
> > > means
> > > >   in several places we have to put default values that are actually
> > > invalid
> > > >   rather than just rely on a builder's build validation step.
> > > >
> > > > One other note, I think if someone is working in Java 14+ (records)
> or
> > > > Kotlin, there are also several easy ways to produce config impls that
> > are
> > > > easy to use (without proxies and/or immutables).
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Apr 12, 2022 at 9:29 AM Thomas Rebele
> > <[email protected]
> > > >
> > > > wrote:
> > > >
> > > >> Hello,
> > > >>
> > > >> The reasons for the planner rules configuration can be found here:
> > > >> CALCITE-3923 <https://issues.apache.org/jira/browse/CALCITE-3923>.
> > See
> > > >> also
> > > >> the email thread [DISCUSS] Refactor how planner rules are
> > parameterized
> > > >> <
> > > >>
> > >
> >
> https://lists.apache.org/thread.html/rfdf6f9b7821988bdd92b0377e3d293443a6376f4773c4c658c891cf9%40%3Cdev.calcite.apache.org%3E
> > > >>>
> > > >> .
> > > >>
> > > >> Cordialement / Best Regards,
> > > >> *Thomas Rebele, PhD* | R&D Developer | Germany | www.tibco.com
> > > >>
> > > >>
> > > >> On Tue, Apr 12, 2022 at 6:10 PM Gavin Ray <[email protected]>
> > > wrote:
> > > >>
> > > >>> I don't have any weight behind my opinion or experience,
> > > >>> but anything that lowers the barrier to entry to Calcite for
> > newcomers
> > > >> is a
> > > >>> huge win in my mind.
> > > >>>
> > > >>> I assume the reason for the changes was because codegen improved
> > > >>> performance?
> > > >>>
> > > >>> Could it make sense to allow both options, the easy/less-performant
> > way
> > > >> for
> > > >>> people who want to experiment and learn the ropes,
> > > >>> and the codegen path for productionizing the final rules you come
> up
> > > >> with?
> > > >>>
> > > >>> Or does this make matters worse, trying to support two API's
> > > >>>
> > > >>> On Tue, Apr 12, 2022 at 6:25 AM Vladimir Ozerov <
> [email protected]>
> > > >>> wrote:
> > > >>>
> > > >>>> Hi folks,
> > > >>>>
> > > >>>> Rules are an essential part of the Calcite-based query
> optimizers. A
> > > >>>> typical optimizer may require dozens of custom rules that are
> > created
> > > >> by
> > > >>>> extending some Apache Calcite interfaces.
> > > >>>>
> > > >>>> During the last two years, there were two major revisions of how
> > rules
> > > >>> are
> > > >>>> created:
> > > >>>>
> > > >>>>   1. In early 1.2x versions, the typical approach was to use
> > > >>>>   RelOptRuleOperand with a set of helper methods in a builder-like
> > > >>>>   pattern.
> > > >>>>   2. Then, we switched to the runtime code generation.
> > > >>>>   3. Finally, we switched to the compile-time code generation with
> > the
> > > >>>>   Immutables framework.
> > > >>>>
> > > >>>> Every such change requires the downstream projects to rewrite all
> > > their
> > > >>>> rules. Not only does this require time to understand the new
> > approach,
> > > >>> but
> > > >>>> it may also compromise the correctness of the downstream optimizer
> > > >>> because
> > > >>>> the regression tracking in query optimizers is not trivial.
> > > >>>>
> > > >>>> I had the privilege to try all three approaches, and I cannot get
> > rid
> > > >> of
> > > >>>> the feeling that every new approach is more complicated than the
> > > >> previous
> > > >>>> one. I understand that this is a highly subjective statement, but
> > when
> > > >> I
> > > >>>> just started using Apache Calcite and knew very little about it, I
> > was
> > > >>> able
> > > >>>> to write rule patterns by simply looking at the IDE JavaDoc
> pop-ups
> > > and
> > > >>>> code completion. When the RuleConfig was introduced, every new
> rule
> > > >>> always
> > > >>>> required me to look at some other rule as an example, yet it was
> > > >> doable.
> > > >>>> Now we also need to configure the project build system to write a
> > > >> single
> > > >>>> custom rule.
> > > >>>>
> > > >>>> At the same time, a significant fraction of the rules are pretty
> > > >> simple.
> > > >>>> E.g., "operator A on top of operator B". If some additional
> > > >> configuration
> > > >>>> is required, it could be added via plain rules fields, because at
> > the
> > > >> end
> > > >>>> of the day the rule instance is not more than a plain Java object.
> > > >>>>
> > > >>>> A good example is the FilterProjectTransposeRule. What now takes
> > tens
> > > >> of
> > > >>>> lines of code in the Config subclass [1] (that you hardly could
> > write
> > > >>>> without a reference example), and ~500 LOC in the generated code
> > that
> > > >> you
> > > >>>> get through additional plugin configuration [2] in your build
> > system,
> > > >>> could
> > > >>>> have been expressed in a dozen lines of code [3] in Apache Calcite
> > > >>> 1.22.0.
> > > >>>>
> > > >>>> My question is - are we sure we are going in the right direction
> in
> > > >> terms
> > > >>>> of complexity and the entry bar for the newcomers? Wouldn't it be
> > > >> better
> > > >>> to
> > > >>>> follow the 80/20 rule, when simple rules could be easily created
> > > >>>> programmatically with no external dependencies, while more
> advanced
> > > >>>> facilities like Immutables are used only for the complex rules?
> > > >>>>
> > > >>>> Regards,
> > > >>>> Vladimir.
> > > >>>>
> > > >>>> [1]
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://github.com/apache/calcite/blob/calcite-1.30.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L208-L260
> > > >>>> [2]
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://github.com/apache/calcite/blob/calcite-1.30.0/core/build.gradle.kts#L215-L224
> > > >>>> [3]
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://github.com/apache/calcite/blob/calcite-1.22.0/core/src/main/java/org/apache/calcite/rel/rules/FilterProjectTransposeRule.java#L99-L110
> > > >>>>
> > > >>>
> > > >>
> > >
> > >
> >
> > --
> > This message contains confidential information and is intended only for
> > the
> > individuals named. If you are not the named addressee you should not
> > disseminate, distribute or copy this e-mail. Please notify the sender
> > immediately by e-mail if you have received this e-mail by mistake and
> > delete this e-mail from your system. E-mail transmission cannot be
> > guaranteed to be secure or error-free as information could be
> intercepted,
> > corrupted, lost, destroyed, arrive late or incomplete, or contain
> viruses.
> > The sender therefore does not accept liability for any errors or
> omissions
> > in the contents of this message, which arise as a result of e-mail
> > transmission. If verification is required, please request a hard-copy
> > version. -Hazelcast
> >
>

Reply via email to