Without simplifications, many trivial RelNodes would be produced. It is 
beneficial to perform those simplifications in RelBuilder; if they were done in 
rules, the trivial RelNodes (and their equivalence sets) would still be 
created, increasing the size of the search space.
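
To illustrate the point, here is a minimal sketch of builder-side simplification. The types (Rel, Scan, Filter) and the filter method are hypothetical stand-ins made up for this example, not Calcite's classes; the only idea taken from the discussion is that a trivial node is never created in the first place, so no rule has to remove it later.

```java
// Hypothetical stand-ins for illustration only; these are not Calcite's classes.
public class BuilderSimplificationSketch {
  interface Rel {}

  static final class Scan implements Rel {
    final String table;
    Scan(String table) { this.table = table; }
  }

  static final class Filter implements Rel {
    final Rel input;
    final String condition;
    Filter(Rel input, String condition) {
      this.input = input;
      this.condition = condition;
    }
  }

  /** Builder-side simplification: a filter whose condition is TRUE is never created. */
  static Rel filter(Rel input, String condition) {
    if ("TRUE".equalsIgnoreCase(condition.trim())) {
      return input; // no trivial Filter node, so nothing extra enters the search space
    }
    return new Filter(input, condition);
  }

  public static void main(String[] args) {
    Rel scan = new Scan("EMP");
    System.out.println(filter(scan, "TRUE") == scan);            // trivial filter elided
    System.out.println(filter(scan, "x > 1") instanceof Filter); // real filter kept
  }
}
```

If the same check lived in a rule instead, the Filter on TRUE would be registered first and removed afterwards, which is exactly the extra work described above.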

I want to draw a distinction between simplification and normalization. A 
normalized form is relied upon throughout the system. Suppose, for example, that 
we always normalize ‘RexLiteral = RexInputRef’ to ‘RexInputRef = RexLiteral’. 
If a rule encountered the former (un-normalized) case, it would not be a bug if 
the rule failed with, say, a ClassCastException.
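
A minimal sketch of that contract, using hypothetical stand-in classes (Expr, Literal, InputRef, Equals) rather than Calcite's RexNode hierarchy. The method names normalize and describe are also made up for the example. The rule casts without checking because it is entitled to assume the normalized form.

```java
// Hypothetical stand-ins for RexNode, RexLiteral and RexInputRef; not Calcite's API.
public class NormalizationSketch {
  interface Expr {}

  static final class Literal implements Expr {
    final Object value;
    Literal(Object value) { this.value = value; }
  }

  static final class InputRef implements Expr {
    final int index;
    InputRef(int index) { this.index = index; }
  }

  static final class Equals implements Expr {
    final Expr left;
    final Expr right;
    Equals(Expr left, Expr right) {
      this.left = left;
      this.right = right;
    }
  }

  /** Normalization: rewrites 'literal = ref' to 'ref = literal'. */
  static Equals normalize(Equals e) {
    if (e.left instanceof Literal && e.right instanceof InputRef) {
      return new Equals(e.right, e.left);
    }
    return e;
  }

  /** A rule that relies on the normalized form; the casts are deliberately unchecked. */
  static String describe(Equals e) {
    InputRef ref = (InputRef) e.left;     // ClassCastException on un-normalized input
    Literal literal = (Literal) e.right;
    return "$" + ref.index + " = " + literal.value;
  }

  public static void main(String[] args) {
    Equals raw = new Equals(new Literal(10), new InputRef(2));
    System.out.println(describe(normalize(raw)));
    try {
      describe(raw);
    } catch (ClassCastException ex) {
      System.out.println("rule failed on un-normalized input, which is not a bug");
    }
  }
}
```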

So, I disagree with Vladimir that ‘RexSimplify may also be considered a 
“normalization”’. Simplification is optional: if it is turned off, each rule 
must still be able to deal with the unsimplified expressions.

Also, the very idea of normalizations being optional, enabled by system 
properties or other config, is rather disturbing, because the rules probably 
don’t know that the normalization has been turned off.

The only place for normalization, in my opinion, is explicitly, in a particular 
planner phase. For example, pulling up all filters before attempting to match 
materialized views.

Julian

> On Mar 11, 2021, at 10:37 AM, Vladimir Ozerov <[email protected]> wrote:
> 
> In our practice, we also had some problems with normalization. First, we
> observed problems with the unwanted (and sometimes incorrect)
> simplification of expressions with CASTs and literals which came
> from RexSimplify. I couldn't find an easy way to disable that behavior.
> Note that RexSimplify may also be considered a "normalization". Second, we
> implemented a way to avoid Project when doing join reordering but had some
> issues with operator signatures due to lack of automatic normalization for
> expressions for permuted inputs. These two cases demonstrate two opposite
> views: sometimes you want a specific normalization to happen automatically,
> but sometimes you want to disable it.
> 
> Perhaps an alternative approach could be to unify all simplification and
> normalization logic and split it into configurable rules. Then, we may add
> these rules as a separate rule set to the planner, which would be invoked
> heuristically every time an operator with expressions is registered in
> MEMO. In this case, a user would not need to bother about RexNode
> constructors. To clarify, by "rules" I do not mean heavy-weight rules
> similar to normal rules. Instead, it might be simple pattern+method pairs,
> that could even be compiled into a static program using Janino. This
> approach could be very flexible and convenient: a single place in the code
> where all rewrite happens, complete control of the optimization rules,
> modular rules instead of monolithic code (like in RexSimplify). The obvious
> downside is that it would require more time to implement than the other
> proposed approaches.
> 
> What do you think about that?
> 
> Regards,
> Vladimir.
> 
> On Thu, Mar 11, 2021 at 13:33, Vladimir Sitnikov <[email protected]> wrote:
> 
>> Stamatis>just the option to use it or not in a more friendly way
>> Stamatis>than a system property.
>> 
>> As far as I remember, the key issue here is that new RexBuilder(...) is
>> quite a common pattern,
>> and what you suggest looks like "everyone would have to provide an extra
>> argument when creating RexBuilder".
>> 
>> On top of that, there are use cases like "new RexCall(...)" in the static
>> context (see org.apache.calcite.rex.RexUtil#not).
>> 
>> Making the uses customizable adds significant overhead with doubtful gains.
>> 
>> I have not explored the route though, so there might be solutions.
>> For instance, it might work if we have an in-core dependency injection that
>> would hide the complexity
>> when coding :core, however, I don't think we could expose DI to Calcite
>> users.
>> 
>> Vladimir
>> 
