Re: [DISCUSS] Should RexSimplify retain the order of the predicates or not

Michael Mior Fri, 06 Mar 2020 04:27:44 -0800

In your example with random(), I would expect it to execute twice because
random is not deterministic.
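
To make the evaluation-count point concrete, here is a minimal Java sketch, not Calcite code. It compares a comparison that calls random() twice, such as `random() * 2 < random() * 3`, with a let-style binding that calls it once. The class and method names (SingleAssignmentSketch, counting, inline, letBound) are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.DoubleSupplier;

public class SingleAssignmentSketch {
  /** Wraps a supplier so that we can count how many times it is invoked. */
  static DoubleSupplier counting(DoubleSupplier inner, AtomicInteger calls) {
    return () -> {
      calls.incrementAndGet();
      return inner.getAsDouble();
    };
  }

  /** Evaluates "random() * 2 < random() * 3" literally: two independent calls. */
  static boolean inline(DoubleSupplier random) {
    return random.getAsDouble() * 2 < random.getAsDouble() * 3;
  }

  /** Evaluates "let r = random() in r * 2 < r * 3": one call, value reused. */
  static boolean letBound(DoubleSupplier random) {
    final double r = random.getAsDouble();
    return r * 2 < r * 3;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    DoubleSupplier random = counting(Math::random, calls);

    inline(random);
    System.out.println("inline calls: " + calls.getAndSet(0));

    letBound(random);
    System.out.println("let-bound calls: " + calls.get());
  }
}
```

The inline form reports two calls and the let-bound form reports one. That is the distinction a single-assignment construct in Rex would have to pin down.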


On Thu, Mar 5, 2020, 13:15 Julian Hyde <[email protected]> wrote:

> Rex is not SQL. The SQL standard does not have a say in what is valid Rex.
> (Clearly we have to comply with the SQL standard, but we change the Rex
> that we generate for this.)
>
> We need some guarantee of ordering, for example the case you cite, where
> we check whether a reference is null before we reference it. AND may or may
> not be the operator that guarantees ordering. CASE retains ordering
> (meaning that you have to evaluate the condition before the branch) but I
> am not convinced that we need to keep ordering in AND and OR.
>
> Whether RexSimplify should aggressively canonicalize the order or preserve
> order at all costs are different topics. I am actually against both.
>
> Related to the ordering question is the question of the number of
> evaluations. If I write ‘random() * 2 < random() * 3’, is it guaranteed to
> execute ‘random()’ at most once? Precisely once? I think the Rex language
> could use a concept like single assignment, like ‘let r = random() in (r *
> 2, r * 3) end’, which ensures that ‘random()’ is called exactly once and is
> called before the expressions ‘r * 2’ and ‘r * 3’ are evaluated.
>
> This week, via a twitter exchange with Torsten Grust [1] I came across the
> paper “SSA is Functional Programming (Appel, 1998)” [2]. I could see Rex
> evolving towards SSA.
>
> Julian
>
> [1] https://twitter.com/Teggy/status/1234935448310603777
>
> [2] https://www.cs.princeton.edu/~appel/papers/ssafun.pdf
>
> > On Mar 5, 2020, at 3:14 AM, Chunwei Lei <[email protected]> wrote:
> >
> > Currently, RexSimplify decomposes and recomposes AND expressions, which
> > in particular moves IS NOT NULL and NOT predicates to the end of the AND
> > expression [1][2]. For instance, `$1 is not null and $2=1` will be changed
> > to `$2=1 and $1 is not null` after being simplified.
> >
> > I know it is not a bug, because the SQL standard [3] does not say that
> > expression order must be retained. But I am wondering whether we can
> > improve things a little by letting users decide whether RexSimplify
> > retains the order of the predicates. In my humble opinion, changing the
> > order of these predicates has two disadvantages:
> >
> > 1) It might break some queries, especially those that contain UDFs.
> > Assume we have a UDF called udf1 that throws an exception when given a
> > NULL operand. The query `select a is not null and udf1(a);` runs
> > successfully because of short-circuiting, but after simplification it
> > fails because `udf1(a)` is executed before `a is not null`.
> >
> > 2) It might bring extra overhead.
> > For instance, for `a is not null and heavy_udf(a)!='1'`, if we change the
> > order, `heavy_udf(a)` will be executed even when a is null.
> >
> > There has been some discussion of this topic [4]. Unfortunately, we did
> > not reach a consensus.
> > What do you think? I would appreciate your feedback.
> >
> >
> > [1] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1529
> > [2] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1552
> > [3] https://standards.iso.org/ittf/PubliclyAvailableStandards/c053681_ISO_IEC_9075-1_2011.zip
> > [4] https://issues.apache.org/jira/browse/CALCITE-3746
>
>
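
For reference, the short-circuit concern in Chunwei's first point can be reproduced with a small Java sketch. This is not Calcite code. It assumes an evaluator that walks the conjuncts left to right and stops at the first false, and it uses plain two-valued booleans rather than SQL's three-valued logic. PredicateOrderSketch and evalAnd are made-up names, and udf1 here is only a stand-in for the UDF described in the thread.

```java
import java.util.List;
import java.util.function.Predicate;

public class PredicateOrderSketch {
  /** Stand-in for the thread's udf1: throws when given a NULL operand. */
  static boolean udf1(String a) {
    if (a == null) {
      throw new IllegalArgumentException("udf1 does not accept NULL");
    }
    return !a.isEmpty();
  }

  /** Evaluates a conjunction left to right, stopping at the first false. */
  static <T> boolean evalAnd(List<Predicate<T>> conjuncts, T row) {
    for (Predicate<T> p : conjuncts) {
      if (!p.test(row)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Predicate<String> isNotNull = a -> a != null;
    Predicate<String> callsUdf = PredicateOrderSketch::udf1;

    // Original order: the null check guards the UDF call.
    System.out.println(evalAnd(List.of(isNotNull, callsUdf), (String) null));

    // Reordered: the UDF runs first and throws on the NULL input.
    try {
      evalAnd(List.of(callsUdf, isNotNull), (String) null);
    } catch (IllegalArgumentException e) {
      System.out.println("failed: " + e.getMessage());
    }
  }
}
```

With the null check first the conjunction returns false without ever calling udf1. With the order swapped the same input raises the exception. The same ordering also decides whether an expensive UDF is invoked at all, which is the overhead concern in the second point.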
