Thanks for sharing your thoughts, Julian and Michael.

Now I understand more about the design.


Best,
Chunwei


On Sat, Mar 7, 2020 at 1:51 AM Julian Hyde <[email protected]> wrote:

> Sorry, I didn’t explain the example very well. Should have said “assume a
> random() function that returns the same value each time it is called on a
> particular row”. I believe that NEXT_VALUE (of a sequence) has that
> behavior.
>
> > On Mar 6, 2020, at 4:27 AM, Michael Mior <[email protected]> wrote:
> >
> > In your example with random(), I would expect it to execute twice because
> > random is not deterministic.
> >
> > On Thu, Mar 5, 2020, 13:15 Julian Hyde <[email protected]> wrote:
> >
> >> Rex is not SQL. The SQL standard does not have a say in what is valid
> >> Rex. (Clearly we have to comply with the SQL standard, but we change the
> >> Rex that we generate for this.)
> >>
> >> We need some guarantee of ordering, for example the case you cite, where
> >> we check whether a reference is null before we reference it. AND may or
> >> may not be the operator that guarantees ordering. CASE retains ordering
> >> (meaning that you have to evaluate the condition before the branch) but
> >> I am not convinced that we need to keep ordering in AND and OR.
> >>
> >> Whether RexSimplify should aggressively canonicalize the order or preserve
> >> order at all costs are different topics. I am actually against both.
> >>
> >> Related to the ordering question is the question of the number of
> >> evaluations. If I write ‘random() * 2 < random() < 3’, is it guaranteed
> >> to execute ‘random()’ at most once? Precisely once? I think the Rex
> >> language could use a concept like single assignment, like
> >> ‘let r = random() in (r * 2, r * 3) end’, which ensures that ‘random()’
> >> is called exactly once and is called before the expressions ‘r * 2’ and
> >> ‘r * 3’ are evaluated.
> >>
> >> This week, via a Twitter exchange with Torsten Grust [1] I came across
> >> the paper “SSA is Functional Programming (Appel, 1998)” [2]. I could see
> >> Rex evolving towards SSA.
> >>
> >> Julian
> >>
> >> [1] https://twitter.com/Teggy/status/1234935448310603777
> >>
> >> [2] https://www.cs.princeton.edu/~appel/papers/ssafun.pdf
> >>
> >>> On Mar 5, 2020, at 3:14 AM, Chunwei Lei <[email protected]> wrote:
> >>>
> >>> Currently, RexSimplify decomposes and recomposes AND expressions, and
> >>> in particular moves IS NOT NULL and NOT predicates to the end of the
> >>> AND expression [1][2]. For instance, `$1 is not null and $2=1` becomes
> >>> `$2=1 and $1 is not null` after simplification.
> >>>
> >>> I know it is not a bug, because the SQL standard [3] does not say that
> >>> the order of expressions must be retained. But I am wondering whether
> >>> we can improve this a little, namely by letting users decide whether
> >>> RexSimplify retains the order of the predicates. In my humble opinion,
> >>> changing the order of these predicates has two potential disadvantages:
> >>>
> >>> 1) It might break some queries, especially those that contain UDFs.
> >>> Assume we have a UDF called udf1 that throws an exception when given a
> >>> NULL operand. The query `select a is not null and udf1(a);` runs
> >>> successfully because of short-circuiting. But after simplification it
> >>> fails, because `udf1(a)` is executed before `a is not null`.
> >>>
> >>> 2) It might add overhead. For instance, for
> >>> `a is not null and heavy_udf(a)!='1'`, if we change the order,
> >>> `heavy_udf(a)` is executed even when a is null, which is unnecessary
> >>> work.
> >>>
> >>> There has been some discussion of this topic [4]. Unfortunately, we
> >>> did not reach a consensus. What do you think? I would appreciate your
> >>> feedback.
> >>>
> >>>
> >>> [1] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1529
> >>> [2] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1552
> >>> [3] https://standards.iso.org/ittf/PubliclyAvailableStandards/c053681_ISO_IEC_9075-1_2011.zip
> >>> [4] https://issues.apache.org/jira/browse/CALCITE-3746
>
>
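
The first concern in the quoted thread (a NULL guard moved behind a UDF call) can be illustrated with a small sketch. This is plain Python, not Calcite code: `udf1` mirrors the hypothetical UDF from the thread, `eval_and` is an assumed left-to-right, short-circuiting evaluator, and SQL's three-valued logic is simplified to True/False with None standing in for NULL.

```python
def udf1(a):
    """Hypothetical UDF from the thread: rejects NULL (None) operands."""
    if a is None:
        raise ValueError("udf1 does not accept NULL")
    return a > 0


def is_not_null(a):
    """Stand-in for `a IS NOT NULL`."""
    return a is not None


def eval_and(predicates, value):
    """Evaluate predicates left to right and stop at the first False.

    This models an engine that short-circuits AND in the written order,
    which is an assumption of this sketch, not a guarantee of any engine.
    """
    for predicate in predicates:
        if not predicate(value):
            return False
    return True


original = [is_not_null, udf1]   # a IS NOT NULL AND udf1(a)
reordered = [udf1, is_not_null]  # udf1(a) AND a IS NOT NULL

# With the guard first, udf1 is never called for a NULL input.
print(eval_and(original, None))

# With the guard moved to the end, udf1 sees the NULL and raises.
try:
    eval_and(reordered, None)
except ValueError as err:
    print("reordered form failed:", err)
```

Both orderings agree on non-NULL inputs; they differ only in whether the guard runs before the call it was meant to protect.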

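
Julian's point about the number of evaluations, and the ‘let r = random() in (r * 2, r * 3) end’ idea, can be sketched in the same way. Again this is plain Python under stated assumptions, not a Rex feature: `CountingRandom`, `inline_pair` and `let_bound_pair` are names invented for this illustration.

```python
import random


class CountingRandom:
    """Wraps a seeded generator and counts how often it is called."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self._rng.random()


def inline_pair(rnd):
    """(random() * 2, random() * 3) written inline: two separate calls."""
    return (rnd() * 2, rnd() * 3)


def let_bound_pair(rnd):
    """let r = random() in (r * 2, r * 3) end: one call, bound before use."""
    r = rnd()
    return (r * 2, r * 3)


rnd = CountingRandom(seed=42)
inline_pair(rnd)
print("inline calls:", rnd.calls)

rnd = CountingRandom(seed=42)
let_bound_pair(rnd)
print("let-bound calls:", rnd.calls)
```

The binding makes both the call count and the evaluation order explicit, which is the property the single-assignment form is meant to provide.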