Sorry, I didn’t explain the example very well. I should have said “assume a random() function that returns the same value each time it is called on a particular row”. I believe that NEXT_VALUE (of a sequence) has that behavior.
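
To make that assumption concrete, here is a minimal sketch in Python. It is not Calcite code, and `RowStableRandom`, `pair_without_let` and `pair_with_let` are names invented purely for illustration. It models a random() whose value is fixed per row, and compares calling it twice in one expression with the single-assignment form ‘let r = random() in (r * 2, r * 3) end’.

```python
import random
from typing import Dict, Hashable, Tuple


class RowStableRandom:
    """Hypothetical random() that returns one fixed value per row."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self._per_row: Dict[Hashable, float] = {}
        self.draws = 0  # number of times the underlying generator was used

    def __call__(self, row_id: Hashable) -> float:
        # The first call for a row draws a value; later calls reuse it.
        if row_id not in self._per_row:
            self._per_row[row_id] = self._rng.random()
            self.draws += 1
        return self._per_row[row_id]


def pair_without_let(rnd: RowStableRandom, row_id: Hashable) -> Tuple[float, float]:
    # Calls rnd twice, like writing random() twice in one expression.
    return (rnd(row_id) * 2, rnd(row_id) * 3)


def pair_with_let(rnd: RowStableRandom, row_id: Hashable) -> Tuple[float, float]:
    # Single assignment: 'let r = random() in (r * 2, r * 3) end'.
    r = rnd(row_id)
    return (r * 2, r * 3)
```

Under this assumption the two forms give the same pair for a given row, and the underlying generator runs once per row however many times the call appears. With a random() that draws a fresh value on every call, only the single-assignment form would guarantee that.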
> On Mar 6, 2020, at 4:27 AM, Michael Mior <[email protected]> wrote:
>
> In your example with random(), I would expect it to execute twice because
> random is not deterministic.
>
> On Thu, Mar 5, 2020, 13:15 Julian Hyde <[email protected]> wrote:
>
>> Rex is not SQL. The SQL standard does not have a say in what is valid Rex.
>> (Clearly we have to comply with the SQL standard, but we change the Rex
>> that we generate for this.)
>>
>> We need some guarantee of ordering, for example the case you cite, where
>> we check whether a reference is null before we reference it. AND may or may
>> not be the operator that guarantees ordering. CASE retains ordering
>> (meaning that you have to evaluate the condition before the branch) but I
>> am not convinced that we need to keep ordering in AND and OR.
>>
>> Whether RexSimplify should aggressively canonize the order or preserve
>> order at all costs are different topics. I am actually against both.
>>
>> Related to the ordering question is the question of the number of
>> evaluations. If I write ‘random() * 2 < random() < 3’, is it guaranteed to
>> execute ‘random()’ at most once? Precisely once? I think the Rex language
>> could use a concept like single assignment, like
>> ‘let r = random() in (r * 2, r * 3) end’, which ensures that ‘random()’ is
>> called exactly once and is called before the expressions ‘r * 2’ and
>> ‘r * 3’ are evaluated.
>>
>> This week, via a twitter exchange with Torsten Grust [1] I came across the
>> paper “SSA is Functional Programming (Appel, 1998)” [2]. I could see Rex
>> evolving towards SSA.
>>
>> Julian
>>
>> [1] https://twitter.com/Teggy/status/1234935448310603777
>>
>> [2] https://www.cs.princeton.edu/~appel/papers/ssafun.pdf
>>
>>> On Mar 5, 2020, at 3:14 AM, Chunwei Lei <[email protected]> wrote:
>>>
>>> Currently, RexSimplify would decompose and compose the AND expression,
>>> which particularly puts IS NOT NULL and NOT predicates at the end of the
>>> AND expression[1][2]. For instance,
>>> `$1 is not null and $2=1` will be changed to `$2=1 and $1 is not null`
>>> after being simplified.
>>>
>>> I know it is not a bug because the SQL standard[3] does not say the
>>> expression order should be retained.
>>> But I am wondering whether we can improve a little bit, namely, users can
>>> decide whether RexSimplify retains the order of the predicates or not.
>>> Because in my humble opinion, changing the order of these predicates
>>> might lead to two disadvantages:
>>>
>>> 1) it might break some queries, especially those which contain udf.
>>> Assuming we have a udf called udf1 which throws an exception when meeting
>>> NULL operand.
>>> For query `select a is not null and udf1(a);`, it can run successfully
>>> because of short-circuiting.
>>> But after being simplified it will fail because `udf(a)` is executed
>>> before 'a is not null'.
>>>
>>> 2) it might bring extra overhead.
>>> For instance, for `a is not null and heavey_udf(a)!='1'`, if we change the
>>> order, `heavey_udf(a)` will be executed even when a is null which might
>>> lead to extra overhead.
>>>
>>> There are some discussions about this topic[4]. Unfortunately, we do not
>>> reach a consensus.
>>> What do you think about it? Would appreciate your feedback.
>>>
>>>
>>> [1] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1529
>>> [2] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1552
>>> [3] https://standards.iso.org/ittf/PubliclyAvailableStandards/c053681_ISO_IEC_9075-1_2011.zip
>>> [4] https://issues.apache.org/jira/browse/CALCITE-3746
