Thanks for sharing your thoughts, Julian and Michael. Now I understand more about the design.
Best,
Chunwei

On Sat, Mar 7, 2020 at 1:51 AM Julian Hyde <[email protected]> wrote:

> Sorry, I didn’t explain the example very well. Should have said “assume a
> random() function that returns the same value each time it is called on a
> particular row”. I believe that NEXT_VALUE (of a sequence) has that
> behavior.
>
> > On Mar 6, 2020, at 4:27 AM, Michael Mior <[email protected]> wrote:
> >
> > In your example with random(), I would expect it to execute twice because
> > random is not deterministic.
> >
> > On Thu, Mar 5, 2020, 13:15 Julian Hyde <[email protected]> wrote:
> >
> >> Rex is not SQL. The SQL standard does not have a say in what is valid Rex.
> >> (Clearly we have to comply with the SQL standard, but we change the Rex
> >> that we generate for this.)
> >>
> >> We need some guarantee of ordering, for example the case you cite, where
> >> we check whether a reference is null before we reference it. AND may or may
> >> not be the operator that guarantees ordering. CASE retains ordering
> >> (meaning that you have to evaluate the condition before the branch) but I
> >> am not convinced that we need to keep ordering in AND and OR.
> >>
> >> Whether RexSimplify should aggressively canonize the order or preserve
> >> order at all costs are different topics. I am actually against both.
> >>
> >> Related to the ordering question is the question of the number of
> >> evaluations. If I write ‘random() * 2 < random() < 3’, is it guaranteed to
> >> execute ‘random()’ at most once? Precisely once? I think the Rex language
> >> could use a concept like single assignment, like ‘let r = random() in (r *
> >> 2, r * 3) end’, which ensures that ‘random()’ is called exactly once and is
> >> called before the expressions ‘r * 2’ and ‘r * 3’ are evaluated.
> >>
> >> This week, via a twitter exchange with Torsten Grust [1] I came across the
> >> paper “SSA is Functional Programming (Appel, 1998)” [2]. I could see Rex
> >> evolving towards SSA.
> >>
> >> Julian
> >>
> >> [1] https://twitter.com/Teggy/status/1234935448310603777
> >>
> >> [2] https://www.cs.princeton.edu/~appel/papers/ssafun.pdf
> >>
> >>> On Mar 5, 2020, at 3:14 AM, Chunwei Lei <[email protected]> wrote:
> >>>
> >>> Currently, RexSimplify would decompose and compose the AND expression,
> >>> which particularly puts
> >>> IS NOT NULL and NOT predicates at the end of the AND expression[1][2]. For
> >>> instance,
> >>> `$1 is not null and $2=1` will be changed to `$2=1 and $1 is not null`
> >>> after being simplified.
> >>>
> >>> I know it is not a bug because the SQL standard[3] does not say the
> >>> expression order should be retained.
> >>> But I am wondering whether we can improve a little bit, namely, users can
> >>> decide whether RexSimplify
> >>> retains the order of the predicates or not. Because in my humble opinion,
> >>> changing the order of these
> >>> predicates might lead to two disadvantages:
> >>>
> >>> 1) it might break some queries, especially those which contain udf.
> >>> Assuming we have a udf called udf1 which throws an exception when meeting
> >>> NULL operand.
> >>> For query `select a is not null and udf1(a);`, it can run successfully
> >>> because of short-circuiting.
> >>> But after being simplified it will fail because `udf(a)` is executed before
> >>> 'a is not null'.
> >>>
> >>> 2) it might bring extra overhead.
> >>> For instance, for `a is not null and heavey_udf(a)!='1'`, if we change the
> >>> order,
> >>> `heavey_udf(a)` will be executed even when a is null which might lead to
> >>> extra overhead.
> >>>
> >>> There are some discussions about this topic[4]. Unfortunately, we do not
> >>> reach a consensus.
> >>> What do you think about it? Would appreciate your feedback.
> >>>
> >>>
> >>> [1] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1529
> >>> [2] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1552
> >>> [3] https://standards.iso.org/ittf/PubliclyAvailableStandards/c053681_ISO_IEC_9075-1_2011.zip
> >>> [4] https://issues.apache.org/jira/browse/CALCITE-3746
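
For readers following the thread, here is a minimal sketch of the first concern in the original message (a NULL guard that stops protecting a udf once the conjuncts are reordered). It is plain Java and does not use any Calcite API. The names AndOrderingDemo and evalAnd are hypothetical, udf1 is only a stand-in for the udf described in the thread, and the left-to-right short-circuit evaluator is an assumption about the execution engine, not a statement of how Calcite evaluates Rex. SQL three-valued logic is ignored for brevity.

```java
import java.util.List;
import java.util.function.Predicate;

final class AndOrderingDemo {
    // Hypothetical stand-in for udf1 from the thread: throws on a NULL operand.
    static boolean udf1(String a) {
        if (a == null) {
            throw new IllegalArgumentException("udf1 does not accept NULL");
        }
        return !a.isEmpty();
    }

    // Assumed evaluator: runs the conjuncts left to right and stops at the
    // first one that is false (short-circuiting).
    static boolean evalAnd(List<Predicate<String>> conjuncts, String row) {
        for (Predicate<String> p : conjuncts) {
            if (!p.test(row)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Predicate<String> isNotNull = a -> a != null;
        Predicate<String> callsUdf = AndOrderingDemo::udf1;

        // Original order, `a is not null and udf1(a)`: the guard runs first,
        // so udf1 is never called on NULL. Prints false.
        System.out.println(evalAnd(List.of(isNotNull, callsUdf), null));

        // Reordered, `udf1(a) and a is not null`: udf1 sees NULL and throws.
        try {
            evalAnd(List.of(callsUdf, isNotNull), null);
        } catch (IllegalArgumentException e) {
            System.out.println("reordered AND failed: " + e.getMessage());
        }
    }
}
```

The same evaluator also illustrates the second concern: with the guard moved to the end, an expensive udf would run for every NULL row before the guard rejects it.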
