Sorry, I didn’t explain the example very well. Should have said “assume a 
random() function that returns the same value each time it is called on a 
particular row”. I believe that NEXT_VALUE (of a sequence) has that behavior.
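
To make the evaluation-count question concrete, here is a minimal Java sketch. It is not Calcite code: the class, the counting source, and both method names are hypothetical, and the source stands in for any call whose repeated evaluation is observable (random(), or NEXT_VALUE of a sequence). It contrasts an expression in which the call appears twice with a single-assignment ("let") form that binds the value once before use.

```java
import java.util.function.IntSupplier;

// Hypothetical illustration only; none of these names are Calcite APIs.
final class SingleAssignmentSketch {
  /** Stand-in for random() or NEXT_VALUE: every call is observable. */
  static final class CountingSource implements IntSupplier {
    private int calls = 0;

    @Override public int getAsInt() {
      calls++;
      return calls * 10; // 10, 20, 30, ... so repeated calls are distinguishable
    }

    int calls() {
      return calls;
    }
  }

  /** Inlined form: the call appears twice, so it is evaluated twice. */
  static int[] inlined(IntSupplier f) {
    return new int[] {f.getAsInt() * 2, f.getAsInt() * 3};
  }

  /** "let r = f() in (r * 2, r * 3) end": one evaluation, bound before use. */
  static int[] letBound(IntSupplier f) {
    final int r = f.getAsInt();
    return new int[] {r * 2, r * 3};
  }

  public static void main(String[] args) {
    CountingSource a = new CountingSource();
    int[] x = inlined(a);
    System.out.println("inlined:  " + x[0] + ", " + x[1] + " calls=" + a.calls());

    CountingSource b = new CountingSource();
    int[] y = letBound(b);
    System.out.println("letBound: " + y[0] + ", " + y[1] + " calls=" + b.calls());
  }
}
```

In the inlined form the two operands see different values and the source is called twice; in the let-bound form both operands share one value and the source is called exactly once.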

> On Mar 6, 2020, at 4:27 AM, Michael Mior <[email protected]> wrote:
> 
> In your example with random(), I would expect it to execute twice because
> random is not deterministic.
> 
> On Thu, Mar 5, 2020, 13:15 Julian Hyde <[email protected]> wrote:
> 
>> Rex is not SQL. The SQL standard does not have a say in what is valid Rex.
>> (Clearly we have to comply with the SQL standard, but we change the Rex
>> that we generate for this.)
>> 
>> We need some guarantee of ordering, for example the case you cite, where
>> we check whether a reference is null before we reference it. AND may or may
>> not be the operator that guarantees ordering. CASE retains ordering
>> (meaning that you have to evaluate the condition before the branch) but I
>> am not convinced that we need to keep ordering in AND and OR.
>> 
>> Whether RexSimplify should aggressively canonicalize the order or preserve
>> order at all costs are different topics. I am actually against both.
>> 
>> Related to the ordering question is the question of the number of
>> evaluations. If I write ‘random() * 2 < random() * 3’, is it guaranteed to
>> execute ‘random()’ at most once? Precisely once? I think the Rex language
>> could use a concept like single assignment, like ‘let r = random() in (r *
>> 2, r * 3) end’, which ensures that ‘random()’ is called exactly once and is
>> called before the expressions ‘r * 2’ and ‘r * 3’ are evaluated.
>> 
>> This week, via a twitter exchange with Torsten Grust [1] I came across the
>> paper “SSA is Functional Programming (Appel, 1998)” [2]. I could see Rex
>> evolving towards SSA.
>> 
>> Julian
>> 
>> [1] https://twitter.com/Teggy/status/1234935448310603777
>> 
>> [2] https://www.cs.princeton.edu/~appel/papers/ssafun.pdf
>> 
>>> On Mar 5, 2020, at 3:14 AM, Chunwei Lei <[email protected]> wrote:
>>> 
>>> Currently, RexSimplify decomposes and recomposes an AND expression, and
>>> in particular moves IS NOT NULL and NOT predicates to the end of the AND
>>> expression [1][2]. For instance, `$1 is not null and $2=1` is changed to
>>> `$2=1 and $1 is not null` after simplification.
>>> 
>>> I know this is not a bug, because the SQL standard [3] does not say that
>>> expression order must be retained. But I am wondering whether we can
>>> improve this a little, namely by letting users decide whether RexSimplify
>>> retains the order of the predicates. In my humble opinion, changing the
>>> order of these predicates has two disadvantages:
>>> 
>>> 1) It might break some queries, especially those that contain a UDF.
>>> Assume we have a UDF called udf1 that throws an exception when given a
>>> NULL operand. The query `select a is not null and udf1(a);` runs
>>> successfully because of short-circuiting, but after simplification it
>>> fails because `udf1(a)` is executed before `a is not null`.
>>> 
>>> 2) It might add overhead.
>>> For instance, for `a is not null and heavy_udf(a)!='1'`, if we change the
>>> order, `heavy_udf(a)` is executed even when a is null.
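
The two points above can be sketched in a few lines of Java. This is an illustration under stated assumptions, not RexSimplify and not a Calcite API: `udf1` is the hypothetical NULL-rejecting function from the message, `shortCircuitAnd` is a made-up left-to-right evaluator, and plain two-valued booleans are used instead of SQL's three-valued logic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; not RexSimplify and not a Calcite API.
final class AndOrderSketch {
  /** Stand-in for udf1: throws when given a NULL operand. */
  static boolean udf1(String a) {
    if (a == null) {
      throw new IllegalArgumentException("udf1 does not accept NULL");
    }
    return !a.isEmpty();
  }

  /** Evaluates conjuncts left to right, stopping at the first false one. */
  static boolean shortCircuitAnd(List<Predicate<String>> conjuncts, String row) {
    for (Predicate<String> p : conjuncts) {
      if (!p.test(row)) {
        return false;
      }
    }
    return true;
  }

  static final Predicate<String> IS_NOT_NULL = a -> a != null;
  static final Predicate<String> UDF1 = AndOrderSketch::udf1;

  public static void main(String[] args) {
    // Original order: the null check guards the UDF call.
    System.out.println(shortCircuitAnd(Arrays.asList(IS_NOT_NULL, UDF1), null));

    // Reordered: the UDF sees the NULL before the guard runs.
    try {
      shortCircuitAnd(Arrays.asList(UDF1, IS_NOT_NULL), null);
    } catch (IllegalArgumentException e) {
      System.out.println("reordered conjunction failed: " + e.getMessage());
    }
  }
}
```

With the guard first, the NULL row yields false without calling `udf1`; with the guard last, the same row raises an exception. The overhead argument follows the same shape: an expensive second conjunct is skipped only when the cheap guard is evaluated first.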
>>> 
>>> There has been some discussion of this topic [4]. Unfortunately, we did
>>> not reach a consensus.
>>> What do you think? I would appreciate your feedback.
>>> 
>>> 
>>> [1] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1529
>>> [2] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexSimplify.java#L1552
>>> [3] https://standards.iso.org/ittf/PubliclyAvailableStandards/c053681_ISO_IEC_9075-1_2011.zip
>>> [4] https://issues.apache.org/jira/browse/CALCITE-3746
