I agree that expandLocalRef and expandList are broken. They should probably 
throw if expansion creates multiple references to non-deterministic expressions.

Let’s be clear what we mean by non-deterministic. “RANDOM" and “NEXT VALUE OF 
<sequence>” are non-deterministic. “CURRENT_TIMESTAMP” is deterministic.

There are uses of expandLocalRef and expandList that are probably safe:

1. When a RexProgram is copied, it expands the DAG of expressions, and then 
performs common-subexpression to build a new DAG. Thus the duplicates are 
re-combined by the end of the process.

2. It is valid to flatten

  SELECT z
  FROM (
    SELECT random() - 1 AS x,
        random() + 1 AS y,
        3 AS z
    FROM t)

to

  SELECT 3 AS z
  FROM t

even though there are common expressions.

3. Expanding common sub-expressions in

  SELECT CASE
    WHEN x > 10 THEN random() + 5
    WHEN x > 5 THEN random() - 3
    ELSE 0 END AS c
  FROM t

is harmless because at most one of the will be used.

Please log a bug. The fix would deprecate the old functions, add new ones with 
the revised behavior, and add tests that the new behavior doesn’t generate 
spurious errors.

Julian


> On Feb 9, 2023, at 3:34 AM, Viliam Durina <[email protected]> 
> wrote:
> 
> Hello all,
> 
> we're using `RexProgram.expandLocalRef()` and `RexProgram.expandList()` to
> have a simple RexNode which we later evaluate. However, if the program
> contains a non-deterministic function call in the common subexpressions,
> such call is inlined and can be called a different number of times as a
> result, leading to an incorrect result.
> 
> This problem spreads to anywhere these methods are used, and it's used
> quite a many times in Calcite, and we've also hit it in our custom rule.
> 
> It seems to me that this function is flawed by design and should be removed
> as it leads to correctness problems with non-deterministic functions. Or
> should be redesigned, because perhaps some uses are valid, for example the
> one to determine selectivity in `RelMdSelectivity`.
> 
> What do others think about this?
> 
> Viliam
> 
> -- 
> This message contains confidential information and is intended only for the 
> individuals named. If you are not the named addressee you should not 
> disseminate, distribute or copy this e-mail. Please notify the sender 
> immediately by e-mail if you have received this e-mail by mistake and 
> delete this e-mail from your system. E-mail transmission cannot be 
> guaranteed to be secure or error-free as information could be intercepted, 
> corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. 
> The sender therefore does not accept liability for any errors or omissions 
> in the contents of this message, which arise as a result of e-mail 
> transmission. If verification is required, please request a hard-copy 
> version. -Hazelcast

Reply via email to