[
https://issues.apache.org/jira/browse/CALCITE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027196#comment-17027196
]
Jin Xing commented on CALCITE-3760:
-----------------------------------
Hi [~julianhyde], [~amaliujia], thanks a lot for the feedback.
Yes, a LET clause would be very helpful. It would let us store the result of a
sub-expression, e.g. the result of a non-deterministic UDF/UDAF, and reuse it
in subsequent clauses, which ensures that a non-deterministic expression is
evaluated a consistent number of times. Some vendors already support it [1].
However, I would prefer to keep the rewriting within the scope of the SQL
standard: a common scenario is that we convert an expression back to a SQL
string and run it in the JDBC convention, and a non-standard clause could
become an obstacle to running the SQL in other dialects. So I propose that we
skip the rewrite when the expression is found to be non-deterministic.
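To illustrate the proposal, here is a minimal, self-contained sketch in plain Java. It does not use Calcite's API; the names RewriteDemo, FlakyUdf, coalesce, rewrittenCase and evaluate are hypothetical and exist only for this illustration. It simulates a non-deterministic UDF and shows that the CASE form calls it twice, while a guard on determinism keeps the single-evaluation form.

```java
import java.util.function.Supplier;

final class RewriteDemo {
    /** Hypothetical non-deterministic "UDF": 7 on odd calls, null on even calls. */
    static final class FlakyUdf implements Supplier<Integer> {
        int calls = 0;

        @Override
        public Integer get() {
            calls++;
            return (calls % 2 == 1) ? Integer.valueOf(7) : null;
        }
    }

    /** COALESCE(udf, fallback) evaluated as written: the UDF runs once. */
    static Integer coalesce(Supplier<Integer> udf, int fallback) {
        Integer v = udf.get();
        return v != null ? v : Integer.valueOf(fallback);
    }

    /** CASE WHEN udf IS NOT NULL THEN udf ELSE fallback END: the UDF runs twice. */
    static Integer rewrittenCase(Supplier<Integer> udf, int fallback) {
        if (udf.get() != null) {
            return udf.get(); // second, independent evaluation
        }
        return Integer.valueOf(fallback);
    }

    /** The proposed guard: only use the rewritten form for deterministic input. */
    static Integer evaluate(Supplier<Integer> udf, boolean deterministic, int fallback) {
        if (deterministic) {
            return rewrittenCase(udf, fallback);
        }
        return coalesce(udf, fallback);
    }

    public static void main(String[] args) {
        System.out.println(coalesce(new FlakyUdf(), 100));
        System.out.println(rewrittenCase(new FlakyUdf(), 100));
        System.out.println(evaluate(new FlakyUdf(), false, 100));
    }
}
```

With this particular FlakyUdf, the rewritten form yields null even though the fallback is non-null, which the original COALESCE call cannot do.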
{quote}Related issues are re-ordering of the branches of AND and OR conditions,
and behavior when an expression throws.
{quote}
Currently RexSimplify already takes the determinism of expressions into
consideration (there may be room for improvement). The missing piece is an
interface that lets a UDF/UDAF declare whether it is deterministic.
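As a rough sketch of what such an interface could look like (this is an assumption for discussion, not Calcite's actual API; DeterminismAware, AbsFn, RandFn and DeterminismCheck are hypothetical names), a function could expose a default method that non-deterministic implementations override, and a rewrite rule could check it before duplicating an expression:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical marker for user-defined functions; deterministic by default. */
interface DeterminismAware {
    default boolean isDeterministic() {
        return true;
    }
}

/** A deterministic UDF: same input, same output. */
final class AbsFn implements DeterminismAware {
    long eval(long x) {
        return Math.abs(x);
    }
}

/** A non-deterministic UDF that declares itself as such. */
final class RandFn implements DeterminismAware {
    double eval() {
        return ThreadLocalRandom.current().nextDouble();
    }

    @Override
    public boolean isDeterministic() {
        return false;
    }
}

final class DeterminismCheck {
    /** An expression may be duplicated only if every function in it is deterministic. */
    static boolean safeToDuplicate(List<? extends DeterminismAware> functions) {
        for (DeterminismAware f : functions) {
            if (!f.isDeterministic()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(safeToDuplicate(Arrays.asList(new AbsFn())));
        System.out.println(safeToDuplicate(Arrays.asList(new AbsFn(), new RandFn())));
    }
}
```

Defaulting to deterministic keeps existing functions unaffected; whether that default is the right one is a design question for the actual change.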
[1] https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/let.html
> Rewriting non-deterministic function can break query semantics
> --------------------------------------------------------------
>
> Key: CALCITE-3760
> URL: https://issues.apache.org/jira/browse/CALCITE-3760
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Jin Xing
> Assignee: Jin Xing
> Priority: Major
>
> Calcite rewrites some *SqlFunctions* during validation, but does not
> consider whether the function is deterministic. For a non-deterministic
> operator, the rewriting can break semantics. Additionally, there is no
> interface for users to specify whether a UDF/UDAF is deterministic.
> Say I have a non-deterministic UDF and UDAF and run SQL like the following:
> {code:java}
> select coalesce(udf(col0), 100) from foo;
> select nullif(udaf(col0), 1024) from foo;{code}
> They will be rewritten as
> {code:java}
> select case when udf(col0) is not null then udf(col0) else 100 end
> from foo;
> select case when udaf(col0)=1024 then null else udaf(col0) end
> from foo;{code}
> As we can see, the non-deterministic UDF and UDAF are called multiple times
> after the rewrite, so the condition in the WHEN clause might NOT hold for
> the value that is actually returned.
> We need to provide an interface for users to specify the determinism of a
> UDF/UDAF, and consider whether a SqlNode is deterministic when rewriting.