Re: Re: [DISCUSS] IS NOT DISTINCT FROM rewrite

Haisheng Yuan Wed, 05 Jun 2019 17:35:43 -0700

>  if sending SQL to a database that does not understand IS NOT DISTINCT FROM
That is sadly true. I found that Hive has only supported INDF since 3.0.0. 
But expanding INDF by default is still questionable. Their lack of support 
should not force Calcite to expand INDF by default.
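
For readers who want to see what "expanding" means for a target that lacks INDF, here is a minimal sketch in plain Python (not Calcite code). It models SQL three-valued logic with None standing for NULL/UNKNOWN and checks one textbook expansion against the INDF truth table. The helper names (sql_eq, sql_and, sql_or, expanded) are made up for illustration, and the expansion shown is an assumption, not necessarily the form Calcite emits.

```python
from itertools import product


def sql_eq(x, y):
    """SQL '=': UNKNOWN (None) when either operand is NULL."""
    if x is None or y is None:
        return None
    return x == y


def sql_and(a, b):
    """Three-valued AND: FALSE dominates, then UNKNOWN."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def sql_or(a, b):
    """Three-valued OR: TRUE dominates, then UNKNOWN."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def is_not_distinct_from(x, y):
    """Reference semantics: never UNKNOWN; NULL matches only NULL."""
    if x is None or y is None:
        return x is None and y is None
    return x == y


def expanded(x, y):
    """(x IS NULL AND y IS NULL)
       OR (x IS NOT NULL AND y IS NOT NULL AND x = y)"""
    both_null = sql_and(x is None, y is None)
    both_present = sql_and(x is not None, y is not None)
    return sql_or(both_null, sql_and(both_present, sql_eq(x, y)))


# Compare the two definitions over every pair drawn from a small domain.
values = [None, 1, 2]
mismatches = [
    (x, y)
    for x, y in product(values, repeat=2)
    if expanded(x, y) != is_not_distinct_from(x, y)
]
print(mismatches)
```

The expanded form is noticeably bulkier than the operator it replaces, which is the cost being questioned here.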

> If both arguments are not null, it probably makes sense to rewrite “x IS NOT 
> DISTINCT FROM y” to “x = y”,
I agree. But it can be done easily with RexSimplify; expanding it or using 
the FilterRemoveIsNotDistinctFromRule rule is overkill, IMHO.
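
A minimal sketch of that simplification, in plain Python rather than against the real RexSimplify API: the Operand and Call classes and the simplify function are hypothetical stand-ins, and the only assumption is that each operand carries a nullability flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    """Hypothetical stand-in for a typed expression."""
    name: str
    nullable: bool


@dataclass(frozen=True)
class Call:
    """Hypothetical binary operator call."""
    op: str
    left: Operand
    right: Operand


def simplify(call):
    """Rewrite INDF to '=' only when neither side can be NULL."""
    if (call.op == "IS NOT DISTINCT FROM"
            and not call.left.nullable
            and not call.right.nullable):
        return Call("=", call.left, call.right)
    # Otherwise leave the call untouched; no OR/CASE expansion.
    return call


not_null = Call("IS NOT DISTINCT FROM",
                Operand("x", nullable=False),
                Operand("y", nullable=False))
maybe_null = Call("IS NOT DISTINCT FROM",
                  Operand("x", nullable=True),
                  Operand("y", nullable=False))

print(simplify(not_null).op)    # both sides NOT NULL
print(simplify(maybe_null).op)  # one side nullable
```

The rewrite is a local check on operand types, so it needs no dedicated planner rule.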

- Haisheng

------------------------------------------------------------------
From: Julian Hyde <[email protected]>
Date: 2019-06-06 05:09:14
To: dev <[email protected]>
Subject: Re: [DISCUSS] IS NOT DISTINCT FROM rewrite

My instinct is that we should leave it unexpanded. And that we should recognize 
“equals-like operators”, so that a planner rule originally written for ‘=’ 
could easily be extended to also apply to ‘is not distinct from’.
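
As a rough illustration of the "equals-like" idea (plain Python, not Calcite's API), a rule could test membership in a small operator set instead of matching ‘=’ alone. The EQUALS_LIKE set, the (op, left, right) tuple shape, and extract_key_pair are all hypothetical names chosen for this sketch.

```python
# Operators a rule may treat as equality; the names are illustrative only.
EQUALS_LIKE = frozenset({"=", "IS NOT DISTINCT FROM"})


def extract_key_pair(condition):
    """Return (left, right, null_safe) for an equals-like condition.

    `condition` is a hypothetical (op, left, right) tuple. The null_safe
    flag records whether NULL keys must match each other, so a consumer
    can keep the INDF semantics without expanding the operator.
    """
    op, left, right = condition
    if op not in EQUALS_LIKE:
        return None
    return (left, right, op == "IS NOT DISTINCT FROM")


print(extract_key_pair(("=", "a", "b")))
print(extract_key_pair(("IS NOT DISTINCT FROM", "a", "b")))
print(extract_key_pair(("<", "a", "b")))
```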

Of course there would be a way of expanding it that we could use if 
circumstances required it — e.g. if sending SQL to a database that does not 
understand IS NOT DISTINCT FROM — but we would not expand it by default.

If both arguments are not null, it probably makes sense to rewrite “x IS NOT 
DISTINCT FROM y” to “x = y”, because the latter is more common and no less 
simple.

Julian

> On Jun 5, 2019, at 1:46 PM, Haisheng Yuan <[email protected]> wrote:
> 
> I see INDF is rewritten to OR, and FilterRemoveIsNotDistinctFromRule rewrites 
> INDF to CASE expression. Why do we want to do that? To simplify expression 
> like "a is not distinct from b or a = b"? Then we spend a lot of effort to 
> convert OR/CASE back to INDF. 
> 
> I am curious what the motivation is for rewriting INDF. Does it really help 
> a lot in production? I would like to hear the use cases if it does.
> 
> - Haisheng
>
