>  if sending SQL to a database that does not understand IS NOT DISTINCT FROM
That is sadly true. I found that Hive has only supported INDF since 3.0.0. 
But expanding INDF by default is still questionable. Their limitation should 
not force Calcite to expand INDF by default.

> If both arguments are not null, it probably makes sense to rewrite “x IS NOT 
> DISTINCT FROM y” to “x = y”,
I agree. But it can be done easily with RexSimplify; expanding it or using the 
rule FilterRemoveIsNotDistinctFromRule is overkill, IMHO.

- Haisheng

------------------------------------------------------------------
From: Julian Hyde<[email protected]>
Date: 2019-06-06 05:09:14
To: dev<[email protected]>
Subject: Re: [DISCUSS] IS NOT DISTINCT FROM rewrite

My instinct is that we should leave it unexpanded. And that we should recognize 
“equals-like operators”, so that a planner rule originally written for ‘=’ 
could easily be expanded to also apply to ‘is not distinct from’.

Of course there would be a way of expanding it that we could use if 
circumstances required it — e.g. if sending SQL to a database that does not 
understand IS NOT DISTINCT FROM — but we would not expand it by default.

If both arguments are not null, it probably makes sense to rewrite “x IS NOT 
DISTINCT FROM y” to “x = y”, because the latter is more common and no less 
simple.
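
To illustrate the semantics under discussion, here is a minimal Java sketch 
(not Calcite code) that models SQL three-valued logic with a nullable Boolean, 
where a Java null stands for UNKNOWN. The names IndfSketch, sqlEquals, sqlOr, 
expandedOr and expandedCase are hypothetical, and the OR and CASE forms are 
textbook expansions assumed for the example, not necessarily the exact 
expressions Calcite generates.

```java
import java.util.Objects;

// Hypothetical sketch: a Java null stands for SQL NULL / UNKNOWN.
final class IndfSketch {

    // SQL "x = y": UNKNOWN (null) if either operand is NULL.
    static Boolean sqlEquals(Integer x, Integer y) {
        if (x == null || y == null) {
            return null;
        }
        return x.equals(y);
    }

    // "x IS NOT DISTINCT FROM y": always TRUE or FALSE, never UNKNOWN.
    static boolean isNotDistinctFrom(Integer x, Integer y) {
        return Objects.equals(x, y);
    }

    // SQL OR under three-valued logic.
    static Boolean sqlOr(Boolean a, Boolean b) {
        if (Boolean.TRUE.equals(a) || Boolean.TRUE.equals(b)) {
            return true;
        }
        if (a == null || b == null) {
            return null;
        }
        return false;
    }

    // Assumed OR expansion: (x = y) OR (x IS NULL AND y IS NULL).
    static Boolean expandedOr(Integer x, Integer y) {
        return sqlOr(sqlEquals(x, y), x == null && y == null);
    }

    // Assumed CASE expansion:
    //   CASE WHEN x IS NULL THEN y IS NULL
    //        WHEN y IS NULL THEN FALSE
    //        ELSE x = y END
    static Boolean expandedCase(Integer x, Integer y) {
        if (x == null) {
            return y == null;
        }
        if (y == null) {
            return false;
        }
        return sqlEquals(x, y);
    }

    public static void main(String[] args) {
        Integer[] values = {null, 1, 2};
        for (Integer x : values) {
            for (Integer y : values) {
                boolean indf = isNotDistinctFrom(x, y);
                // A WHERE clause keeps a row only when the predicate is TRUE,
                // so UNKNOWN behaves like FALSE when used as a filter.
                boolean orAsFilter = Boolean.TRUE.equals(expandedOr(x, y));
                if (orAsFilter != indf) {
                    throw new AssertionError("OR form differs for " + x + ", " + y);
                }
                if (!Boolean.valueOf(indf).equals(expandedCase(x, y))) {
                    throw new AssertionError("CASE form differs for " + x + ", " + y);
                }
                // With both arguments non-null, INDF and '=' coincide.
                if (x != null && y != null
                        && !Boolean.valueOf(indf).equals(sqlEquals(x, y))) {
                    throw new AssertionError("'=' differs for " + x + ", " + y);
                }
            }
        }
        System.out.println("all checks passed");
    }
}
```

In this model the OR form yields UNKNOWN rather than FALSE when exactly one 
side is NULL, so it matches INDF only when used as a filter condition, while 
the CASE form matches it as a value. When neither side is NULL, all forms 
reduce to plain ‘=’.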

Julian



> On Jun 5, 2019, at 1:46 PM, Haisheng Yuan <[email protected]> wrote:
> 
> I see INDF is rewritten to OR, and FilterRemoveIsNotDistinctFromRule rewrites 
> INDF to a CASE expression. Why do we want to do that? To simplify expressions 
> like "a is not distinct from b or a = b"? Then we spend a lot of effort to 
> convert OR/CASE back to INDF. 
> 
> I am curious what the motivation for rewriting INDF is. Does it really help a 
> lot in production? I would like to hear the use cases if it does.
> 
> - Haisheng
> 
