> if sending SQL to a database that does not understand IS NOT DISTINCT FROM

That is sadly true. I found that Hive has only supported INDF since 3.0.0. But expanding INDF by default is still questionable. Their lack of support should not force Calcite to expand INDF by default.

> If both arguments are not null, it probably makes sense to rewrite “x IS NOT
> DISTINCT FROM y” to “x = y”,

I agree. But that can be done easily with RexSimplify; expanding it or using the rule FilterRemoveIsNotDistinctFromRule is overkill, IMHO.

- Haisheng

------------------------------------------------------------------
From: Julian Hyde<[email protected]>
Date: 2019-06-06 05:09:14
To: dev<[email protected]>
Subject: Re: [DISCUSS] IS NOT DISTINCT FROM rewrite

My instinct is that we should leave it unexpanded. And that we should recognize “equals-like operators”, so that a planner rule originally written for ‘=’ could easily be extended to also apply to ‘is not distinct from’.

Of course there would be a way of expanding it that we could use if circumstances required it (e.g. if sending SQL to a database that does not understand IS NOT DISTINCT FROM), but we would not expand it by default.

If both arguments are not null, it probably makes sense to rewrite “x IS NOT DISTINCT FROM y” to “x = y”, because the latter is more common and no less simple.

Julian

> On Jun 5, 2019, at 1:46 PM, Haisheng Yuan <[email protected]> wrote:
>
> I see INDF is rewritten to OR, and FilterRemoveIsNotDistinctFromRule rewrites
> INDF to a CASE expression. Why do we want to do that? To simplify expressions
> like "a is not distinct from b or a = b"? Then we spend a lot of effort to
> convert OR/CASE back to INDF.
>
> I am curious what the motivation for rewriting INDF is. Does it really help a
> lot in production? I would like to hear the use cases if it does.
>
> - Haisheng
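
For readers following the thread, the semantics under discussion can be sketched in a few lines of Python. This is only an illustration, not Calcite code: SQL NULL is modeled as `None`, and the helper names `sql_eq`, `is_not_distinct_from`, and `expanded_case` are hypothetical. The CASE-style expansion shown is one plausible shape of such a rewrite and is not claimed to be the exact expression Calcite generates.

```python
from itertools import product


def sql_eq(x, y):
    """SQL '=' under three-valued logic: NULL (None) if either side is NULL."""
    if x is None or y is None:
        return None
    return x == y


def is_not_distinct_from(x, y):
    """SQL 'x IS NOT DISTINCT FROM y': always True or False, never NULL."""
    if x is None or y is None:
        return x is None and y is None
    return x == y


def expanded_case(x, y):
    """Assumed CASE-style expansion (illustrative only):
    CASE WHEN x IS NULL THEN y IS NULL
         WHEN y IS NULL THEN FALSE
         ELSE x = y END
    """
    if x is None:
        return y is None
    if y is None:
        return False
    return sql_eq(x, y)


# Check every combination over a small domain that includes NULL.
for x, y in product([None, 1, 2], repeat=2):
    # The expansion agrees with INDF everywhere.
    assert expanded_case(x, y) == is_not_distinct_from(x, y)
    # When both arguments are non-null, INDF is just '='.
    if x is not None and y is not None:
        assert sql_eq(x, y) == is_not_distinct_from(x, y)
```

The second assertion is the case both messages agree on: with non-nullable arguments the two operators coincide, so rewriting to `=` loses nothing. The two differ only when a NULL is involved, where `=` yields NULL and INDF yields a definite TRUE or FALSE.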
