alamb commented on issue #8688: URL: https://github.com/apache/arrow-datafusion/issues/8688#issuecomment-1872932551
> I would like to take a look at this. Thank you ! > If my understanding is correct, the first case i.e. `SELECT .. WHERE NULL IN (…)` would simplify to `SELECT .. WHERE False`. I think it actually would simplify to ` SELECT .. WHERE null` (which is technically different than `WHERE null` but will have the same effect of returning no rows). This is explained a little more in https://github.com/apache/arrow-datafusion/issues/8689 Using datafusion-cli, you can check it: ```sql ❯ create table t (x int) as values (1), (2); 0 rows in set. Query took 0.007 seconds. ❯ select NULL IN (1, 2); +----------------------------------------------------------------------+ | NULL IN (Map { iter: Iter([Literal(Int64(1)), Literal(Int64(2))]) }) | +----------------------------------------------------------------------+ | | +----------------------------------------------------------------------+ 1 row in set. Query took 0.002 seconds. ``` > As for the second case, `SELECT .. WHERE x IN (NULL, …)` would be simplified to `SELECT .. WHERE x IN (…)` as it could still return rows for the other exprs in the list. I actually tried this out and it seems like this is NOT a correct simplification (sorry for my confusion). So maybe we can't simplify such select lists 🤔 ```sql DataFusion CLI v34.0.0 ❯ create table t (x int) as values (1), (2); 0 rows in set. Query took 0.006 seconds. ❯ select x IN (NULL, 2) from t; +-----------------------------------------------------------------+ | t.x IN (Map { iter: Iter([Literal(NULL), Literal(Int64(2))]) }) | +-----------------------------------------------------------------+ | | | true | +-----------------------------------------------------------------+ 2 rows in set. Query took 0.002 seconds. ❯ select x IN (2) from t; +--------------------------------------------------+ | t.x IN (Map { iter: Iter([Literal(Int64(2))]) }) | +--------------------------------------------------+ | false | | true | +--------------------------------------------------+ 2 rows in set. Query took 0.001 seconds. ❯ ``` Also, the same in postgres: ```sql postgres=# create table t (x int); CREATE TABLE postgres=# insert into t values (1), (2); INSERT 0 2 postgres=# select x in (NULL, 2) from t; ?column? ---------- t (2 rows) postgres=# select x in (2) from t; ?column? ---------- f t (2 rows) ``` ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
