alamb commented on issue #8688:
URL: 
https://github.com/apache/arrow-datafusion/issues/8688#issuecomment-1872932551

   > I would like to take a look at this.
   
   Thank you !
   
   > If my understanding is correct, the first case i.e. `SELECT .. WHERE NULL 
IN (…)` would simplify to `SELECT .. WHERE False`.
   
   I think it actually would simplify to ` SELECT .. WHERE null` (which is 
technically different than `WHERE null` but will have the same effect of 
returning no rows). This is explained a little more in  
https://github.com/apache/arrow-datafusion/issues/8689
   
   Using datafusion-cli, you can check it:
   
   ```sql
   ❯ create table t (x int) as values (1), (2);
   0 rows in set. Query took 0.007 seconds.
   
   ❯ select NULL IN (1, 2);
   +----------------------------------------------------------------------+
   | NULL IN (Map { iter: Iter([Literal(Int64(1)), Literal(Int64(2))]) }) |
   +----------------------------------------------------------------------+
   |                                                                      |
   +----------------------------------------------------------------------+
   1 row in set. Query took 0.002 seconds.
   ```
   
   > As for the second case, `SELECT .. WHERE x IN (NULL, …)` would be 
simplified to `SELECT .. WHERE x IN (…)` as it could still return rows for the 
other exprs in the list.
   
   I actually tried this out and it seems like this is NOT a correct 
simplification (sorry for my confusion). So maybe we can't simplify such select 
lists 🤔 
   
   
   
   
   
   ```sql
   DataFusion CLI v34.0.0
   ❯ create table t (x int) as values (1), (2);
   0 rows in set. Query took 0.006 seconds.
   
   ❯ select x IN (NULL, 2) from t;
   +-----------------------------------------------------------------+
   | t.x IN (Map { iter: Iter([Literal(NULL), Literal(Int64(2))]) }) |
   +-----------------------------------------------------------------+
   |                                                                 |
   | true                                                            |
   +-----------------------------------------------------------------+
   2 rows in set. Query took 0.002 seconds.
   
   ❯ select x IN (2) from t;
   +--------------------------------------------------+
   | t.x IN (Map { iter: Iter([Literal(Int64(2))]) }) |
   +--------------------------------------------------+
   | false                                            |
   | true                                             |
   +--------------------------------------------------+
   2 rows in set. Query took 0.001 seconds.
   
   ❯
   ```
   
   Also, the same in postgres:
   
   ```sql
   postgres=# create table t (x int);
   CREATE TABLE
   postgres=# insert into t values (1), (2);
   INSERT 0 2
   
   postgres=# select x in (NULL, 2) from t;
    ?column?
   ----------
   
    t
   (2 rows)
   
   postgres=# select x in (2) from t;
    ?column?
   ----------
    f
    t
   (2 rows)
   ```
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to