RustomMS opened a new issue, #5633: URL: https://github.com/apache/arrow-datafusion/issues/5633
**Describe the bug** <!-- A clear and concise description of what the bug is. --> Unable to run the following style query due to error in subquery: ``` select column1 from t1 where column1 in (select distinct column1 as other from t1 where false); ``` **To Reproduce** <!-- Steps to reproduce the behavior: --> ``` $ export DATAFUSION_OPTIMIZER_SKIP_FAILED_RULES=false $ datafusion-cli DataFusion CLI v19.0.0 ❯ CREATE TABLE IF NOT EXISTS t1 AS VALUES(1,111),(2,222); 0 rows in set. Query took 0.003 seconds. ❯ select * from t1; +---------+---------+ | column1 | column2 | +---------+---------+ | 1 | 111 | | 2 | 222 | +---------+---------+ 2 rows in set. Query took 0.001 seconds. ❯ select column1 from t1; +---------+ | column1 | +---------+ | 1 | | 2 | +---------+ 2 rows in set. Query took 0.001 seconds. ❯ select distinct column1 from t1; +---------+ | column1 | +---------+ | 2 | | 1 | +---------+ 2 rows in set. Query took 0.004 seconds. ❯ select column1 from t1 where column1 in (select column1 as other from t1 where false); 0 rows in set. Query took 0.002 seconds. ❯ select column1 from t1 where column1 in ((select column1 as other from t1 where false)); This feature is not implemented: Physical plan does not support logical expression (<subquery>) ❯ select column1 from t1 where column1 in (select distinct column1 as other from t1 where false); decorrelate_where_in caused by Internal error: Optimizer rule 'decorrelate_where_in' failed due to unexpected error: a projection is required at /Users/rustomms/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-optimizer-19.0.0/src/decorrelate_where_in.rs:147 caused by Error during planning: Could not coerce into Projection! at /Users/rustomms/.cargo/registry/src/github.com-1ecc6299db9ec823/datafusion-expr-19.0.0/src/logical_plan/plan.rs:1394. This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker ❯ ``` Or run the following sql file with `datafusion-cli -f test.sql` ``` -- Test queries CREATE TABLE IF NOT EXISTS t1 AS VALUES(1,111),(2,222); select * from t1; select column1 from t1; select distinct column1 from t1; -- The first select with subquery statement works with 0 rows the second errors and third error on datafusion 19.0.0 select column1 from t1 where column1 in (select column1 as other from t1 where false); -- This is fixed in main/20.0.0 by https://github.com/apache/arrow-datafusion/issues/5529 select column1 from t1 where column1 in ((select column1 as other from t1 where false)); select column1 from t1 where column1 in (select distinct column1 as other from t1 where false); ``` **Expected behavior** <!-- A clear and concise description of what you expected to happen. --> Can use distinct in a subquery **Additional context** <!-- Add any other context about the problem here. --> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
