jychen7 opened a new issue #2061: URL: https://github.com/apache/arrow-datafusion/issues/2061
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Assume we have a data lake stores as ``` table/year=2022/month=03/day=20/log.parquet table/year=2022/month=03/day=21/log.parquet ``` Currently, CreateExternalTable supports defining columns and location (e.g. `table/`) https://github.com/apache/arrow-datafusion/blob/5936edc2a94d5fb20702a41eab2b80695961b9dc/datafusion/src/sql/parser.rs#L70-L81 a sql query of `select * from table where year = '2022' and month = '03' and day = '20'` seems to scan all files under `table/`. **Describe the solution you'd like** ``` CREATE EXTERNAL TABLE test ( c1 VARCHAR NOT NULL, ) STORED AS CSV WITH HEADER ROW PARTITIONED BY (p1, p2) LOCATION '/path/to/'; ``` same as existing ListingOption, `PARTITIONED BY` only supports String https://github.com/apache/arrow-datafusion/blob/5936edc2a94d5fb20702a41eab2b80695961b9dc/datafusion/src/datasource/listing/table.rs#L178 **Describe alternatives you've considered** A clear and concise description of any alternative solutions or features you've considered. **Additional context** `partitioned by` is also used in Trino and AWS Athena https://trino.io/episodes/5.html https://docs.aws.amazon.com/athena/latest/ug/create-table.html I notice that `ListingOptions` supports `table_partition_cols` and also `partition pruning`, but just `CreateExternalTable` does not accept such input and pass through https://github.com/apache/arrow-datafusion/blob/5936edc2a94d5fb20702a41eab2b80695961b9dc/datafusion/src/datasource/listing/table.rs#L165-L186 https://github.com/apache/arrow-datafusion/blob/5936edc2a94d5fb20702a41eab2b80695961b9dc/datafusion/src/datasource/listing/table.rs#L358-L365 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
