pjmore opened a new issue #1999:
URL: https://github.com/apache/arrow-datafusion/issues/1999


   **Describe the bug**
   When querying a partitioned data source, selecting only the partition columns 
causes errors. 
   
   **To Reproduce**
   Steps to reproduce the behavior:
   Create a sample parquet/csv file (the content doesn't matter). Copy the file 
to two separate paths using a path partitioning scheme, e.g.:
   "year=2021/month=09/day=09/file.parquet",
   "year=2021/month=10/day=09/file.parquet",
   
   Register a listing table and execute the query 
   ```select distinct year, month, day from t```
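
A minimal sketch of the layout step, in Python with only the standard library. It writes the same small CSV file under each of the two partition directories from the example paths above. The helper name `make_partitioned_layout`, the temporary root directory, and the CSV contents are assumptions made for illustration; registering the listing table in DataFusion is not covered here.

```python
import csv
import tempfile
from pathlib import Path

# Partition directories taken from the example paths above.
PARTITION_DIRS = [
    "year=2021/month=09/day=09",
    "year=2021/month=10/day=09",
]


def make_partitioned_layout(root: Path, file_name: str = "file.csv") -> list[Path]:
    """Write an identical small CSV file under every partition directory.

    Hypothetical helper for illustration; the file content is arbitrary
    because the report says it does not matter.
    """
    written = []
    for part in PARTITION_DIRS:
        target_dir = root / part
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / file_name
        with target.open("w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["id", "value"])  # arbitrary columns
            writer.writerow([1, "a"])
        written.append(target)
    return written


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp(prefix="partitioned_"))
    for path in make_partitioned_layout(root):
        print(path.relative_to(root).as_posix())
```

The resulting root directory can then be used as the table path of the listing table, with `year`, `month`, and `day` declared as partition columns.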
   
   **Expected behavior**
   Should return a record batch with: 
   ```
   +------+-------+-----+
   | year | month | day |
   +------+-------+-----+
   | 2021 | 09    | 09  |
   | 2021 | 10    | 09  |
   | 2021 | 10    | 28  |
   +------+-------+-----+ 
   ```
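
For reference, the values such a query has to return can be read straight from the directory names. The sketch below (Python, standard library only) parses `key=value` path segments and collects the distinct combinations. `partition_values` is a hypothetical helper written for this illustration, not a DataFusion API, and the input is limited to the two example paths from the steps to reproduce.

```python
from pathlib import PurePosixPath


def partition_values(path: str) -> dict[str, str]:
    """Parse hive-style `key=value` directory segments from a file path."""
    values = {}
    # Skip the last segment, which is the file name.
    for segment in PurePosixPath(path).parts[:-1]:
        key, sep, value = segment.partition("=")
        if sep:  # ignore directories that are not `key=value`
            values[key] = value
    return values


paths = [
    "year=2021/month=09/day=09/file.parquet",
    "year=2021/month=10/day=09/file.parquet",
]

# Distinct (year, month, day) combinations, kept as strings so that
# leading zeros such as "09" survive.
distinct_rows = sorted(
    {tuple(partition_values(p)[c] for c in ("year", "month", "day")) for p in paths}
)

for row in distinct_rows:
    print(" | ".join(row))
```

For these two paths the sketch yields the `09/09` and `10/09` rows. The `10/28` row in the table has no matching path among the two listed, so it presumably comes from a third path that is not shown.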
   
   **Additional context**
   This was verified to be a problem with the parquet and csv formats. The avro 
and json formats have not been checked.
   



