mach-kernel opened a new issue, #17920: URL: https://github.com/apache/datafusion/issues/17920
### Describe the bug

When querying a Hive-partitioned table, projecting any of the partition columns after a ser/de round trip fails during deserialization.

This query, using the test dataset `partitioned_table_json`, fails over Ballista:

```sql
select id, part from partitioned_table_json
```

### To Reproduce

`amazon_reviews` is a table Hive-partitioned on (`marketplace`, `review_date`):

```sql
explain select marketplace, review_date, count(*) from amazon_reviews group by marketplace, review_date;
```

This produces the plan: https://gist.githubusercontent.com/mach-kernel/a20d00c8e6595cfc4332476bb857e251/raw/0a5c630ab883a6d092c3b9a21bc13d7ae99ca2d7/agg_plan.json

On deserialize, the column lookup fails here because the TableScan schema does not include the partition columns: https://github.com/apache/datafusion/blob/182d5dc5e456322664da921f446018a0549e60bc/datafusion/proto/src/logical_plan/mod.rs#L382-L390

Related:

- https://github.com/apache/datafusion/issues/15718
- https://github.com/apache/datafusion/pull/15737

### Expected behavior

_No response_

### Additional context

_No response_
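
To illustrate the failure mode described above, here is a minimal, simplified model in plain Rust. It is not DataFusion code: `resolve_projection` and `with_partition_columns` are hypothetical helpers, and schemas are reduced to lists of column names. It only shows why looking up a partition column in a schema built from the file columns alone fails, and why the same lookup succeeds once the partition columns are appended.

```rust
// Simplified model of the reported failure. All names are hypothetical
// and do not correspond to DataFusion APIs.

/// Resolve projected column names to their indices in `schema`,
/// failing on the first name that is not present.
fn resolve_projection(schema: &[&str], projection: &[&str]) -> Result<Vec<usize>, String> {
    projection
        .iter()
        .map(|name| {
            schema
                .iter()
                .position(|field| field == name)
                .ok_or_else(|| format!("column '{name}' not found in schema"))
        })
        .collect()
}

/// Build a schema that exposes the Hive partition columns after the file columns.
fn with_partition_columns<'a>(file_schema: &[&'a str], partition_cols: &[&'a str]) -> Vec<&'a str> {
    file_schema
        .iter()
        .chain(partition_cols.iter())
        .copied()
        .collect()
}
```

In this model, resolving `["id", "part"]` against a schema containing only `id` returns an error for `part`, which mirrors the lookup failure on deserialize. Resolving against the schema returned by `with_partition_columns(&["id"], &["part"])` succeeds.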
