mach-kernel opened a new issue, #17920:
URL: https://github.com/apache/datafusion/issues/17920

   ### Describe the bug
   
   When querying a Hive-partitioned table, projecting any of the partition columns fails when the logical plan is deserialized after a ser/de round trip. For example, this query against the test dataset `partitioned_table_json` fails over Ballista:
   
   ```sql
   select id, part from partitioned_table_json
   ```
   
   ### To Reproduce
   
   `amazon_reviews` is a table Hive-partitioned on (`marketplace`, `review_date`):
   
   ```sql
   explain select marketplace, review_date, count(*) from amazon_reviews group by marketplace, review_date;
   ```
   
   This produces the following plan:
   
https://gist.githubusercontent.com/mach-kernel/a20d00c8e6595cfc4332476bb857e251/raw/0a5c630ab883a6d092c3b9a21bc13d7ae99ca2d7/agg_plan.json
   
   On deserialize, column lookup fails here because the TableScan schema does not include the partition columns:
   
https://github.com/apache/datafusion/blob/182d5dc5e456322664da921f446018a0549e60bc/datafusion/proto/src/logical_plan/mod.rs#L382-L390
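   To make the failure mode concrete, here is a minimal, self-contained sketch of the lookup problem. It does not use DataFusion's types: `Schema`, `resolve_projection`, and `with_partition_columns` are hypothetical stand-ins, and the assumption that the data files contain only `id` is for illustration only. The idea is that Hive partition values live in directory paths rather than in the files, so a by-name lookup against a schema built from the files alone cannot find `part`.

```rust
/// Hypothetical stand-in for a table schema: an ordered list of column names.
/// This is NOT DataFusion's schema type, only an illustration.
struct Schema {
    columns: Vec<String>,
}

impl Schema {
    fn new(columns: &[&str]) -> Self {
        Schema {
            columns: columns.iter().map(|c| c.to_string()).collect(),
        }
    }

    /// Look up a column's position by name, failing if it is absent.
    fn index_of(&self, name: &str) -> Result<usize, String> {
        self.columns
            .iter()
            .position(|c| c == name)
            .ok_or_else(|| format!("column '{name}' not found in schema"))
    }
}

/// Resolve a projection (a list of column names) into indices against `schema`.
/// The first missing column aborts the whole resolution with an error.
fn resolve_projection(schema: &Schema, projection: &[&str]) -> Result<Vec<usize>, String> {
    projection.iter().map(|name| schema.index_of(name)).collect()
}

/// Build a schema that appends the partition columns after the file columns.
fn with_partition_columns(file_schema: &Schema, partition_cols: &[&str]) -> Schema {
    let mut columns = file_schema.columns.clone();
    columns.extend(partition_cols.iter().map(|c| c.to_string()));
    Schema { columns }
}
```

   Under these assumptions, resolving `["id", "part"]` against the file-only schema returns an error, while resolving it against the schema extended with `part` succeeds. Whether the actual fix belongs in the serialized schema or in the lookup on deserialize is not something this sketch settles.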
   
   Related: 
   
   https://github.com/apache/datafusion/issues/15718
   https://github.com/apache/datafusion/pull/15737
   
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

