xudong963 commented on code in PR #15865:
URL: https://github.com/apache/datafusion/pull/15865#discussion_r2065855163


##########
datafusion/core/src/datasource/listing/table.rs:
##########
@@ -1129,7 +1130,17 @@ impl ListingTable {
         let (file_group, inexact_stats) =
             get_files_with_limit(files, limit, self.options.collect_stat).await?;
 
-        let file_groups = file_group.split_files(self.options.target_partitions);
+        let mut file_groups = file_group.split_files(self.options.target_partitions);
+        let (schema_mapper, _) = DefaultSchemaAdapterFactory::from_schema(self.schema())

Review Comment:
   @alamb While working on https://github.com/apache/datafusion/pull/15852, I found that the listing table does not actually have the issue described in https://github.com/apache/datafusion/issues/15689. All files here already share the same schema, because when the table is created, every fetched file uses the `SchemaMapper` to reorder its schema. See https://github.com/apache/datafusion/blob/main/datafusion/datasource-parquet/src/opener.rs#L206.
   
   What we should fix instead is making the file schema match the listing table schema. Typically, when users specify partition columns, the table schema carries the extra partition column info, so I moved the mapper down into the `compute_all_files_statistics` method in this commit: https://github.com/apache/datafusion/pull/15852/commits/689fc669c47581b86d6e4c12d73210f997c4cb10.
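
To illustrate the mismatch, here is a rough sketch in plain Rust (std only). It is not DataFusion's API: `ColumnStats` and `map_stats_to_table_schema` are hypothetical names made up for this example, and leaving partition column statistics as unknown is an assumption of the sketch. It shows per-file column statistics being reordered to follow the table schema, with table-only columns (such as partition columns) filled in as unknown:

```rust
/// Hypothetical per-column statistics; `None` means "unknown".
#[derive(Debug, Clone, PartialEq)]
struct ColumnStats {
    null_count: Option<u64>,
}

/// Reorder per-file column statistics so they follow the table schema.
///
/// Columns that exist in the table schema but not in the file
/// (for example partition columns) get unknown statistics.
fn map_stats_to_table_schema(
    file_columns: &[&str],
    file_stats: &[ColumnStats],
    table_columns: &[&str],
) -> Vec<ColumnStats> {
    table_columns
        .iter()
        .map(|name| {
            file_columns
                .iter()
                // Find where this table column lives in the file, if anywhere.
                .position(|c| c == name)
                .and_then(|idx| file_stats.get(idx).cloned())
                // Not in the file: statistics are unknown in this sketch.
                .unwrap_or(ColumnStats { null_count: None })
        })
        .collect()
}
```

For a file with columns `["b", "a"]` and a table schema `["a", "b", "year"]` (where `year` is a partition column), the result has three entries in table order, and the entry for `year` is unknown.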
 
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

