BlakeOrth commented on code in PR #17050:
URL: https://github.com/apache/datafusion/pull/17050#discussion_r2267448996


##########
datafusion/core/src/datasource/listing_table_factory.rs:
##########
@@ -63,16 +63,29 @@ impl TableProviderFactory for ListingTableFactory {
             ))?
             .create(session_state, &cmd.options)?;
 
-        let file_extension = get_extension(cmd.location.as_str());
+        let mut table_path = ListingTableUrl::parse(&cmd.location)?;
+        let file_extension = match table_path.is_collection() {
+            true => "",
+            false => &get_extension(cmd.location.as_str()),
+        };

Review Comment:
   I'm happy to leave a comment or see if things work with a bit more of an in 
depth fix that sets the extension in a "more correct" way. Which would you 
prefer?
   
   The test assertion that pushed me towards this specific solution is here:
   
https://github.com/apache/datafusion/blob/ff9db020d782f25310b77f247af08beda1a80459/datafusion/core/src/datasource/listing_table_factory.rs#L324-L331
   
   On the surface it seems like actually setting the expected full file 
extension should be equivalent. Unless I'm mistaken the current behavior of the 
code is to list all of the files (empty extension) and filter by the set glob 
pattern `*.<file_type>.<compression_type>`. It seems like the results from 
setting the file extension correctly should be equivalent based on the 
extension/glob filtering code here:
   
https://github.com/apache/datafusion/blob/ff9db020d782f25310b77f247af08beda1a80459/datafusion/datasource/src/url.rs#L264-L272
    



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to