BlakeOrth commented on code in PR #17050: URL: https://github.com/apache/datafusion/pull/17050#discussion_r2267448996
########## datafusion/core/src/datasource/listing_table_factory.rs: ########## @@ -63,16 +63,29 @@ impl TableProviderFactory for ListingTableFactory { ))? .create(session_state, &cmd.options)?; - let file_extension = get_extension(cmd.location.as_str()); + let mut table_path = ListingTableUrl::parse(&cmd.location)?; + let file_extension = match table_path.is_collection() { + true => "", + false => &get_extension(cmd.location.as_str()), + }; Review Comment: I'm happy to leave a comment or see if things work with a bit more of an in depth fix that sets the extension in a "more correct" way. Which would you prefer? The test assertion that pushed me towards this specific solution is here: https://github.com/apache/datafusion/blob/ff9db020d782f25310b77f247af08beda1a80459/datafusion/core/src/datasource/listing_table_factory.rs#L324-L331 On the surface it seems like actually setting the expected full file extension should be equivalent. Unless I'm mistaken the current behavior of the code is to list all of the files (empty extension) and filter by the set glob pattern `*.<file_type>.<compression_type>`. It seems like the results from setting the file extension correctly should be equivalent based on the extension/glob filtering code here: https://github.com/apache/datafusion/blob/ff9db020d782f25310b77f247af08beda1a80459/datafusion/datasource/src/url.rs#L264-L272 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org