devinjdangelo commented on issue #9684:
URL: 
https://github.com/apache/arrow-datafusion/issues/9684#issuecomment-2007077634

   > Given I can see that the default behavior may not be ideal, perhaps we can 
add a configuration setting that controls how non-existent paths that don't end 
with / are handled
   
   We previously had `single_file_output`, a statement-level config 
that determined whether the path should be treated as a file or a directory. We 
intentionally removed it in 
https://github.com/apache/arrow-datafusion/pull/9041 in favor of inference 
based on whether the path ends in `/`.
   
   We could add a session-level config to control how a path is interpreted, as 
@alamb suggests. We could also improve the inference logic by 
additionally checking for a valid file extension before 
concluding that a path is a file. E.g.:
   
   1. `tmp/dataset/` -> is a folder since it ends in `/`
   2. `tmp/dataset` -> is still a folder: it does not end in `/`, but it has 
no valid file extension
   3. `tmp/file.parquet` -> is a file since it does not end in `/` and has a 
valid file extension `.parquet`
   4. `tmp/file.parquet/` -> is a folder since it ends in `/`
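
   The four cases above could be sketched roughly as follows. This is only an 
illustration of the proposed rule, not DataFusion's actual implementation: 
`PathKind`, `infer_path_kind`, and the `VALID_EXTENSIONS` list are hypothetical 
names, and which extensions count as "valid" is an assumption.

```rust
/// Hypothetical classification of an output path (not a DataFusion type).
#[derive(Debug, PartialEq, Eq)]
enum PathKind {
    File,
    Folder,
}

/// Assumed set of "valid" file extensions; purely illustrative.
const VALID_EXTENSIONS: &[&str] = &["parquet", "csv", "json", "arrow", "avro"];

/// Sketch of the proposed inference: a trailing `/` always means folder;
/// otherwise the path is a file only if it has a recognized extension.
fn infer_path_kind(path: &str) -> PathKind {
    // Cases 1 and 4: a trailing slash wins, whatever the extension.
    if path.ends_with('/') {
        return PathKind::Folder;
    }

    // Only the final path segment can carry the file extension.
    let last_segment = path.rsplit('/').next().unwrap_or(path);

    match last_segment.rsplit_once('.') {
        // Case 3: non-empty stem plus a recognized extension means file.
        Some((stem, ext))
            if !stem.is_empty()
                && VALID_EXTENSIONS.iter().any(|v| v.eq_ignore_ascii_case(ext)) =>
        {
            PathKind::File
        }
        // Case 2: no extension, or an unrecognized one, means folder.
        _ => PathKind::Folder,
    }
}
```

   Under these assumptions, a path such as `tmp/v1.0` would still be treated as 
a folder, because `0` is not in the assumed extension list.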
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
