tiago-ssantos opened a new issue, #178:
URL: https://github.com/apache/arrow-rs-object-store/issues/178

   **Describe the bug**
   In DataFusion, listing all files 
(https://github.com/apache/arrow-datafusion/blob/c8a3d589889dd1e67047de89db8b4ff56f90f04c/datafusion/core/src/datasource/listing/url.rs#L151)
 with a LocalFileSystem object store returns them in a different order depending on the OS.
   
   **To Reproduce**
   Having a folder:
   
![image](https://user-images.githubusercontent.com/46322886/228520559-ad5572a4-4522-42d6-b948-7378dcf43135.png)
   
   and requesting to list the content of the folder using:
   ```
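   use futures::TryStreamExt;
   use object_store::{local::LocalFileSystem, path::Path, ObjectStore};

   async fn list_test() {
       // Resolve the local folder into an object store path.
       let path = Path::from_filesystem_path("./files").unwrap();
   
       let integration = LocalFileSystem::new();
       let list_stream = integration.list(Some(&path)).await.unwrap();
   
       // Collect the listing and print each location in the order returned.
       let res: Vec<_> = list_stream.try_collect().await.unwrap();
       res.iter().for_each(|file| println!("{0}", file.location));
   }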
   ```
   On Windows/Ubuntu the result is:
   ```
   files/file1.parquet
   files/file3.parquet
   ```
   
   but on macOS Ventura:
   ```
   files/file3.parquet
   files/file1.parquet
   ```
   
   
   **Expected behavior**
   We expect the listing order to be the same on every OS. This code is called when 
inferring the schema 
(https://github.com/apache/arrow-datafusion/blob/c8a3d589889dd1e67047de89db8b4ff56f90f04c/datafusion/core/src/datasource/listing/table.rs#L431)
 and the order of multiple files matters, because the schemas of all the files 
are merged.
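
For illustration only, here is a minimal standard-library sketch of how a caller could get an OS-independent order by sorting after listing. It does not use object_store and is not a description of how LocalFileSystem works internally. It relies on the fact that `std::fs::read_dir` documents its iteration order as platform and filesystem dependent, which is consistent with the difference reported above. The helper name `list_sorted` is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of object_store): returns the regular files
/// directly under `dir`, sorted by path so the order does not depend on the OS.
fn list_sorted(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    // read_dir yields entries in a platform and filesystem dependent order.
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        if entry.file_type()?.is_file() {
            files.push(entry.path());
        }
    }
    // Sorting makes the result deterministic across platforms.
    files.sort();
    Ok(files)
}
```

Calling `list_sorted(Path::new("./files"))` on the folder above would return `file1.parquet` before `file3.parquet` on every OS. The same idea, sorting the collected entries by `location` before merging schemas, could presumably be applied on the DataFusion side.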

