This is an automated email from the ASF dual-hosted git repository.

alenka pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new 396b4759bf GH-37555: [Python] Update get_file_info_selector to ignore base directory (#37558)
396b4759bf is described below

commit 396b4759bfed70ad1f5d7724baaa7ee81654c6ea
Author: Alenka Frim <[email protected]>
AuthorDate: Thu Sep 14 12:42:48 2023 +0200

    GH-37555: [Python] Update get_file_info_selector to ignore base directory (#37558)
    
    ### Rationale for this change
    
    There have been some changes in the way fsspec lists directories as of version 2023.9.0 (see https://github.com/fsspec/filesystem_spec/pull/1329), which caused our tests to start failing.
    
    ### What changes are included in this PR?
    
    This PR updates `get_file_info_selector` in the [FSSpecHandler](https://arrow.apache.org/docs/_modules/pyarrow/fs.html#FSSpecHandler) class so that the base directory itself is excluded from the selected files, which keeps the behaviour of our spec.
    
    ### Are there any user-facing changes?
    
    No.
    
    * Closes: #37555
    
    Authored-by: AlenkaF <[email protected]>
    Signed-off-by: AlenkaF <[email protected]>
---
 python/pyarrow/fs.py | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/python/pyarrow/fs.py b/python/pyarrow/fs.py
index 567bea8ac0..36655c7d12 100644
--- a/python/pyarrow/fs.py
+++ b/python/pyarrow/fs.py
@@ -356,7 +356,12 @@ class FSSpecHandler(FileSystemHandler):
             selector.base_dir, maxdepth=maxdepth, withdirs=True, detail=True
         )
         for path, info in selected_files.items():
-            infos.append(self._create_file_info(path, info))
+            _path = path.strip("/")
+            base_dir = selector.base_dir.strip("/")
+            # Need to exclude base directory from selected files if present
+            # (fsspec filesystems, see GH-37555)
+            if _path != base_dir:
+                infos.append(self._create_file_info(path, info))
 
         return infos
 

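The filtering added in the hunk above can be sketched in isolation as follows. This is a minimal illustration, not the pyarrow implementation: `exclude_base_dir` is a hypothetical helper, and the `listing` dict is made-up data that only assumes the path-to-info mapping shape that the handler iterates over with `selected_files.items()`.

```python
def exclude_base_dir(selected_files, base_dir):
    """Return (path, info) pairs, skipping the entry for base_dir itself.

    Hypothetical helper: it mirrors the comparison in the patch, where both
    sides are stripped of leading and trailing slashes before comparing.
    """
    base = base_dir.strip("/")
    return [
        (path, info)
        for path, info in selected_files.items()
        if path.strip("/") != base
    ]


# Made-up listing in which the base directory appears among its own results.
listing = {
    "/data": {"name": "/data", "type": "directory", "size": 0},
    "/data/a.txt": {"name": "/data/a.txt", "type": "file", "size": 3},
    "/data/sub": {"name": "/data/sub", "type": "directory", "size": 0},
}

kept = [path for path, _ in exclude_base_dir(listing, "/data/")]
print(kept)  # ['/data/a.txt', '/data/sub']
```

Stripping slashes on both sides means `/data`, `data/` and `/data/` all compare equal, so the base directory is dropped regardless of how the filesystem or the caller spells it.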