westonpace commented on issue #10634:
URL: https://github.com/apache/arrow/issues/10634#issuecomment-872502967


   Ok, yes, I looked a little further.  I did not realize that the dataset code 
calls CreateDir even if the bucket already exists (it uses CreateDir to test if 
the bucket exists).  So this is ARROW-13228.  If you are able to wait for 
version 5.0.0 (about a month out), the fix will be included there.  
Alternatively, you can use the latest nightly build or build from source.
   
   Another workaround may be to use s3fs together with PyFileSystem and FSSpecHandler:
   
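   ```
   import s3fs
   import pyarrow.fs
   # fsspec-based S3 filesystem (requires the s3fs package)
   s3fs_instance = s3fs.S3FileSystem()
   # Wrap it so pyarrow can use it wherever a filesystem argument is accepted
   filesystem = pyarrow.fs.PyFileSystem(pyarrow.fs.FSSpecHandler(s3fs_instance))
   ```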
   
   A final workaround could be to create your own filesystem implementation 
that wraps a pyarrow.fs.S3FileSystem instance (e.g. proxy pattern) and forwards 
every call to it except `create_dir`, which simply returns True without doing 
anything.



