luispcunha opened a new issue, #36725:
URL: https://github.com/apache/arrow/issues/36725

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Seeking to the end of a file with `f.seek(0, 2)` fails for files opened 
through a Google Cloud Storage filesystem, while seeking to one byte before 
the end works. 
   Code to reproduce the error:
   ```python
   from pyarrow import fs
   import zipfile
   
   filesystem = fs.GcsFileSystem()
   with filesystem.open_input_file("bucket/archive.zip") as f:
     size = f.size()
     f.seek(-1, 2)  # works: one byte before the end of the file
     f.seek(0, 2)   # fails: raises ArrowInvalid (OUT_OF_RANGE)
   ```
   
   Traceback:
   
   ```python
   Traceback (most recent call last):
     File ".../test.py", line 18, in <module>
       zip = zipfile.ZipFile(f)
             ^^^^^^^^^^^^^^^^^^
     File ".../lib/python3.11/zipfile.py", line 1302, in __init__
       self._RealGetContents()
     File ".../lib/python3.11/zipfile.py", line 1365, in _RealGetContents
       endrec = _EndRecData(fp)
                ^^^^^^^^^^^^^^^
     File ".../lib/python3.11/zipfile.py", line 292, in _EndRecData
       fpin.seek(0, 2)
     File "pyarrow/io.pxi", line 323, in pyarrow.lib.NativeFile.seek
     File "pyarrow/error.pxi", line 100, in pyarrow.lib.check_status
    pyarrow.lib.ArrowInvalid: google::cloud::Status(OUT_OF_RANGE: Permanent 
error ReadObjectNotWrapped: <?xml version='1.0' 
encoding='UTF-8'?><Error><Code>InvalidRange</Code><Message>The requested range 
cannot be satisfied.</Message><Details>bytes=485-</Details></Error>)
   ```
   
   The behavior is the same with several zip files I've tried. Performing the 
same operation on the same zip file in a local file system works as expected.
   I came across this while trying to use the `zipfile` module on a file in a 
Google Cloud Storage filesystem, because `zipfile` calls `f.seek(0, 2)` to 
determine the size of the file. My end goal is something like the following:
   
   ```python
   from pyarrow import fs
   import zipfile
   
   filesystem = fs.GcsFileSystem()
   with filesystem.open_input_file("bucket/archive.zip") as f:
     archive = zipfile.ZipFile(f)
     members = archive.namelist()
   ```
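
A possible workaround, given only as a minimal sketch: wrap the file in a small object that tracks the position in Python, so that `seek(0, 2)` never issues a request at the end of the object. `LazySeekReader` and `FakeRemoteFile` are hypothetical names introduced for illustration. The wrapper relies only on `size()` and `read_at(nbytes, offset)`, both of which exist on `pyarrow.NativeFile`. Whether `read_at` on a `GcsFileSystem` file avoids the failing range request is an assumption that has not been verified; the stand-in below only mimics the failure in memory.

```python
import io
import zipfile


class LazySeekReader:
    """Hypothetical wrapper: keeps the position in Python and only touches
    the underlying file through size() and read_at(nbytes, offset)."""

    def __init__(self, raw):
        self._raw = raw
        self._size = raw.size()
        self._pos = 0

    def seekable(self):
        return True

    def readable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        # Compute the target position without calling the underlying seek().
        if whence == io.SEEK_SET:
            target = offset
        elif whence == io.SEEK_CUR:
            target = self._pos + offset
        elif whence == io.SEEK_END:
            target = self._size + offset
        else:
            raise ValueError(f"invalid whence: {whence!r}")
        if target < 0:
            raise OSError("negative seek position")
        self._pos = target
        return self._pos

    def read(self, nbytes=-1):
        remaining = max(self._size - self._pos, 0)
        if nbytes is None or nbytes < 0 or nbytes > remaining:
            nbytes = remaining
        if nbytes == 0:
            # At or past the end: return without issuing any request.
            return b""
        data = self._raw.read_at(nbytes, self._pos)
        self._pos += len(data)
        return data


class FakeRemoteFile:
    """In-memory stand-in (assumption) that rejects reads starting at EOF,
    mimicking the InvalidRange error from the traceback."""

    def __init__(self, payload):
        self._payload = payload

    def size(self):
        return len(self._payload)

    def read_at(self, nbytes, offset):
        if offset >= len(self._payload):
            raise OSError("The requested range cannot be satisfied.")
        return self._payload[offset:offset + nbytes]


def build_zip():
    # Build a small archive in memory to exercise the wrapper.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hi")
    return buf.getvalue()


payload = build_zip()
with zipfile.ZipFile(LazySeekReader(FakeRemoteFile(payload))) as archive:
    members = archive.namelist()
print(members)
```

With a real file, the idea would be to pass `LazySeekReader(f)` to `zipfile.ZipFile` in place of `f`. This does not fix the underlying `seek(0, 2)` behavior; it only sidesteps it on the Python side.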
   
   **Environment:** PyArrow 12.0.1, Python 3.11, macOS X
   
   ### Component(s)
   
   Python

