zpz commented on issue #35318:
URL: https://github.com/apache/arrow/issues/35318#issuecomment-1545939249

   With pyarrow 12.0.0, I got this error after 333 files:
   
   ```
   error after 0.4718797499954235 seconds
   <class 'pyarrow.lib.ArrowException'>
   Unknown error: google::cloud::Status(UNKNOWN: Permanent error ReadObjectNotWrapped: WaitForHandles(): unexpected error code in curl_multi_*, [12]=Unrecoverable error in select/poll)
   ('Unknown error: google::cloud::Status(UNKNOWN: Permanent error ReadObjectNotWrapped: WaitForHandles(): unexpected error code in curl_multi_*, [12]=Unrecoverable error in select/poll)',)
   
   
   Traceback (most recent call last):
     File "/home/docker-user/sunny/tests/manual/parq.py", line 67, in <module>
       main()
     File "/home/docker-user/sunny/tests/manual/parq.py", line 41, in main
       n = len(batch)
     File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 171, in __len__
       return self.num_rows
     File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 203, in num_rows
       return self.metadata.num_rows
     File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 199, in metadata
       return self.file.metadata
     File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 194, in file
       self._file = self.load_file(self.path, lazy=self.lazy)
     File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 97, in load_file
       file = ParquetFile(pp, filesystem=ff)
     File "/usr/local/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 334, in __init__
       self.reader.open(
     File "pyarrow/_parquet.pyx", line 1220, in pyarrow._parquet.ParquetReader.open
     File "pyarrow/error.pxi", line 138, in pyarrow.lib.check_status
   pyarrow.lib.ArrowException: Unknown error: google::cloud::Status(UNKNOWN: Permanent error ReadObjectNotWrapped: WaitForHandles(): unexpected error code in curl_multi_*, [12]=Unrecoverable error in select/poll)
   ```
   
   I'm looping through objects of this class, each given a GCS path:
https://github.com/zpz/biglist/blob/7910c60524aeeee19a037245a61fc58d8638e600/src/biglist/_parquet.py#L49
   In the loop I call `len(obj)`, which calls the object's `load_file` with `lazy=True`.
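
A minimal sketch of that access pattern, reconstructed from the call chain in the traceback (`__len__` -> `num_rows` -> `metadata` -> `file` -> `ParquetFile(...)`). It is not the biglist code: `LazyParquetReader`, `opener`, and `open_with_pyarrow` are hypothetical names used only for illustration, and the GCS opener assumes pyarrow is installed with GCS support and that credentials are available.

```python
class LazyParquetReader:
    """Hypothetical stand-in for the reader class; not the biglist implementation."""

    def __init__(self, path, opener, lazy=True):
        self.path = path
        self.lazy = lazy
        self._opener = opener  # callable(path) -> object exposing .metadata.num_rows
        self._file = None

    @property
    def file(self):
        # The file is opened on first access only, so the remote read
        # happens inside len(obj), not in the constructor.
        if self._file is None:
            self._file = self._opener(self.path)
        return self._file

    @property
    def metadata(self):
        return self.file.metadata

    @property
    def num_rows(self):
        return self.metadata.num_rows

    def __len__(self):
        return self.num_rows


def open_with_pyarrow(path):
    # Assumption: this mirrors the `ParquetFile(pp, filesystem=ff)` call in the
    # traceback, with a GcsFileSystem passed as the filesystem.
    import pyarrow.parquet as pq
    from pyarrow.fs import GcsFileSystem

    return pq.ParquetFile(path, filesystem=GcsFileSystem())
```

With this shape, a loop such as `for p in paths: n = len(LazyParquetReader(p, open_with_pyarrow))` opens one remote file per iteration, which matches the point in the traceback where the error is raised.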


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
