zpz commented on issue #35318:
URL: https://github.com/apache/arrow/issues/35318#issuecomment-1545939249
With pyarrow 12.0.0, I got this error after 333 files:
```
error after 0.4718797499954235 seconds
<class 'pyarrow.lib.ArrowException'>
Unknown error: google::cloud::Status(UNKNOWN: Permanent error ReadObjectNotWrapped: WaitForHandles(): unexpected error code in curl_multi_*, [12]=Unrecoverable error in select/poll)
('Unknown error: google::cloud::Status(UNKNOWN: Permanent error ReadObjectNotWrapped: WaitForHandles(): unexpected error code in curl_multi_*, [12]=Unrecoverable error in select/poll)',)
Traceback (most recent call last):
  File "/home/docker-user/sunny/tests/manual/parq.py", line 67, in <module>
    main()
  File "/home/docker-user/sunny/tests/manual/parq.py", line 41, in main
    n = len(batch)
  File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 171, in __len__
    return self.num_rows
  File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 203, in num_rows
    return self.metadata.num_rows
  File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 199, in metadata
    return self.file.metadata
  File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 194, in file
    self._file = self.load_file(self.path, lazy=self.lazy)
  File "/usr/local/lib/python3.10/site-packages/biglist/_parquet.py", line 97, in load_file
    file = ParquetFile(pp, filesystem=ff)
  File "/usr/local/lib/python3.10/site-packages/pyarrow/parquet/core.py", line 334, in __init__
    self.reader.open(
  File "pyarrow/_parquet.pyx", line 1220, in pyarrow._parquet.ParquetReader.open
  File "pyarrow/error.pxi", line 138, in pyarrow.lib.check_status
pyarrow.lib.ArrowException: Unknown error: google::cloud::Status(UNKNOWN: Permanent error ReadObjectNotWrapped: WaitForHandles(): unexpected error code in curl_multi_*, [12]=Unrecoverable error in select/poll)
```
I'm looping through objects of the class defined here, each of which is given a GCS path:
https://github.com/zpz/biglist/blob/7910c60524aeeee19a037245a61fc58d8638e600/src/biglist/_parquet.py#L49
In the loop I call `len(obj)`, which calls the object's `load_file` with
`lazy=True`.
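
A minimal sketch of the loop described above, with biglist stripped out, might help check whether the failure reproduces with pyarrow alone. This is not the actual biglist code: `count_rows` and `gcs_metadata_opener` are hypothetical names, and the sketch assumes that `pyarrow.fs.GcsFileSystem()` with default credentials can stand in for the filesystem object that biglist passes as `filesystem=ff`.

```python
def count_rows(paths, open_metadata):
    """Sum num_rows over `paths`, reporting which file index fails.

    `open_metadata` is any callable mapping a path to an object with a
    `num_rows` attribute (hypothetical helper, not part of biglist).
    """
    total = 0
    for i, path in enumerate(paths, start=1):
        try:
            total += open_metadata(path).num_rows
        except Exception as exc:
            # Surface how many files were opened before the failure.
            raise RuntimeError(f"failed on file #{i}: {path}") from exc
    return total


def gcs_metadata_opener():
    """Build an opener backed by pyarrow's GCS filesystem (assumed setup)."""
    # Imported lazily so count_rows can be exercised without pyarrow installed.
    import pyarrow.fs as pafs
    import pyarrow.parquet as pq

    fs = pafs.GcsFileSystem()  # assumption: default credentials are available

    def open_metadata(path):
        # Same ParquetFile call as in the traceback; `.metadata.num_rows`
        # is what `len(obj)` ends up reading.
        return pq.ParquetFile(path, filesystem=fs).metadata

    return open_metadata
```

Usage would be `count_rows(paths, gcs_metadata_opener())`, where `paths` is the list of GCS object paths from the original loop. If this also fails after a few hundred files, that would suggest the problem sits in pyarrow or the GCS client rather than in biglist.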