thvasilo opened a new issue, #37001:
URL: https://github.com/apache/arrow/issues/37001

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I'm trying to use a 
[localstack](https://docs.localstack.cloud/getting-started/quickstart/)-created 
S3 bucket as a way to test my application without interacting with the real S3 service.
   
   To do that I launch an S3 endpoint using `localstack start -d` and create my 
PyArrow S3FS using:
   
   ```
   s3_fs = fs.S3FileSystem(endpoint_override="localhost:4566")
   ```
   
   When I try to interact with files in the simulated bucket, however, I get the 
following error:
   
   
   ```
   In [223]: nrows = pq.read_metadata(f"{file_bucket}/{file_key}", 
filesystem=s3_fs).num_rows
   ---------------------------------------------------------------------------
   OSError                                   Traceback (most recent call last)
   <ipython-input-223-a51dff0bbcaa> in <module>
   ----> 1 nrows = pq.read_metadata(f"{file_bucket}/{file_key}", 
filesystem=s3_fs).num_rows
   
   /[...]/lib/python3.7/site-packages/pyarrow/parquet/core.py in 
read_metadata(where, memory_map, decryption_properties, filesystem)
      3479     file_ctx = nullcontext()
      3480     if filesystem is not None:
   -> 3481         file_ctx = where = filesystem.open_input_file(where)
      3482
      3483     with file_ctx:
   
   [...]/python3.7/site-packages/pyarrow/_fs.pyx in 
pyarrow._fs.FileSystem.open_input_file()
   [...]/lib/python3.7/site-packages/pyarrow/error.pxi in 
pyarrow.lib.pyarrow_internal_check_status()
   [...]/lib/python3.7/site-packages/pyarrow/error.pxi in 
pyarrow.lib.check_status()
   
   OSError: When reading information for key 'redacted/path/to/file' in bucket 
'example-bucket': AWS Error NETWORK_CONNECTION during HeadObject operation: 
curlCode: 60, SSL peer certificate or SSH remote key was not OK
   ```
   
   Another user seems to have hit the same problem with on-prem S3 and had 
to use `s3fs` together with `PyFileSystem` and `FSSpecHandler` to work around it: 
https://discuss.ray.io/t/ssl-peer-certificate-or-ssh-remote-key-was-not-ok/11091/2
   
   Fully reproducible example:
   
   ```
   pip install localstack awscli-local pyarrow
   localstack start -d
   awslocal s3 mb s3://example-bucket
   python <<HEREDOC
   import numpy as np
   import pandas as pd
   import pyarrow as pa
   import pyarrow.parquet as pq
   df = pd.DataFrame({'one': [-1, np.nan, 2.5],
                      'two': ['foo', 'bar', 'baz'],
                      'three': [True, False, True]},
                      index=list('abc'))
   table = pa.Table.from_pandas(df)
   pq.write_table(table, 'example.parquet')
   HEREDOC
   awslocal s3 cp example.parquet s3://example-bucket/
   python <<HEREDOC
   from pyarrow import fs
   import pyarrow.parquet as pq
   s3_fs = fs.S3FileSystem(endpoint_override="localhost:4566")
   pq.read_metadata("example-bucket/example.parquet", filesystem=s3_fs)
   HEREDOC
   ```
   
   This results in:
   
   ```
   Traceback (most recent call last):
     File "<stdin>", line 4, in <module>
     File "[.../]lib/python3.9/site-packages/pyarrow/parquet/core.py", line 
3481, in read_metadata
       file_ctx = where = filesystem.open_input_file(where)
     File "pyarrow/_fs.pyx", line 770, in pyarrow._fs.FileSystem.open_input_file
     File "pyarrow/error.pxi", line 144, in 
pyarrow.lib.pyarrow_internal_check_status
     File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
   OSError: When reading information for key 'example.parquet' in bucket 
'example-bucket': AWS Error NETWORK_CONNECTION during HeadObject operation: 
curlCode: 60, SSL peer certificate or SSH remote key was not OK
   ```
   
   ### Component(s)
   
   Python

