Timothy Luna created ARROW-16437:
------------------------------------

             Summary: Mocking tests with moto not currently feasible.
                 Key: ARROW-16437
                 URL: https://issues.apache.org/jira/browse/ARROW-16437
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
    Affects Versions: 7.0.0
         Environment: Ubuntu environment
                      Python 3.9
                      PyArrow 7.0.0
                      moto 3.1.7
            Reporter: Timothy Luna

## Unable to use moto to mock S3 for testing purposes.

I've been using AWSWrangler as a loading utility in a custom application and am attempting to remove it as a dependency, because PyArrow Dataset provides all the S3 functionality I need.

The issue is that when PyArrow attempts to determine the FileSystem type, it appears to sidestep moto and fails with:

```sh
============================================================================================================================ FAILURES ============================================================================================================================
_____________________________________________________________________________________________________________________ test__pull_cached_data _____________________________________________________________________________________________________________________

    @pytest.mark.usefixtures("s3")
    def test__pull_cached_data():
        """Tests pull cached data, both happy and sad."""
        # Here we're going to make a folder, transfer in some files,
        # and pull them!
        with tempfile.TemporaryDirectory() as t:
            # This commented code functions.
            # from awswrangler.s3 import read_parquet
            # sillything = read_parquet('s3://test-bucket/test_metadata.parquet')
>           a = ds.dataset('s3://test-bucket/test_metadata.parquet')

tests/custom_application/loading/test_load.py:156:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.9/site-packages/pyarrow/dataset.py:667: in dataset
    return _filesystem_dataset(source, **kwargs)
/usr/local/lib/python3.9/site-packages/pyarrow/dataset.py:412: in _filesystem_dataset
    fs, paths_or_selector = _ensure_single_source(source, filesystem)
/usr/local/lib/python3.9/site-packages/pyarrow/dataset.py:373: in _ensure_single_source
    filesystem, path = _resolve_filesystem_and_path(path, filesystem)
/usr/local/lib/python3.9/site-packages/pyarrow/fs.py:179: in _resolve_filesystem_and_path
    filesystem, path = FileSystem.from_uri(path)
pyarrow/_fs.pyx:350: in pyarrow._fs.FileSystem.from_uri
    ???
pyarrow/error.pxi:143: in pyarrow.lib.pyarrow_internal_check_status
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   OSError: When resolving region for bucket 'test-bucket': AWS Error [code 99]: curlCode: 35, SSL connect error

pyarrow/error.pxi:114: OSError
--------------------------------------------------------------------------------------------------------------------- Captured stderr setup ----------------------------------------------------------------------------------------------------------------------
INFO:botocore.credentials:Found credentials in environment variables.
----------------------------------------------------------------------------------------------------------------------- Captured log setup -----------------------------------------------------------------------------------------------------------------------
INFO     botocore.credentials:credentials.py:1114 Found credentials in environment variables.
==================================================================================================================== short test summary info =====================================================================================================================
FAILED tests/custom_application/loading/test_load.py::test__pull_cached_data - OSError: When resolving region for bucket 'test-bucket': AWS Error [code 99]: curlCode: 35, SSL connect error
======================================================================================================================= 1 failed in 17.82s =======================================================================================================================
```

Please let me know if you need additional information!

--
This message was sent by Atlassian Jira
(v8.20.7#820007)