Joris Van den Bossche created ARROW-10998:
---------------------------------------------

             Summary: [C++] Filesystems: detect if URI is passed where a file 
path is required and raise informative error
                 Key: ARROW-10998
                 URL: https://issues.apache.org/jira/browse/ARROW-10998
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++, Python
            Reporter: Joris Van den Bossche


Currently, when passing a URI to a filesystem method (except for {{from_uri}}) 
or to other functions that accept a filesystem object, you can get a rather 
cryptic error message (e.g. "No response body" for S3 in the example below).

Ideally, the filesystem object knows its own URI scheme, so it can detect when 
a user passes a URI instead of a file path and raise a more informative error 
message.

Example with S3:

{code:python}
>>> from pyarrow.fs import S3FileSystem
>>> fs = S3FileSystem(region="us-east-2")
>>> fs.get_file_info('s3://ursa-labs-taxi-data/2016/01/')
...
OSError: When getting information for key '/ursa-labs-taxi-data/2016/01' in bucket 's3:': AWS Error [code 100]: No response body.

>>> import pyarrow.parquet as pq
>>> table = pq.read_table('s3://ursa-labs-taxi-data/2016/01/data.parquet', filesystem=fs)
...
OSError: When getting information for key '/ursa-labs-taxi-data/2016/01/data.parquet' in bucket 's3:': AWS Error [code 100]: No response body.
{code}

With a local filesystem, no error is raised at all; the URI is treated as a 
path and reported as not found:

{code:python}
>>> from pyarrow.fs import LocalFileSystem
>>> fs = LocalFileSystem()
>>> fs.get_file_info("file:///home")
<FileInfo for 'file:///home': type=FileType.NotFound>
{code}
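
To make the proposed check concrete, here is a minimal pure-Python sketch of 
the idea, not an actual Arrow implementation. The helper name 
{{ensure_path_not_uri}}, its {{fs_scheme}} parameter and the error wording are 
all hypothetical; the real check would presumably live in the C++ filesystem 
layer. It assumes that a leading "scheme://" prefix is enough to recognise a 
URI:

```python
import re

# An RFC 3986 style scheme followed by "://", e.g. "s3://" or "file://".
# A Windows path such as "C:/data" does not match (only one slash).
_URI_PREFIX = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*)://")


def ensure_path_not_uri(path: str, fs_scheme: str) -> str:
    """Return `path` unchanged, or raise ValueError if it looks like a URI.

    Hypothetical helper for illustration only; `fs_scheme` is the scheme
    the filesystem object is assumed to know about itself (e.g. "s3").
    """
    match = _URI_PREFIX.match(path)
    if match is None:
        return path

    scheme = match.group(1).lower()
    if scheme == fs_scheme.lower():
        hint = (
            f"remove the '{scheme}://' prefix, or use "
            "FileSystem.from_uri to create the filesystem from the URI"
        )
    else:
        hint = (
            f"the scheme '{scheme}' does not match this filesystem "
            f"('{fs_scheme}')"
        )
    raise ValueError(f"Expected a file path but got a URI: {path!r}; {hint}")
```

With such a check in place, {{ensure_path_not_uri('s3://ursa-labs-taxi-data/2016/01/', 's3')}} 
would fail immediately with a message that names the URI, instead of reaching 
the S3 request, while {{'ursa-labs-taxi-data/2016/01/'}} would pass through 
unchanged.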

cc [~apitrou]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
