Colin Jermain created ARROW-15982:
-------------------------------------

             Summary: [Python] parquet.read_table fails to parse home directory path
                 Key: ARROW-15982
                 URL: https://issues.apache.org/jira/browse/ARROW-15982
             Project: Apache Arrow
          Issue Type: Bug
    Affects Versions: 7.0.0
            Reporter: Colin Jermain


{{pyarrow.parquet.read_table}} fails to parse a path that uses the home directory 
shorthand ({{~}}). For example, {{"~/test.parquet"}} raises a {{FileNotFoundError}}, 
while {{"/home/user/test.parquet"}} reads the file correctly.
{code:java}
$ python -c "import pyarrow.parquet; pyarrow.parquet.read_table('~/test.parquet')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File ".../lib/python3.8/site-packages/pyarrow/parquet.py", line 1960, in read_table
    dataset = _ParquetDatasetV2(
  File ".../lib/python3.8/site-packages/pyarrow/parquet.py", line 1781, in __init__
    self._dataset = ds.dataset(path_or_paths, filesystem=filesystem,
  File ".../lib/python3.8/site-packages/pyarrow/dataset.py", line 667, in dataset
    return _filesystem_dataset(source, **kwargs)
  File ".../lib/python3.8/site-packages/pyarrow/dataset.py", line 412, in _filesystem_dataset
    fs, paths_or_selector = _ensure_single_source(source, filesystem)
  File ".../lib/python3.8/site-packages/pyarrow/dataset.py", line 388, in _ensure_single_source
    raise FileNotFoundError(path)
FileNotFoundError: ~/test.parquet
{code}
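The fix for this issue should be as simple as applying {{os.path.expanduser}} 
to user-supplied paths before they are resolved, for example ahead of the check 
in {{_ensure_single_source}} that raises the error above.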
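
Until a fix lands, a caller-side workaround is to expand the path before handing it to pyarrow. The following is a minimal sketch, not the proposed patch: {{expand_home}} is a hypothetical helper name that does not exist in pyarrow, and it assumes the input is a local filesystem path rather than a URI.

```python
import os


def expand_home(path):
    """Hypothetical helper: expand a leading '~' in a local path.

    Strings and os.PathLike objects are expanded with os.path.expanduser;
    anything else (for example an open file object) is returned unchanged.
    """
    if isinstance(path, (str, os.PathLike)):
        return os.path.expanduser(os.fspath(path))
    return path
```

With this helper, the failing call from the report would be written as {{pyarrow.parquet.read_table(expand_home("~/test.parquet"))}}. Paths without a leading {{~}} pass through unchanged, so the helper is safe to apply unconditionally to local paths.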



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
