lidavidm opened a new pull request #10483:
URL: https://github.com/apache/arrow/pull/10483
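This adds a bit more context to the error messages raised when creating a
dataset fails because a file can't be read in the requested format, though
maybe the result is a bit wordy? Examples: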
```
>>> ds.dataset('dataset4', format="ipc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/lidavidm/Code/upstream/arrow-12827/python/pyarrow/dataset.py", line 655, in dataset
    return _filesystem_dataset(source, **kwargs)
  File "/home/lidavidm/Code/upstream/arrow-12827/python/pyarrow/dataset.py", line 410, in _filesystem_dataset
    return factory.finish(schema)
  File "pyarrow/_dataset.pyx", line 2262, in pyarrow._dataset.DatasetFactory.finish
    return Dataset.wrap(GetResultValue(result))
  File "pyarrow/error.pxi", line 141, in pyarrow.lib.pyarrow_internal_check_status
    return check_status(status)
  File "pyarrow/error.pxi", line 97, in pyarrow.lib.check_status
    raise ArrowInvalid(message)
pyarrow.lib.ArrowInvalid: Error creating dataset. Could not read schema from 'dataset4/foo.parquet': Could not open IPC input source 'dataset4/foo.parquet': File is too small: 9. Is this a 'ipc' file?
>>> ds.dataset('dataset5', format="parquet")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/lidavidm/Code/upstream/arrow-12827/python/pyarrow/dataset.py", line 655, in dataset
    return _filesystem_dataset(source, **kwargs)
  File "/home/lidavidm/Code/upstream/arrow-12827/python/pyarrow/dataset.py", line 410, in _filesystem_dataset
    return factory.finish(schema)
  File "pyarrow/_dataset.pyx", line 2262, in pyarrow._dataset.DatasetFactory.finish
    return Dataset.wrap(GetResultValue(result))
  File "pyarrow/error.pxi", line 141, in pyarrow.lib.pyarrow_internal_check_status
    return check_status(status)
  File "pyarrow/error.pxi", line 112, in pyarrow.lib.check_status
    raise IOError(message)
OSError: Error creating dataset. Could not read schema from 'dataset5/foo.parquet': Could not open Parquet input source 'dataset5/foo.parquet': Invalid: Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.. Is this a 'parquet' file?
```
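
For illustration only, the way each layer adds its own context to the message can be sketched in plain Python. This is not the code in this PR: `InvalidFile`, `open_source`, `read_schema`, `create_dataset`, and the 10-byte size threshold are all hypothetical names and values chosen to reproduce the shape of the first message above.

```python
class InvalidFile(Exception):
    """Hypothetical stand-in for an error type such as ArrowInvalid."""


def open_source(path: str, size: int) -> None:
    # Lowest layer: it only knows the file is too small to be valid.
    # The threshold of 10 bytes is an arbitrary value for this sketch.
    if size < 10:
        raise InvalidFile(f"File is too small: {size}")


def read_schema(path: str, size: int, format_name: str) -> None:
    # Middle layer: adds which input source could not be opened.
    try:
        open_source(path, size)
    except InvalidFile as exc:
        raise InvalidFile(
            f"Could not open {format_name.upper()} input source '{path}': {exc}"
        ) from exc


def create_dataset(path: str, size: int, format_name: str) -> None:
    # Top layer: adds the overall operation and a hint about the format.
    try:
        read_schema(path, size, format_name)
    except InvalidFile as exc:
        raise InvalidFile(
            f"Error creating dataset. Could not read schema from '{path}': "
            f"{exc}. Is this a '{format_name}' file?"
        ) from exc


if __name__ == "__main__":
    try:
        create_dataset("dataset4/foo.parquet", 9, "ipc")
    except InvalidFile as exc:
        print(exc)
```

Because every layer repeats the path and prefixes its own description, the final message grows quickly, which is the wordiness trade-off mentioned above.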