lidavidm opened a new pull request #10483:
URL: https://github.com/apache/arrow/pull/10483


   This adds a bit more context to the error messages, though maybe this is a 
bit wordy?
   
   ```
   >>> ds.dataset('dataset4', format="ipc")
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/lidavidm/Code/upstream/arrow-12827/python/pyarrow/dataset.py", line 655, in dataset
       return _filesystem_dataset(source, **kwargs)
     File "/home/lidavidm/Code/upstream/arrow-12827/python/pyarrow/dataset.py", line 410, in _filesystem_dataset
       return factory.finish(schema)
     File "pyarrow/_dataset.pyx", line 2262, in pyarrow._dataset.DatasetFactory.finish
       return Dataset.wrap(GetResultValue(result))
     File "pyarrow/error.pxi", line 141, in pyarrow.lib.pyarrow_internal_check_status
       return check_status(status)
     File "pyarrow/error.pxi", line 97, in pyarrow.lib.check_status
       raise ArrowInvalid(message)
   pyarrow.lib.ArrowInvalid: Error creating dataset. Could not read schema from 'dataset4/foo.parquet': Could not open IPC input source 'dataset4/foo.parquet': File is too small: 9. Is this a 'ipc' file?
   >>> ds.dataset('dataset5', format="parquet")
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/lidavidm/Code/upstream/arrow-12827/python/pyarrow/dataset.py", line 655, in dataset
       return _filesystem_dataset(source, **kwargs)
     File "/home/lidavidm/Code/upstream/arrow-12827/python/pyarrow/dataset.py", line 410, in _filesystem_dataset
       return factory.finish(schema)
     File "pyarrow/_dataset.pyx", line 2262, in pyarrow._dataset.DatasetFactory.finish
       return Dataset.wrap(GetResultValue(result))
     File "pyarrow/error.pxi", line 141, in pyarrow.lib.pyarrow_internal_check_status
       return check_status(status)
     File "pyarrow/error.pxi", line 112, in pyarrow.lib.check_status
       raise IOError(message)
   OSError: Error creating dataset. Could not read schema from 'dataset5/foo.parquet': Could not open Parquet input source 'dataset5/foo.parquet': Invalid: Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.. Is this a 'parquet' file?
   ```
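
For readers who want to see how a layered message like the first one above can come about, here is a minimal pure-Python sketch of the general pattern: each layer catches the error from the layer below and re-raises it with its own context in front. It does not use pyarrow and is not the code in this PR. `InvalidFile`, `open_source`, `read_schema`, `create_dataset`, the `LABELS` table, and the 12-byte size threshold are all hypothetical names and values chosen for illustration, and where the "Is this a ... file?" hint is appended is likewise an assumption.

```python
class InvalidFile(Exception):
    """Hypothetical stand-in for an error such as pyarrow.lib.ArrowInvalid."""


# Hypothetical display names for each format string.
LABELS = {"ipc": "IPC", "parquet": "Parquet"}


def open_source(data: bytes) -> None:
    """Innermost layer: reject input that is too short to be a valid file.

    The 12-byte threshold is an arbitrary value for this sketch.
    """
    if len(data) < 12:
        raise InvalidFile(f"File is too small: {len(data)}")


def read_schema(path: str, fmt: str, data: bytes) -> None:
    """Middle layer: name the input source and hint at a format mismatch."""
    try:
        open_source(data)
    except InvalidFile as exc:
        raise InvalidFile(
            f"Could not open {LABELS[fmt]} input source '{path}': {exc}. "
            f"Is this a '{fmt}' file?"
        ) from exc


def create_dataset(path: str, fmt: str, data: bytes) -> None:
    """Outermost layer: say which high-level operation failed."""
    try:
        read_schema(path, fmt, data)
    except InvalidFile as exc:
        raise InvalidFile(
            f"Error creating dataset. Could not read schema from '{path}': {exc}"
        ) from exc


if __name__ == "__main__":
    try:
        # Nine bytes of junk stand in for a file that is not in the expected format.
        create_dataset("dataset4/foo.parquet", "ipc", b"x" * 9)
    except InvalidFile as exc:
        print(exc)
```

Each layer prepends its own context, so the message grows with the depth of the call chain, which is where the wordiness comes from. Using `raise ... from exc` keeps the original exception reachable through `__cause__`, so the shorter inner message is still available to callers.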

