Nlte opened a new issue #56:
URL: https://github.com/apache/arrow-cookbook/issues/56
I'm getting 1 failing test when running the `make pytest` target.
```
Document: io
------------
**********************************************************************
File "io.rst", line 799, in default
Failed example:
    dataset = ds.dataset("s3://ursa-labs-taxi-data/2011",
                         partitioning=["month"])
    for f in dataset.files[:10]:
        print(f)
    print("...")
Exception raised:
    Traceback (most recent call last):
      File "/usr/local/Cellar/[email protected]/3.9.6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/doctest.py", line 1336, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest default[0]>", line 1, in <module>
        dataset = ds.dataset("s3://ursa-labs-taxi-data/2011",
      File "/Users/nathanael.leaute/Documents/github/arrow-cookbook/venv/lib/python3.9/site-packages/pyarrow/dataset.py", line 655, in dataset
        return _filesystem_dataset(source, **kwargs)
      File "/Users/nathanael.leaute/Documents/github/arrow-cookbook/venv/lib/python3.9/site-packages/pyarrow/dataset.py", line 410, in _filesystem_dataset
        return factory.finish(schema)
      File "pyarrow/_dataset.pyx", line 2402, in pyarrow._dataset.DatasetFactory.finish
      File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
      File "pyarrow/error.pxi", line 114, in pyarrow.lib.check_status
    OSError: Error creating dataset. Could not read schema from 'ursa-labs-taxi-data/2011/01/data.parquet': Could not open Parquet input source 'ursa-labs-taxi-data/2011/01/data.parquet': AWS Error [code 15]: Access Denied. Is this a 'parquet' file?
**********************************************************************
1 items had failures:
1 of 27 in default
27 tests in 1 items.
26 passed and 1 failed.
***Test Failed*** 1 failures.
```
It seems the ACL on the ursa-labs-taxi-data bucket doesn't allow public
access. I don't know whether you want to open up the bucket / prefix to the
public and incur the AWS bandwidth costs, though. Those are definitely a thing.
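
If opening up the bucket isn't an option, one possible workaround is to have the example tolerate the permission error instead of failing the run. The following is only a rough sketch, not what the cookbook does: `is_access_denied` and `list_taxi_files` are hypothetical helper names, and I have not verified that anonymous access (`anonymous=True`) is enough for this bucket or which region it needs.

```python
def is_access_denied(exc: OSError) -> bool:
    """Return True if the error message looks like an S3 permission failure."""
    # Assumption: matching on the message text, as seen in the traceback above.
    return "access denied" in str(exc).lower()


def list_taxi_files(limit=10, region=None):
    """Try to list dataset files anonymously; return None if access is denied."""
    # Imported lazily so is_access_denied() works without pyarrow installed.
    import pyarrow.dataset as ds
    from pyarrow import fs

    # Assumption: unsigned (anonymous) requests; the region is left to the caller.
    kwargs = {"anonymous": True}
    if region is not None:
        kwargs["region"] = region
    s3 = fs.S3FileSystem(**kwargs)

    try:
        # With an explicit filesystem, the path is given without the s3:// scheme.
        dataset = ds.dataset("ursa-labs-taxi-data/2011",
                             filesystem=s3,
                             partitioning=["month"])
    except OSError as exc:
        if is_access_denied(exc):
            return None  # bucket is not readable; the caller can skip the example
        raise
    return dataset.files[:limit]
```

A caller would treat a `None` result as "skip this example" rather than as a test failure; any other `OSError` still propagates.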