Antoine Pitrou created ARROW-17633:
--------------------------------------
Summary: [Python][CI] test_write_dataset_max_rows_per_file is flaky
Key: ARROW-17633
URL: https://issues.apache.org/jira/browse/ARROW-17633
Project: Apache Arrow
Issue Type: Bug
Components: Continuous Integration, Python
Reporter: Antoine Pitrou
I am starting to see intermittent but frequent CI failures in
{{test_write_dataset_max_rows_per_file}}.
Is {{write_dataset}} supposed to create the base directory?
{code}
=================================== FAILURES ===================================
_____________________ test_write_dataset_max_rows_per_file _____________________
tempdir =
PosixPath('/tmp/pytest-of-root/pytest-0/test_write_dataset_max_rows_pe0')
@pytest.mark.parquet
def test_write_dataset_max_rows_per_file(tempdir):
directory = tempdir / 'ds'
max_rows_per_file = 10
max_rows_per_group = 10
num_of_columns = 2
num_of_records = 35
record_batch = _generate_data_and_columns(num_of_columns,
num_of_records)
> ds.write_dataset(record_batch, directory, format="parquet",
max_rows_per_file=max_rows_per_file,
max_rows_per_group=max_rows_per_group)
usr/local/lib/python3.9/site-packages/pyarrow/tests/test_dataset.py:3919:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
usr/local/lib/python3.9/site-packages/pyarrow/dataset.py:988: in write_dataset
_filesystemdataset_write(
pyarrow/_dataset.pyx:2811: in pyarrow._dataset._filesystemdataset_write
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E FileNotFoundError: [Errno 2] Failed to open local file
'/tmp/pytest-of-root/pytest-0/test_write_dataset_max_rows_pe0/ds/part-1.parquet'.
Detail: [errno 2] No such file or directory
{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)