David Li created ARROW-15285:
--------------------------------
Summary: [C++] write_dataset with delete_matching occasionally fails with "Path does not exist"
Key: ARROW-15285
URL: https://issues.apache.org/jira/browse/ARROW-15285
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: David Li
With the bug in ARROW-15265 fixed, the reproducer from that issue now
occasionally fails with this:
{noformat}
Traceback (most recent call last):
  File "/home/lidavidm/Code/upstream/arrow-15265/python/test.py", line 37, in <module>
    ds.write_dataset(
  File "/home/lidavidm/Code/upstream/arrow-15265/python/pyarrow/dataset.py", line 931, in write_dataset
    _filesystemdataset_write(
  File "pyarrow/_dataset.pyx", line 2658, in pyarrow._dataset._filesystemdataset_write
    check_status(CFileSystemDataset.Write(c_options, c_scanner))
  File "pyarrow/error.pxi", line 114, in pyarrow.lib.check_status
    raise IOError(message)
OSError: Path does not exist 'my-bucket/test8.parquet/col1=c'
{noformat}
The path in the error differs each time it fails, so the failure is not
deterministic. It is also relatively rare: 2 out of 100 runs when I checked
just now.
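
For anyone trying to measure the failure rate, a minimal sketch of a harness follows. It is not the reproducer from ARROW-15265, which is not shown here. The names count_failures and write_once are hypothetical, and the table contents, the hive-style partitioning on col1, and the base_dir argument are assumptions chosen only to resemble the path in the traceback above.

```python
def count_failures(action, runs=100):
    """Call action() `runs` times and collect the messages of any OSError raised."""
    failures = []
    for _ in range(runs):
        try:
            action()
        except OSError as exc:
            failures.append(str(exc))
    return failures


def write_once(base_dir):
    """Hypothetical stand-in for the reproducer: one partitioned write that
    deletes matching partitions before writing. Requires pyarrow."""
    import pyarrow as pa
    import pyarrow.dataset as ds

    # Assumed data; the original reproducer's table is not shown in this issue.
    table = pa.table({"col1": ["a", "b", "c"], "col2": [1, 2, 3]})
    ds.write_dataset(
        table,
        base_dir,
        format="parquet",
        partitioning=ds.partitioning(
            pa.schema([("col1", pa.string())]), flavor="hive"
        ),
        existing_data_behavior="delete_matching",
    )
```

Calling count_failures(lambda: write_once(base_dir)) and printing the distinct messages would show both how often the write fails and whether the reported path varies between failures. The traceback above points at a bucket-style path, so a local base_dir may well not trigger the error at all.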