David Li created ARROW-18060:
--------------------------------

             Summary: [C++] Writing a dataset with 0 rows doesn't create any files
                 Key: ARROW-18060
                 URL: https://issues.apache.org/jira/browse/ARROW-18060
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
    Affects Versions: 9.0.0
            Reporter: David Li

If the input data has no rows, no files are created. This is potentially unexpected, since it looks like "nothing happened". It might be nicer to create an empty file. With partitioning, though, that gets weird (there are no partition values), so an error might make more sense instead.

Reproduction in Python:

{code:python}
import tempfile
from pathlib import Path

import pyarrow
import pyarrow.dataset

print("PyArrow version:", pyarrow.__version__)

# A table with a single int64 column and zero rows.
table = pyarrow.table([
    [],
], schema=pyarrow.schema([
    ("ints", "int64"),
]))

with tempfile.TemporaryDirectory() as d:
    pyarrow.dataset.write_dataset(table, d, format="feather")
    # List whatever ended up in the output directory.
    print(list(Path(d).iterdir()))
{code}

Output:

{noformat}
> python repro.py
PyArrow version: 9.0.0
[]
{noformat}
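
Until the behavior is settled, a caller who would rather see an error than a silently empty directory could check the output themselves. The following is only a sketch of that idea on the user side, not a proposed Arrow change: {{write_and_verify}} and {{NoFilesWrittenError}} are hypothetical names, and the writer is passed in as a callable so the check itself does not depend on PyArrow.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


class NoFilesWrittenError(RuntimeError):
    """Hypothetical error: the write call returned without creating any file."""


def write_and_verify(write_fn: Callable[[str], None], base_dir: str) -> List[Path]:
    """Call write_fn(base_dir), then fail loudly if no files were created.

    write_fn is any callable taking the output directory, for example
    lambda d: pyarrow.dataset.write_dataset(table, d, format="feather").
    """
    write_fn(base_dir)
    # Walk recursively so files inside partition directories are counted too.
    files = sorted(p for p in Path(base_dir).rglob("*") if p.is_file())
    if not files:
        raise NoFilesWrittenError(f"write produced no files under {base_dir!r}")
    return files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Stand-in for a zero-row write: a writer that creates nothing.
        try:
            write_and_verify(lambda _dir: None, d)
        except NoFilesWrittenError as exc:
            print("caught:", exc)
```

With the reproduction above, the writer callable would wrap the {{pyarrow.dataset.write_dataset}} call, so a zero-row table would surface as an exception rather than an empty directory.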