David Li created ARROW-18060:
--------------------------------

             Summary: [C++] Writing a dataset with 0 rows doesn't create any files
                 Key: ARROW-18060
                 URL: https://issues.apache.org/jira/browse/ARROW-18060
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
    Affects Versions: 9.0.0
            Reporter: David Li


If the input data has no rows, no files get created. This is potentially 
unexpected, as it looks like "nothing happened". It might be nicer to create an 
empty file. With partitioning, though, that gets weird (there are no 
partition values), so an error might make more sense instead.

Reproduction in Python
{code:python}
import tempfile
from pathlib import Path

import pyarrow
import pyarrow.dataset

print("PyArrow version:", pyarrow.__version__)

table = pyarrow.table([
    [],
], schema=pyarrow.schema([
    ("ints", "int64"),
]))

with tempfile.TemporaryDirectory() as d:
    pyarrow.dataset.write_dataset(table, d, format="feather")
    print(list(Path(d).iterdir()))
{code}
Output
{noformat}
> python repro.py
PyArrow version: 9.0.0
[]
{noformat}
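
Until the behavior is settled, a caller could guard against the silent no-op on their side. The following is only a sketch of the "raise an error" option mentioned above, not a proposal for the actual C++ change. The names write_dataset_checked and EmptyDatasetError are hypothetical and not part of PyArrow, and the check assumes the input is an in-memory object exposing num_rows (as pyarrow.Table does), which does not cover every input type write_dataset accepts.

```python
class EmptyDatasetError(ValueError):
    """Hypothetical error type for this sketch; not part of PyArrow."""


def write_dataset_checked(table, base_dir, write_fn, **kwargs):
    """Refuse to write zero-row input instead of silently creating no files.

    `write_fn` is assumed to have the same calling convention as
    pyarrow.dataset.write_dataset(data, base_dir, **kwargs).
    `table` is assumed to expose `num_rows`, as pyarrow.Table does.
    """
    if table.num_rows == 0:
        # With partitioning there are no partition values to derive a
        # path from, so raising is the least surprising option here.
        raise EmptyDatasetError(
            f"refusing to write 0 rows to {base_dir!r}: no files would be created"
        )
    write_fn(table, base_dir, **kwargs)
```

With the reproduction above, this would be called as write_dataset_checked(table, d, pyarrow.dataset.write_dataset, format="feather") and would raise rather than leave the directory empty.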



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
