GitHub user mesejo created a discussion: write_dataset only writes the first
batch
I have this snippet of code:
```python
from pathlib import Path

import pyarrow as pa
import pyarrow.dataset as ds

tempdir = Path(__file__).parent / 'temp'
table = pa.table({"a": range(1024)})
batches = table.to_batches(max_chunksize=2)
ds.write_dataset(batches, tempdir, format="parquet", preserve_order=True,
                 use_threads=True)
```
I expected it to write all the batches, but it appears to write only the first
one; the output directory contains a single file:
```bash
$ ls temp
part-0.parquet
```
Is this the expected output? I'm working with Python 3.11 and pyarrow 21.0.0:
```bash
$ ipython
Python 3.11.11 (main, Feb 5 2025, 19:11:07) [Clang 19.1.6 ]
Type 'copyright', 'credits' or 'license' for more information
IPython 9.4.0 -- An enhanced Interactive Python. Type '?' for help.
Tip: You can change the editing mode of IPython to behave more like vi, or
emacs.
In [1]: import pyarrow as pa
In [2]: pa.__version__
Out[2]: '21.0.0'
In [3]:
```
GitHub link: https://github.com/apache/arrow/discussions/47683