GitHub user mesejo created a discussion: write_dataset only writes the first batch

I have this snippet of code:

```python
from pathlib import Path

import pyarrow as pa
import pyarrow.dataset as ds


tempdir = Path(__file__).parent / 'temp'

table = pa.table({"a": range(1024)})
batches = table.to_batches(max_chunksize=2)
ds.write_dataset(
    batches,
    tempdir,
    format="parquet",
    preserve_order=True,
    use_threads=True,
)
```

I was expecting it to write all the batches, but it is only writing the first 
batch:
```bash
$ ls temp
part-0.parquet
``` 

Is this the expected output? I'm working with Python 3.11 and pyarrow 21.0.0.

```bash
$ ipython
Python 3.11.11 (main, Feb  5 2025, 19:11:07) [Clang 19.1.6 ]
Type 'copyright', 'credits' or 'license' for more information
IPython 9.4.0 -- An enhanced Interactive Python. Type '?' for help.
Tip: You can change the editing mode of IPython to behave more like vi, or 
emacs.

In [1]: import pyarrow as pa

In [2]: pa.__version__
Out[2]: '21.0.0'

In [3]: 
```
 


GitHub link: https://github.com/apache/arrow/discussions/47683
