GitHub user adamreeve added a comment to the discussion: write_dataset only writes the first batch
Hi @mesejo. What are the contents of `part-0.parquet`? I'd expect it to contain all 1024 rows. `write_dataset` doesn't split data into a separate file for each input batch. If you're trying to reduce the size of the files, maybe you want to use the `max_rows_per_file` parameter?

GitHub link: https://github.com/apache/arrow/discussions/47683#discussioncomment-14581051
