pitrou commented on PR #36286:
URL: https://github.com/apache/arrow/pull/36286#issuecomment-1611501785
Thanks for the explanation, @mapleFU. Perhaps we can add an optional
argument to `parquet::arrow::FileWriterImpl::WriteTable` to tell it whether to
act more like `WriteRecordBatch`?
Something like:
```c++
/// \brief Write a Table to Parquet.
///
/// If `use_buffering` is false, then any pending row group is closed
/// at the beginning and at the end of this call.
/// If `use_buffering` is true, this function reuses an existing
/// buffered row group until the chunk size is met, and leaves
/// the last row group open for further writes.
/// It is recommended to set `use_buffering` to true to minimize
/// the number of row groups, especially when calling `WriteTable`
/// with small tables.
///
/// \param table Arrow table to write.
/// \param chunk_size maximum number of rows to write per row group.
/// \param use_buffering Whether to potentially buffer data.
virtual ::arrow::Status WriteTable(
    const ::arrow::Table& table,
    int64_t chunk_size = DEFAULT_MAX_ROW_GROUP_LENGTH,
    bool use_buffering = false) = 0;
```
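For illustration, a minimal sketch of how callers might use this, assuming the proposed `use_buffering` parameter lands on `parquet::arrow::FileWriter` (the `WriteSmallTables` helper is hypothetical; everything else is the existing writer API):
```c++
#include <memory>
#include <vector>

#include "arrow/status.h"
#include "arrow/table.h"
#include "parquet/arrow/writer.h"
#include "parquet/properties.h"

// Hypothetical helper: append many small tables to one open writer.
// With use_buffering=true, consecutive small writes would accumulate
// in a single buffered row group instead of each closing its own.
::arrow::Status WriteSmallTables(
    const std::vector<std::shared_ptr<::arrow::Table>>& tables,
    parquet::arrow::FileWriter* writer) {
  for (const auto& table : tables) {
    ARROW_RETURN_NOT_OK(writer->WriteTable(
        *table, /*chunk_size=*/parquet::DEFAULT_MAX_ROW_GROUP_LENGTH,
        /*use_buffering=*/true));  // proposed parameter
  }
  // Close() flushes the last buffered row group to the file.
  return writer->Close();
}
```
With `use_buffering = false` (the default), each call would instead close any pending row group before and after writing, matching the current `WriteTable` behavior.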