alamb commented on code in PR #9093:
URL: https://github.com/apache/arrow-rs/pull/9093#discussion_r3021837200


##########
parquet/src/arrow/array_reader/builder.rs:
##########
@@ -96,15 +96,26 @@ pub struct ArrayReaderBuilder<'a> {
     parquet_metadata: Option<&'a ParquetMetaData>,
     /// metrics
     metrics: &'a ArrowReaderMetrics,
+    /// Batch size for pre-allocating internal buffers
+    batch_size: usize,
 }
 
 impl<'a> ArrayReaderBuilder<'a> {
-    pub fn new(row_groups: &'a dyn RowGroups, metrics: &'a ArrowReaderMetrics) -> Self {
+    /// Create a new `ArrayReaderBuilder`
+    ///
+    /// `batch_size` is used to pre-allocate internal buffers with the expected capacity,
+    /// avoiding reallocations when reading the first batch of data.
+    pub fn new(
+        row_groups: &'a dyn RowGroups,
+        metrics: &'a ArrowReaderMetrics,
+        batch_size: usize,

Review Comment:
   `ArrayReaderBuilder::new` is a public API, so adding a required `batch_size` parameter is a breaking API change.
   
   Maybe we could avoid changing the signature by adding a new `with_batch_size` method instead
   
   something like
   ```rust
   let reader = ArrayReaderBuilder::new(row_groups, metrics)
       .with_batch_size(batch_size);
   ```
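   
   To make the idea concrete, here is a minimal, self-contained sketch of what the builder side could look like. `RowGroups`, `ArrowReaderMetrics` and `NoRowGroups` below are simplified stand-ins rather than the real parquet types, and `DEFAULT_BATCH_SIZE` (with its value of 1024) and the `batch_size()` getter are assumptions made for illustration, not anything taken from this PR:
   
   ```rust
   // Simplified stand-ins for the real parquet types (hypothetical, illustration only).
   pub trait RowGroups {}
   pub struct ArrowReaderMetrics;
   
   /// Assumed fallback used when the caller never sets a batch size (hypothetical value).
   pub const DEFAULT_BATCH_SIZE: usize = 1024;
   
   #[allow(dead_code)]
   pub struct ArrayReaderBuilder<'a> {
       row_groups: &'a dyn RowGroups,
       metrics: &'a ArrowReaderMetrics,
       /// Batch size for pre-allocating internal buffers
       batch_size: usize,
   }
   
   impl<'a> ArrayReaderBuilder<'a> {
       /// Same signature as before the PR, so existing callers keep compiling.
       pub fn new(row_groups: &'a dyn RowGroups, metrics: &'a ArrowReaderMetrics) -> Self {
           Self {
               row_groups,
               metrics,
               batch_size: DEFAULT_BATCH_SIZE,
           }
       }
   
       /// Opt-in setter: takes the builder by value and returns it for chaining.
       pub fn with_batch_size(mut self, batch_size: usize) -> Self {
           self.batch_size = batch_size;
           self
       }
   
       /// Hypothetical getter, here only so the sketch can be checked.
       pub fn batch_size(&self) -> usize {
           self.batch_size
       }
   }
   
   /// Empty `RowGroups` implementation so the sketch can be exercised on its own.
   pub struct NoRowGroups;
   impl RowGroups for NoRowGroups {}
   ```
   
   With this shape, callers that never call `with_batch_size` keep working unchanged, and the default only matters for how much is pre-allocated up front.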
   
   🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.