alamb commented on code in PR #9093:
URL: https://github.com/apache/arrow-rs/pull/9093#discussion_r3093206305


##########
parquet/src/arrow/record_reader/mod.rs:
##########
@@ -208,20 +219,31 @@ where
 
     /// Try to read one batch of data returning the number of records read
     fn read_one_batch(&mut self, batch_size: usize) -> Result<usize> {
+        // Update capacity hint to the largest batch size seen
+        if batch_size > self.capacity_hint {
+            self.capacity_hint = batch_size;
+        }
+
+        // Lazily initialize buffer on first read
+        let capacity_hint = self.capacity_hint;

Review Comment:
   It would be nice to avoid the allocation on a zero-sized read (end of stream).
   
   ```suggestion
       fn read_one_batch(&mut self, batch_size: usize) -> Result<usize> {
           if batch_size == 0 {
               return Ok(0);
           }
   
           // Update capacity hint to the largest batch size seen
           if batch_size > self.capacity_hint {
               self.capacity_hint = batch_size;
           }
   ```
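
To illustrate why the early return matters, here is a minimal, self-contained sketch. It is not the actual arrow-rs code: `SketchReader`, its `buffer` and `remaining` fields, and the `String` error type are hypothetical stand-ins. It assumes the buffer is created lazily with `Vec::with_capacity` on the first read, as the diff comment suggests. Without the guard, a zero-sized read on a fresh reader would still create the buffer.

```rust
/// Hypothetical stand-in for the record reader (not the arrow-rs type).
struct SketchReader {
    /// Largest batch size requested so far.
    capacity_hint: usize,
    /// Lazily created buffer; `None` until the first non-empty read.
    buffer: Option<Vec<i32>>,
    /// Number of records still available (assumed for illustration).
    remaining: usize,
}

impl SketchReader {
    fn new(remaining: usize) -> Self {
        Self { capacity_hint: 0, buffer: None, remaining }
    }

    /// Try to read one batch, returning the number of records read.
    fn read_one_batch(&mut self, batch_size: usize) -> Result<usize, String> {
        // Zero-sized read (e.g. end of stream): return before touching the buffer.
        if batch_size == 0 {
            return Ok(0);
        }

        // Update capacity hint to the largest batch size seen.
        if batch_size > self.capacity_hint {
            self.capacity_hint = batch_size;
        }

        // Lazily initialize the buffer on the first non-empty read.
        let capacity_hint = self.capacity_hint;
        let buffer = self
            .buffer
            .get_or_insert_with(|| Vec::with_capacity(capacity_hint));

        // Pretend to decode up to `batch_size` records into the buffer.
        let n = batch_size.min(self.remaining);
        buffer.extend(std::iter::repeat(0).take(n));
        self.remaining -= n;
        Ok(n)
    }
}
```

In this sketch, calling `read_one_batch(0)` on a fresh reader leaves `buffer` as `None` and `capacity_hint` at 0, so nothing is allocated until a caller actually asks for records.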



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
