tustvold commented on code in PR #4280:
URL: https://github.com/apache/arrow-rs/pull/4280#discussion_r1207411974


##########
parquet/src/arrow/arrow_writer/mod.rs:
##########
@@ -152,43 +147,75 @@ impl<W: Write> ArrowWriter<W> {
         self.writer.flushed_row_groups()
     }
 
-    /// Enqueues the provided `RecordBatch` to be written
+    /// Returns the length in bytes of the current in progress row group
+    pub fn in_progress_size(&self) -> usize {

Review Comment:
   Aah, this is an oversight on my part. The reported size only covers pages 
that have already been flushed; any data buffered but not yet flushed to a page 
is not counted. That uncounted data should be at most 1 page per column, so the 
shortfall should be less than the max page size (1MB) per column.
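
   To make the bound concrete, here is a minimal sketch of how a caller might 
pad the reported value to get a conservative estimate. The helper name 
`in_progress_size_upper_bound`, the constant `ASSUMED_MAX_PAGE_SIZE`, and the 
`num_leaf_columns` parameter are hypothetical and not part of this PR or the 
crate; the 1MB figure is the page size limit mentioned above, and a writer 
configured with a different limit would need a different value.

```rust
/// Page size limit assumed from the comment above (1MB per page).
/// Hypothetical constant for illustration; not an item from the parquet crate.
pub const ASSUMED_MAX_PAGE_SIZE: usize = 1024 * 1024;

/// Hypothetical helper: pads the size reported for the in-progress row group
/// with one page per column, since each column may hold up to one page of
/// buffered data that has not yet been flushed and is therefore not counted.
///
/// `reported` is the value returned by the writer, `num_leaf_columns` is the
/// number of leaf columns being written, and `max_page_size` is the configured
/// page size limit in bytes.
pub fn in_progress_size_upper_bound(
    reported: usize,
    num_leaf_columns: usize,
    max_page_size: usize,
) -> usize {
    // Saturating arithmetic keeps the estimate well-defined on overflow.
    reported.saturating_add(num_leaf_columns.saturating_mul(max_page_size))
}
```

   For example, with 10MB reported and 4 columns, the padded estimate under 
these assumptions would be 14MB. This is only an estimate of the worst case 
described above, not an exact accounting of buffered data.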



