zeroshade commented on code in PR #36163:
URL: https://github.com/apache/arrow/pull/36163#discussion_r1234553594


##########
go/parquet/pqarrow/file_writer.go:
##########
@@ -134,6 +134,13 @@ func (fw *FileWriter) RowGroupTotalBytesWritten() int64 {
        return 0
 }
 
+// WriteBuffered allows to write records and decide where to break your row group
+// based on the TotalBytesWritten rather than on the max row group len.
+// If using Records, this should be paired with NewBufferedRowGroup,
+// while Write will always write a new record as a row group in and of itself.
+//
+// Performance-wise WriteBuffered might be more favorable than Write
+// especially if dealing with lots of records that have only a small amount of rows.

Review Comment:
    Please mention the tradeoff: more memory is used, because the whole row 
group is kept buffered in memory before any of it is written (a Parquet file 
must write an entire column before writing the next column).


