candiduslynx commented on code in PR #36163:
URL: https://github.com/apache/arrow/pull/36163#discussion_r1234761708

##########
go/parquet/pqarrow/file_writer.go:
##########
@@ -181,7 +188,10 @@ func (fw *FileWriter) WriteBuffered(rec arrow.Record) error {
 }
 
 // Write an arrow Record Batch to the file, respecting the MaxRowGroupLength in the writer
-// properties to determine whether or not a new row group is created while writing.
+// properties to determine whether a new row group is created or not while writing.
+//
+// Performance-wise Write might be more favorable than WriteBuffered
+// especially if dealing with records that have a lot of rows.

Review Comment:
   Done in 9c4ca94137a143986e69feb07a0ab94baee7e334

##########
go/parquet/pqarrow/file_writer.go:
##########
@@ -134,6 +134,13 @@ func (fw *FileWriter) RowGroupTotalBytesWritten() int64 {
 	return 0
 }
 
+// WriteBuffered allows to write records and decide where to break your row group
+// based on the TotalBytesWritten rather than on the max row group len.
+// If using Records, this should be paired with NewBufferedRowGroup,
+// while Write will always write a new record as a row group in and of itself.
+//
+// Performance-wise WriteBuffered might be more favorable than Write
+// especially if dealing with lots of records that have only a small amount of rows.

Review Comment:
   Done in 9c4ca94137a143986e69feb07a0ab94baee7e334