DuanWeiFan opened a new pull request, #548:
URL: https://github.com/apache/arrow-go/pull/548

   ### Rationale for this change
   As discussed in [issue](https://github.com/apache/arrow-go/issues/511), 
when the pqarrow parquet writer writes more than one row group, it cannot track 
the bytes written and compressed bytes written for closed row groups.
   The methods `RowGroupTotalCompressedBytes` and `RowGroupTotalBytesWritten` 
return values for the **current** row group only.
   The idea is to introduce `TotalCompressedBytes()` and `TotalBytesWritten()` 
as an enhancement so that users can track written bytes and compressed bytes 
across all row groups.
   
   ### What changes are included in this PR?
   While making this change, I noticed that `totalCompressedBytes` is not 
being populated correctly in `row_group_writer.go`.
   It should follow the same approach as `bytesWritten`, which also accounts 
for all closed column writers.
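
   The accumulation pattern described here can be sketched with a toy model. This is not the arrow-go implementation: the types `columnStats`, `rowGroup`, and `fileWriter`, their fields, and the helper `demo` are all hypothetical, and only the method names `TotalBytesWritten` and `TotalCompressedBytes` are borrowed from this PR. The sketch assumes that counters from closed column writers are folded into the row group, and counters from closed row groups are folded into the file-level totals.

```go
package main

import "fmt"

// columnStats is a hypothetical stand-in for one column writer's counters.
type columnStats struct {
	bytesWritten    int64
	compressedBytes int64
}

// rowGroup is a hypothetical row group: it keeps running sums for closed
// columns and a pointer to the column that is still open, if any.
type rowGroup struct {
	closedBytes      int64
	closedCompressed int64
	open             *columnStats
}

// closeColumn folds the open column's counters into the closed sums.
// Both counters are accumulated the same way.
func (rg *rowGroup) closeColumn() {
	if rg.open == nil {
		return
	}
	rg.closedBytes += rg.open.bytesWritten
	rg.closedCompressed += rg.open.compressedBytes
	rg.open = nil
}

// totals returns closed sums plus whatever the open column holds.
func (rg *rowGroup) totals() (bytes, compressed int64) {
	bytes, compressed = rg.closedBytes, rg.closedCompressed
	if rg.open != nil {
		bytes += rg.open.bytesWritten
		compressed += rg.open.compressedBytes
	}
	return bytes, compressed
}

// fileWriter is a hypothetical file-level writer tracking all row groups.
type fileWriter struct {
	closedBytes      int64
	closedCompressed int64
	current          *rowGroup
}

// closeRowGroup folds the current row group's totals into the file sums.
func (fw *fileWriter) closeRowGroup() {
	if fw.current == nil {
		return
	}
	fw.current.closeColumn()
	b, c := fw.current.totals()
	fw.closedBytes += b
	fw.closedCompressed += c
	fw.current = nil
}

// TotalBytesWritten covers closed row groups plus the current one.
func (fw *fileWriter) TotalBytesWritten() int64 {
	total := fw.closedBytes
	if fw.current != nil {
		b, _ := fw.current.totals()
		total += b
	}
	return total
}

// TotalCompressedBytes covers closed row groups plus the current one.
func (fw *fileWriter) TotalCompressedBytes() int64 {
	total := fw.closedCompressed
	if fw.current != nil {
		_, c := fw.current.totals()
		total += c
	}
	return total
}

// demo writes two columns into a first row group, closes it, then leaves
// one column open in a second row group. The byte counts are made up.
func demo() (totalBytes, totalCompressed, currentBytes int64) {
	fw := &fileWriter{current: &rowGroup{}}
	fw.current.open = &columnStats{bytesWritten: 100, compressedBytes: 40}
	fw.current.closeColumn()
	fw.current.open = &columnStats{bytesWritten: 50, compressedBytes: 20}
	fw.closeRowGroup()

	fw.current = &rowGroup{open: &columnStats{bytesWritten: 200, compressedBytes: 80}}
	currentBytes, _ = fw.current.totals()
	return fw.TotalBytesWritten(), fw.TotalCompressedBytes(), currentBytes
}

func main() {
	b, c, cur := demo()
	fmt.Println(b, c, cur)
}
```

   In this model the current row group alone reports only the bytes of the second row group, while the file-level totals also include the closed first row group.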
   
   
   ### Are these changes tested?
   Yes. I added test cases to ensure those metrics are populated correctly.
   
   
   ### Are there any user-facing changes?
   Yes. There are two new methods introduced:
   1. `TotalCompressedBytes()`
   2. `TotalBytesWritten()`
   
   Existing users are not affected.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
