wgtmac commented on code in PR #33897:
URL: https://github.com/apache/arrow/pull/33897#discussion_r1118309649
##########
cpp/src/parquet/file_writer.h:
##########
@@ -52,10 +52,12 @@ class PARQUET_EXPORT RowGroupWriter {
virtual int current_column() const = 0;
virtual void Close() = 0;
- // total bytes written by the page writer
+ // total uncompressed bytes written by the page writer
virtual int64_t total_bytes_written() const = 0;
// total bytes still compressed but not written
Review Comment:
```suggestion
/// \brief total bytes still compressed but not written by the page writer
```
##########
cpp/src/parquet/file_writer.h:
##########
@@ -52,10 +52,12 @@ class PARQUET_EXPORT RowGroupWriter {
virtual int current_column() const = 0;
virtual void Close() = 0;
- // total bytes written by the page writer
+ // total uncompressed bytes written by the page writer
Review Comment:
```suggestion
/// \brief total uncompressed bytes written by the page writer
```
##########
cpp/src/parquet/file_writer.h:
##########
@@ -90,8 +92,13 @@ class PARQUET_EXPORT RowGroupWriter {
*/
int64_t num_rows() const;
+ /// \brief total uncompressed bytes written by the page writer
int64_t total_bytes_written() const;
+ /// \brief total bytes still compressed but not written
+ /// It will always be 0 in un-buffered mode.
Review Comment:
```suggestion
/// \brief total bytes still compressed but not written by the page writer.
/// It will always return 0 from the SerializedPageWriter.
```
##########
cpp/src/parquet/column_writer.h:
##########
@@ -104,12 +104,15 @@ class PARQUET_EXPORT PageWriter {
// Return the number of uncompressed bytes written (including header size)
virtual int64_t WriteDictionaryPage(const DictionaryPage& page) = 0;
+ /// \brief The total number of bytes written as serialized data and
Review Comment:
Simply put:
- written: compressed and flushed to the sink
- compressed: compressed but not yet flushed to the sink
Is that correct?
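The distinction above can be sketched with a toy writer. This is only an illustration of the reading proposed in this comment, not Arrow's `PageWriter`: the class `ToyPageWriter`, its `buffered` flag, and the accessor `bytes_compressed_not_flushed()` are hypothetical names, and pages are assumed to arrive already compressed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy writer used only to illustrate the two counters.
// It is NOT the Arrow/Parquet PageWriter API.
class ToyPageWriter {
 public:
  explicit ToyPageWriter(bool buffered) : buffered_(buffered) {}

  // Accepts a page that is assumed to be already compressed.
  void WritePage(const std::string& compressed_page) {
    if (buffered_) {
      // Buffered mode: hold the page back, count it as "compressed".
      pending_.push_back(compressed_page);
      pending_bytes_ += static_cast<int64_t>(compressed_page.size());
    } else {
      // Un-buffered mode: the page goes straight to the sink.
      Flush(compressed_page);
    }
  }

  // Flushes every held-back page to the sink.
  void Close() {
    for (const auto& page : pending_) Flush(page);
    pending_.clear();
    pending_bytes_ = 0;
  }

  // Bytes that have reached the sink.
  int64_t total_bytes_written() const { return written_bytes_; }
  // Bytes compressed but not yet flushed (always 0 when un-buffered).
  int64_t bytes_compressed_not_flushed() const { return pending_bytes_; }

 private:
  void Flush(const std::string& page) {
    sink_ += page;
    written_bytes_ += static_cast<int64_t>(page.size());
  }

  bool buffered_;
  std::string sink_;
  std::vector<std::string> pending_;
  int64_t written_bytes_ = 0;
  int64_t pending_bytes_ = 0;
};
```

Under this reading, a buffered writer reports a non-zero "compressed" count until `Close()` moves those bytes into the "written" count, while an un-buffered writer never reports pending bytes.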