wgtmac commented on code in PR #33897:
URL: https://github.com/apache/arrow/pull/33897#discussion_r1118309649


##########
cpp/src/parquet/file_writer.h:
##########
@@ -52,10 +52,12 @@ class PARQUET_EXPORT RowGroupWriter {
     virtual int current_column() const = 0;
     virtual void Close() = 0;
 
-    // total bytes written by the page writer
+    // total uncompressed bytes written by the page writer
     virtual int64_t total_bytes_written() const = 0;
     // total bytes still compressed but not written

Review Comment:
   ```suggestion
       /// \brief total bytes still compressed but not written by the page writer
   ```



##########
cpp/src/parquet/file_writer.h:
##########
@@ -52,10 +52,12 @@ class PARQUET_EXPORT RowGroupWriter {
     virtual int current_column() const = 0;
     virtual void Close() = 0;
 
-    // total bytes written by the page writer
+    // total uncompressed bytes written by the page writer

Review Comment:
   ```suggestion
       /// \brief total uncompressed bytes written by the page writer
   ```



##########
cpp/src/parquet/file_writer.h:
##########
@@ -90,8 +92,13 @@ class PARQUET_EXPORT RowGroupWriter {
    */
   int64_t num_rows() const;
 
+  /// \brief total uncompressed bytes written by the page writer
   int64_t total_bytes_written() const;
+  /// \brief total bytes still compressed but not written
+  /// It will always be 0 in un-buffered mode.

Review Comment:
   ```suggestion
     /// \brief total bytes still compressed but not written by the page writer.
     /// It will always return 0 from the SerializedPageWriter.
   ```



##########
cpp/src/parquet/column_writer.h:
##########
@@ -104,12 +104,15 @@ class PARQUET_EXPORT PageWriter {
   // Return the number of uncompressed bytes written (including header size)
   virtual int64_t WriteDictionaryPage(const DictionaryPage& page) = 0;
 
+  /// \brief The total number of bytes written as serialized data and

Review Comment:
   Simply put:
   - written: compressed and flushed to the sink
   - compressed: compressed but not yet flushed to the sink
   
   Am I correct?
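   
   To make the question concrete, here is a toy sketch of that reading. `ToyPageWriter`, `WritePage`, `Flush`, `flushed_bytes` and `pending_compressed_bytes` are hypothetical names for illustration, not the actual `PageWriter` interface. The sketch only models the flushed vs. buffered split and says nothing about whether a counter reports compressed or uncompressed sizes.
   
   ```cpp
   #include <cassert>
   #include <cstdint>
   
   // Toy model only; NOT the Parquet C++ PageWriter API.
   class ToyPageWriter {
    public:
     explicit ToyPageWriter(bool buffered) : buffered_(buffered) {}
   
     // Accept one already-compressed page of `compressed_size` bytes.
     void WritePage(int64_t compressed_size) {
       assert(compressed_size >= 0);
       if (buffered_) {
         // Buffered mode: the page is held in memory until Flush().
         pending_compressed_bytes_ += compressed_size;
       } else {
         // Un-buffered mode: the page goes straight to the sink.
         flushed_bytes_ += compressed_size;
       }
     }
   
     // Move every buffered page to the sink.
     void Flush() {
       flushed_bytes_ += pending_compressed_bytes_;
       pending_compressed_bytes_ = 0;
     }
   
     // "written": compressed and flushed to the sink.
     int64_t flushed_bytes() const { return flushed_bytes_; }
     // "compressed": compressed but not yet flushed to the sink.
     int64_t pending_compressed_bytes() const { return pending_compressed_bytes_; }
   
    private:
     bool buffered_;
     int64_t flushed_bytes_ = 0;
     int64_t pending_compressed_bytes_ = 0;
   };
   ```
   
   Under this reading, an un-buffered writer never has pending bytes, which would match the "always 0" wording in the doc comment.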



