JigaoLuo commented on code in PR #8257:
URL: https://github.com/apache/arrow-rs/pull/8257#discussion_r2333288803


##########
parquet/src/column/writer/mod.rs:
##########
@@ -1104,12 +1104,23 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E> {
                     rep_levels_byte_len + def_levels_byte_len + values_data.buf.len();
 
                 // Data Page v2 compresses values only.
-                match self.compressor {
+                let is_compressed = match self.compressor {
                     Some(ref mut cmpr) => {
+                        let buffer_len = buffer.len();
                         cmpr.compress(&values_data.buf, &mut buffer)?;
+                        if uncompressed_size <= buffer.len() - buffer_len {

Review Comment:
   I’ll follow up once there’s a plan for that future ticket. One note from 
me: the score is still a highly empirical value (and it depends on the 
decompression speed), so I’ve embedded it into my rewriter and have been 
rewriting Parquet files extensively based on it.
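
For readers following the hunk above, here is a minimal sketch of the "keep the compressed bytes only if they are actually smaller" check. It is not the arrow-rs implementation: `choose_payload` is a hypothetical helper, and it compares plain byte slices rather than the writer's real buffers.

```rust
/// Hypothetical helper (not an arrow-rs API): pick the bytes to store for a page.
///
/// Returns the chosen payload and a flag saying whether it is compressed.
/// If compression did not shrink the data, fall back to the raw bytes so
/// readers do not pay decompression cost for no size benefit.
fn choose_payload(raw: &[u8], compressed: Vec<u8>) -> (Vec<u8>, bool) {
    if compressed.len() >= raw.len() {
        // Compression was not worthwhile: store the values uncompressed.
        (raw.to_vec(), false)
    } else {
        // Compression helped: keep the compressed bytes.
        (compressed, true)
    }
}
```

The returned flag plays the role of `is_compressed` in the diff. A score-based variant would swap the plain size comparison for a tuned threshold, which is why the value ends up empirical.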


