JigaoLuo commented on code in PR #8257:
URL: https://github.com/apache/arrow-rs/pull/8257#discussion_r2333288803

##########
parquet/src/column/writer/mod.rs:
##########
@@ -1104,12 +1104,23 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E> {
                 rep_levels_byte_len + def_levels_byte_len + values_data.buf.len();
             // Data Page v2 compresses values only.
-            match self.compressor {
+            let is_compressed = match self.compressor {
                 Some(ref mut cmpr) => {
+                    let buffer_len = buffer.len();
                     cmpr.compress(&values_data.buf, &mut buffer)?;
+                    if uncompressed_size <= buffer.len() - buffer_len {

Review Comment:
   I’ll follow up once there’s a plan for this future ticket.
   Just a note from me: the score is still a highly empirical value (it also depends on decompression speed), so I’ve embedded it into my rewriter and have been extensively rewriting Parquet files based on it.
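
   For readers following the hunk above, here is a minimal, self-contained sketch of the kind of size check it introduces: compress into the shared buffer, measure only the bytes that were appended, and keep them only if they are smaller than the input. This is an illustration under assumptions, not the PR's code. The hunk is cut off before the body of the `if`, so the fallback branch (truncate and store the raw bytes) is a guess at the intent. The names `append_maybe_compressed` and `toy_rle` are hypothetical, the closure stands in for `cmpr.compress` without its `Result`, and the sketch compares against the values length only, whereas the hunk compares against `uncompressed_size`.

```rust
/// Appends `values` to `buffer`, compressed only when that saves space.
/// Returns `true` if the appended bytes are the compressed form.
///
/// `compress` is a stand-in for a real codec (an assumption for this sketch):
/// it must append its output to the `Vec` it is given.
fn append_maybe_compressed<F>(values: &[u8], buffer: &mut Vec<u8>, compress: F) -> bool
where
    F: FnOnce(&[u8], &mut Vec<u8>),
{
    // Bytes already in the page buffer (for example, the level data).
    let start = buffer.len();
    compress(values, buffer);
    let compressed_len = buffer.len() - start;

    if values.len() <= compressed_len {
        // Compression did not help: discard its output and store raw bytes.
        buffer.truncate(start);
        buffer.extend_from_slice(values);
        false
    } else {
        true
    }
}

/// Toy run-length "codec" used only to exercise the function above.
/// Each run is written as a (length, byte) pair, with runs capped at 255.
fn toy_rle(input: &[u8], out: &mut Vec<u8>) {
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1usize;
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
}
```

   Calling `append_maybe_compressed(&[7u8; 100], &mut buf, toy_rle)` keeps the two-byte run-length form and returns `true`, while a short non-repeating input such as `[1, 2, 3, 4]` expands under the toy codec, so the raw bytes are stored and the function returns `false`. The returned flag plays the role that `is_compressed` appears to play in the hunk.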