lilianm commented on code in PR #8527:
URL: https://github.com/apache/arrow-rs/pull/8527#discussion_r2397549308


##########
parquet/src/column/writer/mod.rs:
##########
@@ -1073,6 +1073,7 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E> {
                 if let Some(ref mut cmpr) = self.compressor {
                     let mut compressed_buf = Vec::with_capacity(uncompressed_size);
                     cmpr.compress(&buffer[..], &mut compressed_buf)?;
+                    compressed_buf.shrink_to_fit();

Review Comment:
   The cost of the copy is pretty insignificant because memcpy speed is around 
10,000 MB/s while compression speed is around 600 MB/s. Under the hood, the 
vector uses the allocator's shrink method: 
https://doc.rust-lang.org/alloc/alloc/trait.Allocator.html#method.shrink. In 
standard malloc, the threshold for switching to mmap allocation is 128 KB, and 
for a shrink the system only unmaps pages, so no memory copy is needed.
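   
   As a rough illustration of the effect (a minimal sketch, not code from this 
PR): the helper `shrunk_capacities` and the "keep every 4th byte" step are 
hypothetical stand-ins for the real compressor, used only to produce a buffer 
whose length is well below its reserved capacity. Whether the shrink copies or 
just unmaps pages depends on the allocator, as described above.

```rust
/// Hypothetical helper: reserves `n` bytes the way the writer reserves
/// `uncompressed_size`, fills only part of the buffer, then shrinks it.
/// Returns (len, capacity before shrink, capacity after shrink).
fn shrunk_capacities(n: usize) -> (usize, usize, usize) {
    let input = vec![0u8; n];

    // Reserve the full uncompressed size up front.
    let mut compressed_buf: Vec<u8> = Vec::with_capacity(n);

    // Stand-in for `cmpr.compress(...)`: keep every 4th byte, so the
    // buffer ends up holding roughly a quarter of what was reserved.
    compressed_buf.extend(input.iter().step_by(4).copied());

    let before = compressed_buf.capacity();
    // Hand the unused tail of the allocation back to the allocator.
    compressed_buf.shrink_to_fit();
    let after = compressed_buf.capacity();

    (compressed_buf.len(), before, after)
}

fn main() {
    let (len, before, after) = shrunk_capacities(1 << 20);
    println!("len = {len}, capacity before = {before}, capacity after = {after}");
}
```

   Without the shrink, the buffer keeps the full reserved capacity for as long 
as the page is held in memory, even though only `len` bytes are in use.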
   
   In V2, the page buffer is not reserved.
   
   As for not compressing the page when compression is bad, it could be a good 
idea to apply that to V1 as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
