etseidl commented on PR #6281:
URL: https://github.com/apache/arrow-rs/pull/6281#issuecomment-2302741114

   I agree that results from the compression bench are needed. Running your 
test code in release mode speeds things up dramatically.
   ```
   Debug
   init with macro, num_elems:10000, cost:7.771µs
   init with resize, num_elems:20000, cost:130.553µs
   init with set_len, num_elems:30000, cost:7.661µs
   
   Release
   init with macro, num_elems:10000, cost:100ns
   init with resize, num_elems:20000, cost:7.49µs
   init with set_len, num_elems:30000, cost:80ns
   ```
   `resize` is still much slower than the other two approaches, but it is 
roughly 17X faster with optimizations enabled (130.553µs vs. 7.49µs). I'd be 
curious to see how overall throughput through the decompressor is affected by 
this change.
   
   FWIW I tried replacing another 
[use](https://github.com/apache/arrow-rs/blob/30db5dce9ca0996457063f1b5308076a6372c438/parquet/src/arrow/array_reader/fixed_len_byte_array.rs#L467) 
of `resize` with the `reserve/write_bytes/set_len` sequence proposed here and 
saw a modest (2-4%) speedup in decoding times.
   ```
   group                                                                                                   new_resize                             to_prim
   -----                                                                                                   ----------                             -------
   arrow_array_reader/BYTE_STREAM_SPLIT/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs     1.00    411.7±2.05µs        ? ?/sec    1.04    429.1±7.35µs        ? ?/sec
   arrow_array_reader/BYTE_STREAM_SPLIT/Decimal128Array/byte_stream_split encoded, optional, half NULLs    1.00    523.3±2.65µs        ? ?/sec    1.02    532.7±4.72µs        ? ?/sec
   arrow_array_reader/BYTE_STREAM_SPLIT/Decimal128Array/byte_stream_split encoded, optional, no NULLs      1.00    414.0±3.41µs        ? ?/sec    1.04    428.6±6.21µs        ? ?/sec
   arrow_array_reader/BYTE_STREAM_SPLIT/Float16Array/byte_stream_split encoded, mandatory, no NULLs        1.00     52.1±0.26µs        ? ?/sec    1.04     54.1±0.78µs        ? ?/sec
   arrow_array_reader/BYTE_STREAM_SPLIT/Float16Array/byte_stream_split encoded, optional, half NULLs       1.00    109.8±0.65µs        ? ?/sec    1.04    113.7±1.61µs        ? ?/sec
   arrow_array_reader/BYTE_STREAM_SPLIT/Float16Array/byte_stream_split encoded, optional, no NULLs         1.00     56.7±1.02µs        ? ?/sec    1.04     59.0±0.52µs        ? ?/sec
   ```
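
For readers unfamiliar with the pattern, here is a minimal sketch of what a `reserve/write_bytes/set_len` replacement for `resize` might look like when growing a `Vec<u8>` with zeroed bytes. The helper names `extend_zeroed_resize` and `extend_zeroed_set_len` are hypothetical and are not taken from the PR or from the parquet crate; only the standard-library calls are real.

```rust
use std::ptr;

/// Baseline: append `additional` zeroed bytes using `Vec::resize`.
/// (Hypothetical helper name, for illustration only.)
fn extend_zeroed_resize(buf: &mut Vec<u8>, additional: usize) {
    let new_len = buf.len() + additional;
    buf.resize(new_len, 0);
}

/// Alternative: append `additional` zeroed bytes with
/// `reserve` + `ptr::write_bytes` + `set_len`.
/// (Hypothetical helper name, for illustration only.)
fn extend_zeroed_set_len(buf: &mut Vec<u8>, additional: usize) {
    // Ensure capacity for at least `additional` more elements.
    buf.reserve(additional);
    let old_len = buf.len();
    // SAFETY: after `reserve`, capacity >= old_len + additional, so the
    // range written below lies inside the allocation. Every byte in that
    // range is set to 0 before `set_len` makes it part of the vector.
    unsafe {
        ptr::write_bytes(buf.as_mut_ptr().add(old_len), 0, additional);
        buf.set_len(old_len + additional);
    }
}

fn main() {
    let mut a = vec![1u8, 2, 3];
    let mut b = a.clone();

    extend_zeroed_resize(&mut a, 5);
    extend_zeroed_set_len(&mut b, 5);

    // Both approaches should leave the buffers in the same state.
    assert_eq!(a, b);
    println!("{:?}", b);
}
```

Both helpers produce the same contents; whether the `unsafe` variant is worth it depends on measurements like the ones above, which show only a small difference for this decoder.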

