[GitHub] [arrow] jorgecarleitao opened a new pull request #8796: [Rust] [Experiment] Vec vs current allocations

GitBox Sat, 28 Nov 2020 13:15:59 -0800


jorgecarleitao opened a new pull request #8796:
URL: https://github.com/apache/arrow/pull/8796



   @nevi-me, @alamb, @jhorstmann: I have been playing around with the buffers 
on the arrow crate and, just for fun, tried replacing all our `memory` 
logic with a simple `Vec<u8>`. Perhaps unsurprisingly to you, but a bit 
surprisingly to me, this leads to a significant improvement on almost all 
benches. That is, even though memory alignment is good for some kernels, 
overall our allocations and memory handling seem to be much worse than `Vec`. 
   
   I am not proposing that we drop the alignment to cache lines, as it is 
theoretically more sound. However, practically (and based on our 
microbenchmarks alone), there seems to be a good case here. Maybe this behavior 
is different if we use the `simd` feature gate?
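
To make the comparison concrete, here is a minimal sketch of the two allocation styles being contrasted. It is not the arrow crate's actual `memory` code: `AlignedBuf`, `ALIGNMENT`, and `vec_buf` are hypothetical names, and the 64-byte cache-line size is an assumption. It only uses `std::alloc`, and it illustrates that an explicit `Layout` guarantees the alignment, whereas `Vec<u8>` only guarantees the alignment of `u8` (whatever extra alignment the allocator happens to give is incidental).

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};

/// Assumed cache-line size for this sketch; not taken from the arrow crate.
const ALIGNMENT: usize = 64;

/// Hypothetical stand-in for a hand-rolled, cache-line-aligned buffer.
pub struct AlignedBuf {
    ptr: *mut u8,
    layout: Layout,
}

impl AlignedBuf {
    /// Allocates `len` zeroed bytes aligned to `ALIGNMENT`.
    pub fn new(len: usize) -> Self {
        assert!(len > 0, "zero-sized allocations are not handled in this sketch");
        let layout = Layout::from_size_align(len, ALIGNMENT).expect("invalid layout");
        // SAFETY: `layout` has a non-zero size, as checked above.
        let ptr = unsafe { alloc_zeroed(layout) };
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        Self { ptr, layout }
    }

    /// Views the buffer as a byte slice.
    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: `ptr` points to `layout.size()` initialized (zeroed) bytes.
        unsafe { std::slice::from_raw_parts(self.ptr, self.layout.size()) }
    }

    /// Returns true if the pointer is a multiple of `ALIGNMENT`.
    pub fn is_aligned(&self) -> bool {
        self.ptr as usize % ALIGNMENT == 0
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // SAFETY: `ptr` was allocated with exactly this `layout`.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}

/// The `Vec<u8>` alternative: same contents, no explicit alignment guarantee.
pub fn vec_buf(len: usize) -> Vec<u8> {
    vec![0u8; len]
}
```

Both buffers hold the same bytes; the difference is that only `AlignedBuf` can promise that its start address falls on a 64-byte boundary, which is the property a SIMD kernel might rely on.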
   
   Here are the results, ordered from worst to best (negative values mean the 
`Vec` version is faster; results that are not significant are not shown):
   
   | benchmark | variation (%) |
   |-----------|---------------|
   | nlike_utf8 scalar ends with | 15.8 | 
   | sum nulls 512 | 13.5 | 
   | struct_array_from_vec 1024 | 8.0 | 
   | array_slice 512 | 6.9 | 
   | nlike_utf8 scalar contains | 5.7 | 
   | cast timestamp_ns to timestamp_s 512 | 5.1 | 
   | record_batches_to_csv | 3.9 | 
   | sort nulls 2^12 | 3.5 | 
   | sort nulls 2^10 | 3.0 | 
   | min string 512 | 3.0 | 
   | cast timestamp_ms to i64 512 | 2.3 | 
   | nlike_utf8 scalar equals | 2.1 | 
   | struct_array_from_vec 512 | 1.8 | 
   | take str 1024 | 1.4 | 
   | nlike_utf8 scalar complex | 1.3 | 
   | array_slice 128 | 0.9 | 
   | like_utf8 scalar complex | 0.5 | 
   | like_utf8 scalar contains | 0.5 | 
   | min 512 | -1.0 | 
   | sort 2^12 | -1.1 | 
   | sort 2^10 | -1.1 | 
   | like_utf8 scalar equals | -1.3 | 
   | like_utf8 scalar starts with | -1.4 | 
   | limit 512, 512 | -1.4 | 
   | cast time32s to time32ms 512 | -1.9 | 
   | subtract 512 | -2.2 | 
   | filter context f32 high selectivity | -2.2 | 
   | add 512 | -2.7 | 
   | struct_array_from_vec 256 | -2.7 | 
   | divide_nulls_512 | -2.9 | 
   | sum 512 | -3.0 | 
   | add_nulls_512 | -3.1 | 
   | take str nulls 512 | -3.5 | 
   | multiply 512 | -3.6 | 
   | filter context u8 very low selectivity | -3.9 | 
   | array_slice 2048 | -4.3 | 
   | cast date64 to date32 512 | -4.5 | 
   | take i32 nulls 1024 | -4.8 | 
   | min nulls string 512 | -5.1 | 
   | take i32 1024 | -5.3 | 
   | array_string_from_vec 256 | -5.5 | 
   | array_string_from_vec 128 | -5.7 | 
   | filter context u8 w NULLs very low selectivity | -6.4 | 
   | filter context u8 low selectivity | -6.6 | 
   | filter u8 high selectivity | -7.1 | 
   | filter context u8 w NULLs high selectivity | -7.2 | 
   | filter u8 very low selectivity | -7.4 | 
   | struct_array_from_vec 128 | -7.4 | 
   | cast int64 to int32 512 | -7.8 | 
   | cast date32 to date64 512 | -8.2 | 
   | take i32 nulls 512 | -8.3 | 
   | equal_string_nulls_512 | -8.4 | 
   | take i32 512 | -8.5 | 
   | buffer_bit_ops and | -9.4 | 
   | equal_512 | -9.5 | 
   | take str 512 | -9.6 | 
   | cast time64ns to time32s 512 | -9.7 | 
   | take bool 1024 | -10.2 | 
   | filter context u8 high selectivity | -11.2 | 
   | filter u8 low selectivity | -11.2 | 
   | equal_string_512 | -12.2 | 
   | array_from_vec 256 | -12.5 | 
   | take bool 512 | -12.8 | 
   | cast time32s to time64us 512 | -15.6 | 
   | buffer_bit_ops or | -17.0 | 
   | eq scalar Float32 | -17.6 | 
   | lt_eq scalar Float32 | -17.9 | 
   | lt scalar Float32 | -18.2 | 
   | array_from_vec 512 | -19.4 | 
   | gt_eq scalar Float32 | -19.5 | 
   | take bool nulls 1024 | -19.7 | 
   | lt_eq Float32 | -19.8 | 
   | eq Float32 | -19.9 | 
   | gt_eq Float32 | -20.2 | 
   | filter context u8 w NULLs low selectivity | -20.4 | 
   | neq scalar Float32 | -21.1 | 
   | gt scalar Float32 | -21.5 | 
   | and | -21.8 | 
   | or | -22.1 | 
   | not | -22.6 | 
   | take bool nulls 512 | -22.7 | 
   | cast int32 to int64 512 | -23.0 | 
   | min nulls 512 | -23.2 | 
   | array_from_vec 128 | -23.4 | 
   | cast float64 to uint64 512 | -24.3 | 
   | neq Float32 | -24.8 | 
   | lt Float32 | -24.9 | 
   | gt Float32 | -25.6 | 
   | cast float64 to float32 512 | -25.9 | 
   | cast int32 to float64 512 | -27.6 | 
   | equal_nulls_512 | -28.0 | 
   | cast int32 to uint32 512 | -30.4 | 
   | cast int32 to float32 512 | -33.3 | 
   | cast float32 to int32 512 | -35.0 |


