Yordan Pavlov created ARROW-8831:
------------------------------------
Summary: [Rust] incomplete SIMD implementation in simd_compare_op
Key: ARROW-8831
URL: https://issues.apache.org/jira/browse/ARROW-8831
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Affects Versions: 0.17.0
Reporter: Yordan Pavlov
Currently the simd_compare_op function defined here
[https://github.com/apache/arrow/blob/master/rust/arrow/src/compute/kernels/comparison.rs#L204]
is only about 10% faster than the non-SIMD implementation and takes
approximately the same time for types of different sizes, which indicates
that the SIMD implementation is incomplete. Below are benchmark results for
the Int8 and Float32 types:
eq Int8             time: [947.53 us 947.81 us 948.05 us]
eq Int8 simd        time: [855.02 us 858.26 us 862.48 us]
neq Int8            time: [904.09 us 907.34 us 911.44 us]
neq Int8 simd       time: [848.49 us 849.28 us 850.28 us]
lt Int8             time: [900.87 us 902.65 us 904.86 us]
lt Int8 simd        time: [850.32 us 850.96 us 851.90 us]
lt_eq Int8          time: [974.68 us 983.03 us 991.98 us]
lt_eq Int8 simd     time: [851.83 us 852.22 us 852.74 us]
gt Int8             time: [908.48 us 911.76 us 914.72 us]
gt Int8 simd        time: [851.93 us 852.43 us 853.04 us]
gt_eq Int8          time: [981.53 us 983.37 us 986.31 us]
gt_eq Int8 simd     time: [855.59 us 856.83 us 858.61 us]
eq Float32          time: [911.46 us 911.70 us 912.01 us]
eq Float32 simd     time: [884.74 us 885.97 us 887.74 us]
neq Float32         time: [904.26 us 904.73 us 905.27 us]
neq Float32 simd    time: [884.40 us 892.32 us 901.98 us]
lt Float32          time: [907.90 us 908.54 us 909.34 us]
lt Float32 simd     time: [883.23 us 886.05 us 889.31 us]
lt_eq Float32       time: [911.44 us 911.62 us 911.82 us]
lt_eq Float32 simd  time: [882.78 us 886.78 us 891.05 us]
gt Float32          time: [906.88 us 907.96 us 909.32 us]
gt Float32 simd     time: [879.78 us 883.03 us 886.63 us]
gt_eq Float32       time: [924.72 us 926.03 us 928.29 us]
gt_eq Float32 simd  time: [884.80 us 885.93 us 887.35 us]
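
The size of the speedup can be checked from the middle value of each
bracketed triple (assumed here to be the point estimate). A minimal sketch;
percent_faster is a hypothetical helper introduced only for this calculation
and is not part of Arrow:

```rust
/// Relative reduction in run time, as a percentage of the baseline time.
/// Both arguments are in the same unit (microseconds in the results above).
pub fn percent_faster(baseline_us: f64, simd_us: f64) -> f64 {
    (baseline_us - simd_us) / baseline_us * 100.0
}
```

For example, percent_faster(947.81, 858.26) for "eq Int8" gives roughly 9.4,
while percent_faster(911.70, 885.97) for "eq Float32" gives roughly 2.8.
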
In the benchmark results above, notice that both the SIMD and non-SIMD
operations take similar amounts of time for types of different sizes (Int8 and
Float32). This is expected for a non-SIMD implementation, which compares one
value at a time, but not for a SIMD implementation: a register of fixed width
holds more values of a smaller type, so a single SIMD instruction should
compare more Int8 values than Float32 values.
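
To illustrate why the element size should matter, here is a minimal sketch
in plain Rust. It is not the code in comparison.rs or in the pull request;
lanes_per_register and eq_chunked are hypothetical names, and the 128-bit
register width used in the example is an assumption. The inner loop is
ordinary scalar code that only mimics processing LANES values per step.

```rust
use std::mem::size_of;

/// Number of values of type `T` that fit in one SIMD register of
/// `register_bits` bits (hypothetical helper, for illustration only).
pub fn lanes_per_register<T>(register_bits: usize) -> usize {
    register_bits / (size_of::<T>() * 8)
}

/// Compares `left` and `right` element-wise for equality, handling `LANES`
/// values per step, and packs the results into a bitmap: bit `i`
/// (least-significant bit first) is set when `left[i] == right[i]`.
pub fn eq_chunked<T: PartialEq + Copy, const LANES: usize>(
    left: &[T],
    right: &[T],
) -> Vec<u8> {
    assert_eq!(left.len(), right.len());
    let mut bitmap = vec![0u8; (left.len() + 7) / 8];
    let mut base = 0; // index of the first element of the current chunk
    for (l, r) in left.chunks(LANES).zip(right.chunks(LANES)) {
        // One iteration of this outer loop stands in for one SIMD compare.
        for (j, (a, b)) in l.iter().zip(r.iter()).enumerate() {
            if a == b {
                let i = base + j;
                bitmap[i / 8] |= 1 << (i % 8);
            }
        }
        base += l.len();
    }
    bitmap
}
```

With an assumed 128-bit register, lanes_per_register::<i8>(128) is 16 and
lanes_per_register::<f32>(128) is 4, so eq_chunked::<i8, 16> needs a quarter
of the outer-loop steps that eq_chunked::<f32, 4> needs for the same number
of values.
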
This pull request attempts to fix that:
[https://github.com/apache/arrow/pull/7204]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)