tyrelr commented on pull request #8973: URL: https://github.com/apache/arrow/pull/8973#issuecomment-749581718
Yep, I forgot that simd wasn't a default feature, so I didn't test that locally. I'll take a look at that tonight, hopefully. This will touch a few lines already modified by PR #8975, but the change is simple enough that it shouldn't cause concern.

@Dandandan I ran master vs. current head overnight (twice, to try to distinguish environmental performance from code performance). Using critcmp to filter out anything smaller than a 10% difference, it looks like this:

```
group                                  head-2c823df03-ARROW-10989           head-2c823df03-ARROW-10989-run2      master-ARROW-10989                   master-ARROW-10989-run2
-----                                  --------------------------           -------------------------------      ------------------                   -----------------------
add 512                                1.30  408.5±3.37ns    ? B/sec        1.00  315.0±1.97ns    ? B/sec        3.46  1088.8±5.62ns   ? B/sec        3.64  1146.1±4.42ns   ? B/sec
add_nulls_512                          1.09  423.8±3.18ns    ? B/sec        1.00  389.3±2.58ns    ? B/sec        2.98  1158.3±6.43ns   ? B/sec        2.95  1149.9±6.95ns   ? B/sec
array_from_vec 128                     1.08  444.3±3.26ns    ? B/sec        1.11  458.3±1.97ns    ? B/sec        1.00  411.4±2.09ns    ? B/sec        1.05  432.2±2.82ns    ? B/sec
bench_primitive                        1.00  1109.8±9.80µs   3.5 GB/sec     1.03  1146.2±6.90µs   3.4 GB/sec     3.01  3.3±0.01ms      1198.7 MB/sec  2.94  3.3±0.01ms      1225.9 MB/sec
cast float64 to float32 512            1.01  2.8±0.03µs      ? B/sec        1.00  2.8±0.02µs      ? B/sec        1.11  3.1±0.03µs      ? B/sec        1.00  2.8±0.02µs      ? B/sec
cast int32 to int32 512                1.00  26.9±0.38ns     ? B/sec        1.00  26.8±0.19ns     ? B/sec        0.99  26.8±0.15ns     ? B/sec        1.11  30.0±0.16ns     ? B/sec
cast time32s to time32ms 512           1.00  965.9±11.12ns   ? B/sec        1.05  1012.6±9.16ns   ? B/sec        1.75  1687.8±8.08ns   ? B/sec        1.68  1621.7±7.75ns   ? B/sec
cast time64ns to time32s 512           1.10  11.1±0.12µs     ? B/sec        1.00  10.1±0.16µs     ? B/sec        1.00  10.1±0.04µs     ? B/sec        1.00  10.1±0.11µs     ? B/sec
cast timestamp_ms to timestamp_ns 512  1.14  1481.4±11.07ns  ? B/sec        1.00  1304.2±10.03ns  ? B/sec        1.41  1840.6±9.37ns   ? B/sec        1.45  1894.2±10.44ns  ? B/sec
divide 512                             1.04  1830.0±12.34ns  ? B/sec        1.00  1752.3±9.17ns   ? B/sec        1.03  1797.1±9.00ns   ? B/sec        1.27  2.2±0.01µs      ? B/sec
eq scalar Float32                      1.00  64.2±0.52µs     ? B/sec        1.01  64.9±0.26µs     ? B/sec        1.07  68.9±0.29µs     ? B/sec        1.11  71.3±0.26µs     ? B/sec
filter context f32 low selectivity     1.10  129.4±2.24µs    ? B/sec        1.00  117.1±0.63µs    ? B/sec        1.01  118.7±0.56µs    ? B/sec        1.01  118.7±0.51µs    ? B/sec
min nulls 512                          1.13  2.1±0.02µs      ? B/sec        1.01  1872.1±12.28ns  ? B/sec        1.00  1858.1±25.37ns  ? B/sec        1.12  2.1±0.02µs      ? B/sec
multiply 512                           1.00  403.0±2.35ns    ? B/sec        1.21  487.6±4.20ns    ? B/sec        2.86  1152.5±9.71ns   ? B/sec        2.86  1151.0±6.00ns   ? B/sec
subtract 512                           1.00  385.3±4.18ns    ? B/sec        1.09  418.9±2.99ns    ? B/sec        3.24  1246.6±6.37ns   ? B/sec        3.00  1157.5±6.40ns   ? B/sec
take bool nulls 1024                   1.02  2.6±0.02µs      ? B/sec        1.00  2.6±0.02µs      ? B/sec        1.91  4.9±0.05µs      ? B/sec        1.91  4.9±0.05µs      ? B/sec
take bool nulls 512                    1.03  1449.2±18.90ns  ? B/sec        1.00  1412.8±11.94ns  ? B/sec        1.27  1788.1±14.12ns  ? B/sec        1.34  1899.4±32.11ns  ? B/sec
take i32 512                           1.00  924.2±5.01ns    ? B/sec        1.00  922.8±6.28ns    ? B/sec        1.11  1022.4±5.66ns   ? B/sec        1.00  925.5±9.49ns    ? B/sec
take i32 nulls 512                     1.09  1067.0±6.65ns   ? B/sec        1.00  975.3±5.77ns    ? B/sec        1.09  1065.0±5.21ns   ? B/sec        1.11  1080.0±6.80ns   ? B/sec
take str 1024                          1.00  4.6±0.02µs      ? B/sec        1.04  4.8±0.03µs      ? B/sec        1.14  5.3±0.04µs      ? B/sec        1.06  4.9±0.03µs      ? B/sec
take str 512                           1.00  2.8±0.01µs      ? B/sec        1.04  2.9±0.02µs      ? B/sec        1.14  3.2±0.03µs      ? B/sec        1.01  2.8±0.02µs      ? B/sec
take str null indices 512              1.00  2.8±0.01µs      ? B/sec        1.04  2.9±0.02µs      ? B/sec        1.16  3.2±0.03µs      ? B/sec        1.03  2.9±0.02µs      ? B/sec
```

A few benchmarks are significantly faster (finishing in half to a third of the time). I am surprised by the performance increase in take bool nulls 1024/512 and take str null indices 512... I wouldn't expect those to use primitive arrays at all. I'll look into why that changes at the same time I look into fixing the simd compilation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [email protected]
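
For readers unfamiliar with the table in the comment, here is a minimal sketch of how its ratio column and the 10% filter can be reproduced by hand. It assumes the ratio is each run's mean time divided by the fastest time in the same row, and that a row is kept when any run is at least 10% slower than the fastest one. The function names `ratios` and `exceeds_threshold` are hypothetical and are not taken from critcmp's source; its actual filtering logic may differ.

```rust
/// Ratio of each time to the fastest time in the row (the fastest run gets 1.00).
/// Assumes `times_ns` is non-empty and holds positive values.
fn ratios(times_ns: &[f64]) -> Vec<f64> {
    let fastest = times_ns.iter().cloned().fold(f64::INFINITY, f64::min);
    times_ns.iter().map(|t| t / fastest).collect()
}

/// True when at least one run is `threshold_pct` percent (or more) slower
/// than the fastest run in the row. This is an assumed filtering rule.
fn exceeds_threshold(times_ns: &[f64], threshold_pct: f64) -> bool {
    ratios(times_ns)
        .iter()
        .any(|r| (r - 1.0) * 100.0 >= threshold_pct)
}

fn main() {
    // Mean times in nanoseconds; the "add 512" values come from the table,
    // the "hypothetical stable bench" row is made up for illustration.
    let rows: Vec<(&str, Vec<f64>)> = vec![
        ("add 512", vec![408.5, 315.0, 1088.8, 1146.1]),
        ("hypothetical stable bench", vec![100.0, 105.0, 101.0, 103.0]),
    ];

    for (name, times) in &rows {
        if exceeds_threshold(times, 10.0) {
            let formatted: Vec<String> =
                ratios(times).iter().map(|r| format!("{:.2}", r)).collect();
            println!("{:<28} {}", name, formatted.join("  "));
        }
    }
}
```

With these assumptions, the "add 512" row passes the filter and the made-up stable row is dropped, since its slowest run is only 5% slower than its fastest.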
