mapleFU commented on PR #40335:
URL: https://github.com/apache/arrow/pull/40335#issuecomment-1984118180
After (on my AMD 3800X):
```
BM_ByteStreamSplitDecode_Float_Sse2/1024       268 ns    268 ns  2597941 bytes_per_second=14.2391Gi/s
BM_ByteStreamSplitDecode_Float_Sse2/4096      1056 ns   1056 ns   659104 bytes_per_second=14.4464Gi/s
BM_ByteStreamSplitDecode_Float_Sse2/32768     8464 ns   8464 ns    82631 bytes_per_second=14.4228Gi/s
BM_ByteStreamSplitDecode_Float_Sse2/65536    17016 ns  17016 ns    41237 bytes_per_second=14.3476Gi/s
BM_ByteStreamSplitDecode_Double_Sse2/1024      863 ns    863 ns   811078 bytes_per_second=8.84518Gi/s
BM_ByteStreamSplitDecode_Double_Sse2/4096     3546 ns   3546 ns   196919 bytes_per_second=8.60728Gi/s
BM_ByteStreamSplitDecode_Double_Sse2/32768   28309 ns  28309 ns    24734 bytes_per_second=8.62408Gi/s
BM_ByteStreamSplitDecode_Double_Sse2/65536   56551 ns  56551 ns    12353 bytes_per_second=8.63435Gi/s
BM_ByteStreamSplitEncode_Float_Sse2/1024       349 ns    349 ns  2002774 bytes_per_second=10.9233Gi/s
BM_ByteStreamSplitEncode_Float_Sse2/4096      1381 ns   1381 ns   506294 bytes_per_second=11.053Gi/s
BM_ByteStreamSplitEncode_Float_Sse2/32768    11064 ns  11064 ns    63779 bytes_per_second=11.0334Gi/s
BM_ByteStreamSplitEncode_Float_Sse2/65536    26332 ns  26332 ns    26807 bytes_per_second=9.27175Gi/s
BM_ByteStreamSplitEncode_Double_Sse2/1024      963 ns    963 ns   728497 bytes_per_second=7.92249Gi/s
BM_ByteStreamSplitEncode_Double_Sse2/4096     4125 ns   4125 ns   170152 bytes_per_second=7.39747Gi/s
BM_ByteStreamSplitEncode_Double_Sse2/32768   34597 ns  34597 ns    20206 bytes_per_second=7.05663Gi/s
BM_ByteStreamSplitEncode_Double_Sse2/65536   69679 ns  69680 ns    10420 bytes_per_second=7.00753Gi/s
BM_ByteStreamSplitDecode_Float_Avx2/1024       230 ns    230 ns  3037165 bytes_per_second=16.5785Gi/s
BM_ByteStreamSplitDecode_Float_Avx2/4096       909 ns    909 ns   765138 bytes_per_second=16.7792Gi/s
BM_ByteStreamSplitDecode_Float_Avx2/32768     7275 ns   7275 ns    96407 bytes_per_second=16.7795Gi/s
BM_ByteStreamSplitDecode_Float_Avx2/65536    14672 ns  14672 ns    47858 bytes_per_second=16.6396Gi/s
BM_ByteStreamSplitDecode_Double_Avx2/1024      643 ns    643 ns  1086091 bytes_per_second=11.8583Gi/s
BM_ByteStreamSplitDecode_Double_Avx2/4096     2715 ns   2715 ns   257699 bytes_per_second=11.242Gi/s
BM_ByteStreamSplitDecode_Double_Avx2/32768   21646 ns  21646 ns    32293 bytes_per_second=11.2788Gi/s
BM_ByteStreamSplitDecode_Double_Avx2/65536   43594 ns  43594 ns    16003 bytes_per_second=11.2006Gi/s
BM_ByteStreamSplitEncode_Float_Avx2/1024       740 ns    740 ns   940892 bytes_per_second=5.15611Gi/s
BM_ByteStreamSplitEncode_Float_Avx2/4096      2891 ns   2891 ns   242620 bytes_per_second=5.27845Gi/s
BM_ByteStreamSplitEncode_Float_Avx2/32768    23174 ns  23174 ns    30344 bytes_per_second=5.26759Gi/s
BM_ByteStreamSplitEncode_Float_Avx2/65536    47080 ns  47080 ns    15025 bytes_per_second=5.1856Gi/s
BM_ByteStreamSplitEncode_Double_Avx2/1024      962 ns    962 ns   714181 bytes_per_second=7.92957Gi/s
BM_ByteStreamSplitEncode_Double_Avx2/4096     4206 ns   4206 ns   166235 bytes_per_second=7.25527Gi/s
BM_ByteStreamSplitEncode_Double_Avx2/32768   34696 ns  34696 ns    20041 bytes_per_second=7.03653Gi/s
BM_ByteStreamSplitEncode_Double_Avx2/65536   82677 ns  82677 ns     8268 bytes_per_second=5.90586Gi/s
```
Before:
```
BM_ByteStreamSplitDecode_Float_Sse2/1024       527 ns    527 ns  1918166 bytes_per_second=7.2438Gi/s
BM_ByteStreamSplitDecode_Float_Sse2/4096      1789 ns   1789 ns   532823 bytes_per_second=8.52931Gi/s
BM_ByteStreamSplitDecode_Float_Sse2/32768    11182 ns  11182 ns    77306 bytes_per_second=10.9164Gi/s
BM_ByteStreamSplitDecode_Float_Sse2/65536    30606 ns  30605 ns    20814 bytes_per_second=7.97704Gi/s
BM_ByteStreamSplitDecode_Double_Sse2/1024     1282 ns   1282 ns   730335 bytes_per_second=5.95065Gi/s
BM_ByteStreamSplitDecode_Double_Sse2/4096     5093 ns   5093 ns   137810 bytes_per_second=5.99156Gi/s
BM_ByteStreamSplitDecode_Double_Sse2/32768   42888 ns  42888 ns    13550 bytes_per_second=5.6925Gi/s
BM_ByteStreamSplitDecode_Double_Sse2/65536   93657 ns  93649 ns     8164 bytes_per_second=5.21396Gi/s
BM_ByteStreamSplitEncode_Float_Sse2/1024       655 ns    655 ns  1123042 bytes_per_second=5.82213Gi/s
BM_ByteStreamSplitEncode_Float_Sse2/4096      2577 ns   2577 ns   250103 bytes_per_second=5.92139Gi/s
BM_ByteStreamSplitEncode_Float_Sse2/32768    18899 ns  18899 ns    36646 bytes_per_second=6.45902Gi/s
BM_ByteStreamSplitEncode_Float_Sse2/65536    40659 ns  40659 ns    20018 bytes_per_second=6.00463Gi/s
BM_ByteStreamSplitEncode_Double_Sse2/1024     1081 ns   1078 ns   521342 bytes_per_second=7.07835Gi/s
BM_ByteStreamSplitEncode_Double_Sse2/4096     4089 ns   4084 ns   168537 bytes_per_second=7.47223Gi/s
BM_ByteStreamSplitEncode_Double_Sse2/32768   32269 ns  32237 ns    21543 bytes_per_second=7.57334Gi/s
BM_ByteStreamSplitEncode_Double_Sse2/65536   65524 ns  65427 ns    10961 bytes_per_second=7.46294Gi/s
```
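For a quick side-by-side reading, the Before/After ratio of CPU time can be computed for the SSE2 rows, which are the only ones present in both runs (the Before run lists no AVX2 rows). This is only a rough sketch: the dictionary layout and the `speedups` helper are my own naming for illustration and are not part of the PR; the numbers are the CPU-time columns copied from the two listings.

```python
# CPU times (ns) copied from the Before/After SSE2 rows,
# ordered by input size: 1024, 4096, 32768, 65536.
SIZES = [1024, 4096, 32768, 65536]

BEFORE_NS = {
    "Decode_Float": [527, 1789, 11182, 30605],
    "Decode_Double": [1282, 5093, 42888, 93649],
    "Encode_Float": [655, 2577, 18899, 40659],
    "Encode_Double": [1078, 4084, 32237, 65427],
}
AFTER_NS = {
    "Decode_Float": [268, 1056, 8464, 17016],
    "Decode_Double": [863, 3546, 28309, 56551],
    "Encode_Float": [349, 1381, 11064, 26332],
    "Encode_Double": [963, 4125, 34597, 69680],
}


def speedups(before, after):
    """Return before/after CPU-time ratios; values above 1.0 mean After is faster."""
    return {
        name: [b / a for b, a in zip(before[name], after[name])]
        for name in before
    }


if __name__ == "__main__":
    # Print one ratio per benchmark and input size.
    for name, ratios in speedups(BEFORE_NS, AFTER_NS).items():
        for size, ratio in zip(SIZES, ratios):
            print(f"{name}_Sse2/{size}: {ratio:.2f}x")
```

A ratio below 1.0 means the After run was slower for that row; single-run timings like these are noisy, so small differences should be read with care.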