mapleFU commented on PR #39807:
URL: https://github.com/apache/arrow/pull/39807#issuecomment-2004406456
On my Windows WSL machine (Ubuntu 22, AMD 3800X) with gcc 11.4, Release build (-O2):
After:
```
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:8192/PerReadBytes:8192
15113 ns 15153 ns 45795 bytes_per_second=79.4275Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:8192
121673 ns 121761 ns 5686 bytes_per_second=75.8248Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:65536
122201 ns 122270 ns 5718 bytes_per_second=75.5095Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:8192
1918363 ns 1918733 ns 357 bytes_per_second=75.2846Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:65536
1925816 ns 1926215 ns 362 bytes_per_second=74.9922Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:1048576
1949610 ns 1950057 ns 363 bytes_per_second=74.0753Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:8192/PerReadBytes:8192
15292 ns 15331 ns 44844 bytes_per_second=78.5026Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:8192
122159 ns 122242 ns 5717 bytes_per_second=75.5267Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:65536
122144 ns 122221 ns 5687 bytes_per_second=75.5394Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:8192
1916707 ns 1917085 ns 361 bytes_per_second=75.3494Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:65536
1936225 ns 1936580 ns 356 bytes_per_second=74.5909Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:1048576
1974165 ns 1974662 ns 354 bytes_per_second=73.1523Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:8192/PerReadBytes:8192
7642 ns 7655 ns 91056 bytes_per_second=154.365Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:8192
39511 ns 39541 ns 17486 bytes_per_second=233.445Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:65536
40372 ns 40405 ns 17402 bytes_per_second=228.45Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:8192
742301 ns 742596 ns 942 bytes_per_second=196.959Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:65536
741678 ns 741958 ns 946 bytes_per_second=197.129Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:1048576
749815 ns 750119 ns 938 bytes_per_second=194.984Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:8192/PerReadBytes:8192
8083 ns 8097 ns 86602 bytes_per_second=145.937Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:8192
40381 ns 40418 ns 17284 bytes_per_second=228.379Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:65536
40977 ns 41010 ns 16777 bytes_per_second=225.084Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:8192
743131 ns 743474 ns 943 bytes_per_second=196.727Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:65536
753482 ns 753796 ns 920 bytes_per_second=194.033Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:1048576
763397 ns 763723 ns 914 bytes_per_second=191.511Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:8192/PerReadBytes:8192
2627 ns 2658 ns 259151 bytes_per_second=801.929Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:8192
14589 ns 14627 ns 47447 bytes_per_second=1.21655Gi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:65536
15057 ns 15091 ns 47015 bytes_per_second=1.17918Gi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:8192
292254 ns 292447 ns 2399 bytes_per_second=973.26Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:65536
296420 ns 296544 ns 2361 bytes_per_second=959.815Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:1048576
299178 ns 299350 ns 2342 bytes_per_second=950.818Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:8192/PerReadBytes:8192
3133 ns 3165 ns 222458 bytes_per_second=673.399Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:8192
15510 ns 15550 ns 44675 bytes_per_second=1.14436Gi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:65536
15672 ns 15700 ns 44787 bytes_per_second=1.1334Gi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:8192
296992 ns 297173 ns 2367 bytes_per_second=957.784Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:65536
302407 ns 302596 ns 2294 bytes_per_second=940.617Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:1048576
304691 ns 304865 ns 2288 bytes_per_second=933.618Mi/s
```
Before:
```
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:8192/PerReadBytes:8192
15091 ns 15129 ns 44783 bytes_per_second=79.5491Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:8192
124276 ns 124365 ns 5609 bytes_per_second=74.2374Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:65536
125119 ns 125202 ns 5581 bytes_per_second=73.7413Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:8192
1967307 ns 1967803 ns 357 bytes_per_second=73.4073Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:65536
1966845 ns 1967298 ns 358 bytes_per_second=73.4262Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:1048576
1969676 ns 1970107 ns 355 bytes_per_second=73.3215Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:8192/PerReadBytes:8192
15471 ns 15510 ns 44616 bytes_per_second=77.5999Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:8192
125884 ns 125971 ns 5556 bytes_per_second=73.2907Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:65536
123910 ns 123991 ns 5612 bytes_per_second=74.4612Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:8192
2012901 ns 2013295 ns 354 bytes_per_second=71.7486Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:65536
2019119 ns 2019770 ns 349 bytes_per_second=71.5186Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:1048576
1979235 ns 1979722 ns 346 bytes_per_second=72.9654Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:8192/PerReadBytes:8192
7864 ns 7883 ns 88901 bytes_per_second=149.899Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:8192
40161 ns 40195 ns 17269 bytes_per_second=229.646Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:65536
40392 ns 40423 ns 17157 bytes_per_second=228.351Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:8192
762881 ns 763268 ns 942 bytes_per_second=191.625Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:65536
749413 ns 749734 ns 929 bytes_per_second=195.084Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:1048576
763498 ns 763866 ns 918 bytes_per_second=191.475Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:8192/PerReadBytes:8192
8482 ns 8500 ns 81298 bytes_per_second=139.019Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:8192
40456 ns 40489 ns 16916 bytes_per_second=227.979Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:65536
41042 ns 41078 ns 16997 bytes_per_second=224.709Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:8192
741352 ns 741649 ns 930 bytes_per_second=197.211Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:65536
749918 ns 750239 ns 936 bytes_per_second=194.953Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:1048576
758674 ns 758991 ns 930 bytes_per_second=192.705Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:8192/PerReadBytes:8192
2645 ns 2675 ns 261762 bytes_per_second=796.956Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:8192
14682 ns 14715 ns 46999 bytes_per_second=1.20928Gi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:65536
15221 ns 15248 ns 45742 bytes_per_second=1.16704Gi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:8192
301442 ns 301624 ns 2331 bytes_per_second=943.649Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:65536
307876 ns 308057 ns 2276 bytes_per_second=923.943Mi/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:1048576
304182 ns 304369 ns 2301 bytes_per_second=935.136Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:8192/PerReadBytes:8192
3257 ns 3288 ns 211478 bytes_per_second=648.168Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:8192
16088 ns 16128 ns 43704 bytes_per_second=1.10335Gi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:65536
16290 ns 16319 ns 42819 bytes_per_second=1.09042Gi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:8192
305555 ns 305746 ns 2311 bytes_per_second=930.925Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:65536
308251 ns 308442 ns 2274 bytes_per_second=922.79Mi/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:1048576
310809 ns 311000 ns 2257 bytes_per_second=915.199Mi/s
```
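To read the two runs side by side, the relative change in wall time can be computed per benchmark. The snippet below is only a rough sketch: the helper name `percent_faster` and the choice of three sample rows are mine, and the (before, after) values are the first column (wall time, ns) copied from the output above. These are single runs, so small differences may be within noise.

```python
def percent_faster(before_ns: float, after_ns: float) -> float:
    """Percent reduction in wall time going from before_ns to after_ns.

    Positive means the "After" run was faster; negative means slower.
    """
    return (before_ns - after_ns) / before_ns * 100.0


# (before_ns, after_ns) wall times copied from the two runs above.
# Keys are shorthand for <codec> <benchmark> InputBytes/PerReadBytes.
samples = {
    "GZIP NonZeroCopy 1048576/8192": (2012901, 1916707),
    "ZSTD ZeroCopy 1048576/8192": (762881, 742301),
    "LZ4_FRAME NonZeroCopy 8192/8192": (3257, 3133),
}

for name, (before_ns, after_ns) in samples.items():
    print(f"{name}: {percent_faster(before_ns, after_ns):+.2f}%")
```

The same helper can be applied to any other pair of rows; repeating the benchmark several times would give a better sense of how much of each difference is noise.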