mapleFU commented on PR #39807:
URL: https://github.com/apache/arrow/pull/39807#issuecomment-2002870632
Under LLVM-17, macOS (M1 Pro), Release (-O2):
After:
```
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:8192/PerReadBytes:8192
14066 ns 14042 ns 50325 bytes_per_second=85.509M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:8192
81058 ns 80930 ns 8516 bytes_per_second=114.127M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:65536
85914 ns 85865 ns 7871 bytes_per_second=107.568M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:8192
1383077 ns 1380249 ns 511 bytes_per_second=104.614M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:65536
1381771 ns 1379589 ns 504 bytes_per_second=104.664M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:1048576
1449293 ns 1445271 ns 484 bytes_per_second=99.9072M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:8192/PerReadBytes:8192
17520 ns 17086 ns 40610 bytes_per_second=70.2738M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:8192
90540 ns 86818 ns 8047 bytes_per_second=106.387M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:65536
92686 ns 90816 ns 7614 bytes_per_second=101.704M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:8192
1416457 ns 1387700 ns 510 bytes_per_second=104.052M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:65536
1403328 ns 1397628 ns 505 bytes_per_second=103.313M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:1048576
1469190 ns 1460000 ns 481 bytes_per_second=98.8993M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:8192/PerReadBytes:8192
13032 ns 12953 ns 54602 bytes_per_second=91.2257M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:8192
57312 ns 57157 ns 12273 bytes_per_second=162.463M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:65536
65626 ns 64273 ns 10869 bytes_per_second=144.477M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:8192
925974 ns 925072 ns 746 bytes_per_second=158.115M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:65536
961608 ns 959083 ns 750 bytes_per_second=152.508M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:1048576
1029553 ns 1028537 ns 680 bytes_per_second=142.21M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:8192/PerReadBytes:8192
20305 ns 17293 ns 46128 bytes_per_second=68.3272M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:8192
60087 ns 59985 ns 11039 bytes_per_second=154.805M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:65536
74347 ns 69853 ns 10346 bytes_per_second=132.936M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:8192
1114509 ns 992978 ns 721 bytes_per_second=147.302M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:65536
960557 ns 959252 ns 710 bytes_per_second=152.481M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:1048576
1042081 ns 1027100 ns 700 bytes_per_second=142.409M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:8192/PerReadBytes:8192
7225 ns 7220 ns 88111 bytes_per_second=300.086M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:8192
25710 ns 23944 ns 30632 bytes_per_second=760.71M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:65536
30453 ns 30158 ns 24247 bytes_per_second=603.954M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:8192
398575 ns 396622 ns 1771 bytes_per_second=715.976M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:65536
403529 ns 400651 ns 1783 bytes_per_second=708.777M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:1048576
467985 ns 463299 ns 1513 bytes_per_second=612.934M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:8192/PerReadBytes:8192
9285 ns 9243 ns 71174 bytes_per_second=234.419M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:8192
29829 ns 28100 ns 25100 bytes_per_second=648.201M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:65536
36288 ns 34810 ns 20527 bytes_per_second=523.245M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:8192
402043 ns 398348 ns 1715 bytes_per_second=712.874M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:65536
422659 ns 416023 ns 1623 bytes_per_second=682.587M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:1048576
498678 ns 489495 ns 1440 bytes_per_second=580.132M/s
```
Before:
```
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:8192/PerReadBytes:8192
14371 ns 14325 ns 46902 bytes_per_second=83.8181M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:8192
77623 ns 77507 ns 9161 bytes_per_second=119.168M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:65536
87373 ns 87304 ns 8358 bytes_per_second=105.795M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:8192
1383045 ns 1382063 ns 504 bytes_per_second=104.476M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:65536
1370321 ns 1369469 ns 512 bytes_per_second=105.437M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:1048576
1433147 ns 1432126 ns 493 bytes_per_second=100.824M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:8192/PerReadBytes:8192
16448 ns 16447 ns 41847 bytes_per_second=73.0014M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:8192
82950 ns 82436 ns 8123 bytes_per_second=112.042M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:65536/PerReadBytes:65536
88021 ns 87920 ns 7805 bytes_per_second=105.053M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:8192
1384970 ns 1383970 ns 506 bytes_per_second=104.332M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:65536
1386657 ns 1385639 ns 509 bytes_per_second=104.207M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::GZIP>/InputBytes:1048576/PerReadBytes:1048576
1444835 ns 1443245 ns 490 bytes_per_second=100.047M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:8192/PerReadBytes:8192
13062 ns 12891 ns 50916 bytes_per_second=91.6604M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:8192
56024 ns 55910 ns 11993 bytes_per_second=166.088M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:65536
65663 ns 64584 ns 11142 bytes_per_second=143.781M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:8192
963169 ns 941353 ns 734 bytes_per_second=155.381M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:65536
1048394 ns 972503 ns 733 bytes_per_second=150.403M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:1048576
1031212 ns 1028287 ns 687 bytes_per_second=142.244M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:8192/PerReadBytes:8192
16712 ns 16258 ns 43318 bytes_per_second=72.6775M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:8192
60878 ns 60527 ns 11269 bytes_per_second=153.417M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:65536/PerReadBytes:65536
68693 ns 67494 ns 10350 bytes_per_second=137.581M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:8192
950998 ns 946565 ns 722 bytes_per_second=154.525M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:65536
964300 ns 962337 ns 733 bytes_per_second=151.992M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::ZSTD>/InputBytes:1048576/PerReadBytes:1048576
1029719 ns 1028186 ns 665 bytes_per_second=142.258M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:8192/PerReadBytes:8192
7108 ns 7084 ns 92116 bytes_per_second=305.886M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:8192
22935 ns 22908 ns 27823 bytes_per_second=795.094M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:65536
30680 ns 30548 ns 24209 bytes_per_second=596.256M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:8192
389600 ns 389246 ns 1755 bytes_per_second=729.544M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:65536
395812 ns 395273 ns 1705 bytes_per_second=718.419M/s
CompressionInputZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:1048576
457402 ns 456781 ns 1518 bytes_per_second=621.68M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:8192/PerReadBytes:8192
9360 ns 9350 ns 76763 bytes_per_second=231.729M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:8192
27402 ns 27259 ns 25881 bytes_per_second=668.181M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:65536/PerReadBytes:65536
32157 ns 32138 ns 20886 bytes_per_second=566.753M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:8192
444164 ns 441893 ns 1583 bytes_per_second=642.625M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:65536
446398 ns 446078 ns 1550 bytes_per_second=636.597M/s
CompressionInputNonZeroCopyBenchmark<::arrow::Compression::LZ4_FRAME>/InputBytes:1048576/PerReadBytes:1048576
493388 ns 493005 ns 1300 bytes_per_second=576.001M/s
```
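For anyone who wants to diff the two runs instead of eyeballing them, here is a rough Python sketch that pulls `bytes_per_second` out of output shaped like the above and reports the relative change per benchmark. The helper names (`parse_throughput`, `percent_change`) are made up for this example and are not part of Arrow or Google Benchmark; the sample text is just the GZIP 8192/8192 zero-copy row copied from each run.

```python
import re

# Numeric part of a field such as "bytes_per_second=85.509M/s" (unit kept as printed).
_BPS = re.compile(r"bytes_per_second=([0-9.]+)M/s")


def parse_throughput(text):
    """Map each benchmark name to its bytes_per_second value.

    Assumes every benchmark name starts with "CompressionInput" and that the
    bytes_per_second field appears on the same line or on a following line.
    """
    results = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CompressionInput"):
            name = line.split()[0]
        match = _BPS.search(line)
        if match and name is not None:
            results[name] = float(match.group(1))
            name = None
    return results


def percent_change(before, after):
    """Relative throughput change in percent for names present in both runs."""
    return {
        name: (after[name] - before[name]) / before[name] * 100.0
        for name in before
        if name in after
    }


NAME = ("CompressionInputZeroCopyBenchmark<::arrow::Compression::GZIP>"
        "/InputBytes:8192/PerReadBytes:8192")
before_text = NAME + "\n14371 ns 14325 ns 46902 bytes_per_second=83.8181M/s\n"
after_text = NAME + "\n14066 ns 14042 ns 50325 bytes_per_second=85.509M/s\n"

changes = percent_change(parse_throughput(before_text), parse_throughput(after_text))
for bench, pct in changes.items():
    print(f"{bench}: {pct:+.2f}%")
```

Pasting the full "Before" and "After" blocks into `before_text` and `after_text` should give one line per benchmark. Single-run numbers like these are noisy, so small differences in either direction are probably not meaningful.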