On Thu, 14 Nov 2024 00:44:35 GMT, Volodymyr Paprotski <vpaprot...@openjdk.org> wrote:
>> Measuring throughput with JMH parameters `-f 1 -i 2 -wi 3 -r 20 -w 30 -p
>> algorithm=AES/CBC/NoPadding -p dataSize=30000000 -p provider=SunJCE -p
>> keyLength=128 org.openjdk.bench.javax.crypto.full.AESBench`
>>
>> Before:
>>
>> Benchmark          (algorithm)        (dataSize)  (keyLength)  (provider)   Mode  Cnt    Score  Error  Units
>> AESBench.decrypt   AES/CBC/NoPadding    30000000          128      SunJCE  thrpt    2   25.383         ops/s
>> AESBench.decrypt2  AES/CBC/NoPadding    30000000          128      SunJCE  thrpt    2   32.230         ops/s
>> AESBench.encrypt   AES/CBC/NoPadding    30000000          128      SunJCE  thrpt    2   20.489         ops/s
>> AESBench.encrypt2  AES/CBC/NoPadding    30000000          128      SunJCE  thrpt    2   21.383         ops/s
>>
>>
>> After:
>>
>> Benchmark          (algorithm)        (dataSize)  (keyLength)  (provider)   Mode  Cnt    Score  Error  Units
>> AESBench.decrypt   AES/CBC/NoPadding    30000000          128      SunJCE  thrpt    2  215.144         ops/s
>> AESBench.decrypt2  AES/CBC/NoPadding    30000000          128      SunJCE  thrpt    2  411.265         ops/s
>> AESBench.encrypt   AES/CBC/NoPadding    30000000          128      SunJCE  thrpt    2   64.341         ops/s
>> AESBench.encrypt2  AES/CBC/NoPadding    30000000          128      SunJCE  thrpt    2   73.114         ops/s
>>
>>
>> I have not deterministically proven why chunking works: before the change,
>> the CBC intrinsic is not being used; and after chunking, it is. There is
>> quite a bit of GC activity in the default AESBench, so `encrypt2/decrypt2`
>> versions isolate just crypto (see comment below).
>
> Volodymyr Paprotski has updated the pull request incrementally with one
> additional commit since the last revision:
>
>   comments from Kevin

Thanks for the reviews!

Re @artur-oracle

> Please include the benchmarking tests in this PR.

I want to clarify why I have not included the test in the PR, even though I posted its diff as the first comment. There are three changes in that diff:

- encrypt2/decrypt2: these are probably fine to add to the benchmark permanently.
- Reduce the set size from 128 to 8: this is perhaps fine to include too, but the benchmark is regularly used for much smaller payloads (see the `Param` for payload in the test). I did not want to change existing results that I know people are tracking.
- Increase the heap size to 20G: I do not know your infrastructure, but that seems like a dangerous thing to do without consulting the build team.

Re @mcpowers

> Any measurable change in existing AES/CBC benchmarks with smaller payloads?

Since you asked, here is a wall of text :)

Before:

Benchmark          (algorithm)        (dataSize)  (keyLength)  (provider)   Mode  Cnt         Score          Error  Units
AESBench.decrypt   AES/CBC/NoPadding         256          128      SunJCE  thrpt    3  15466658.213  ±  894512.313  ops/s
AESBench.decrypt   AES/CBC/NoPadding        2048          128      SunJCE  thrpt    3   4311452.996  ±   18242.611  ops/s
AESBench.decrypt   AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    513129.485  ±    3396.273  ops/s
AESBench.decrypt   AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    510214.982  ±    4344.772  ops/s
AESBench.decrypt2  AES/CBC/NoPadding         256          128      SunJCE  thrpt    3  19270331.648  ±  479823.535  ops/s
AESBench.decrypt2  AES/CBC/NoPadding        2048          128      SunJCE  thrpt    3  12881063.065  ±    7450.889  ops/s
AESBench.decrypt2  AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3   1854688.581  ±    3139.717  ops/s
AESBench.decrypt2  AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3   1853681.576  ±    2428.282  ops/s
AESBench.encrypt   AES/CBC/NoPadding         256          128      SunJCE  thrpt    3   7172724.563  ±   61647.697  ops/s
AESBench.encrypt   AES/CBC/NoPadding        2048          128      SunJCE  thrpt    3    918720.063  ±     877.626  ops/s
AESBench.encrypt   AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    112995.798  ±      57.118  ops/s
AESBench.encrypt   AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    113001.811  ±     254.675  ops/s
AESBench.encrypt2  AES/CBC/NoPadding         256          128      SunJCE  thrpt    3   8249489.798  ±    9262.345  ops/s
AESBench.encrypt2  AES/CBC/NoPadding        2048          128      SunJCE  thrpt    3   1070631.891  ±      71.539  ops/s
AESBench.encrypt2  AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    134439.301  ±      69.769  ops/s
AESBench.encrypt2  AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    134441.136  ±       6.637  ops/s

After:

Benchmark          (algorithm)        (dataSize)  (keyLength)  (provider)   Mode  Cnt         Score          Error  Units
AESBench.decrypt   AES/CBC/NoPadding         256          128      SunJCE  thrpt    3  15565078.411  ± 1036985.429  ops/s
AESBench.decrypt   AES/CBC/NoPadding        2048          128      SunJCE  thrpt    3   4320714.508  ±   81474.132  ops/s
AESBench.decrypt   AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    511840.239  ±    1967.440  ops/s
AESBench.decrypt   AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    511477.714  ±    1697.375  ops/s
AESBench.decrypt2  AES/CBC/NoPadding         256          128      SunJCE  thrpt    3  21913765.368  ±  106973.286  ops/s
AESBench.decrypt2  AES/CBC/NoPadding        2048          128      SunJCE  thrpt    3  12918625.945  ±  142872.155  ops/s
AESBench.decrypt2  AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3   1855977.481  ±    3097.924  ops/s
AESBench.decrypt2  AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3   1855141.602  ±    2071.454  ops/s
AESBench.encrypt   AES/CBC/NoPadding         256          128      SunJCE  thrpt    3   7148105.241  ± 1121822.184  ops/s
AESBench.encrypt   AES/CBC/NoPadding        2048          128      SunJCE  thrpt    3    914586.023  ±   13531.625  ops/s
AESBench.encrypt   AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    112926.287  ±     232.451  ops/s
AESBench.encrypt   AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    113047.201  ±      84.197  ops/s
AESBench.encrypt2  AES/CBC/NoPadding         256          128      SunJCE  thrpt    3   8249585.271  ±    7846.941  ops/s
AESBench.encrypt2  AES/CBC/NoPadding        2048          128      SunJCE  thrpt    3   1070618.927  ±    3435.745  ops/s
AESBench.encrypt2  AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    134456.819  ±      87.493  ops/s
AESBench.encrypt2  AES/CBC/NoPadding       16384          128      SunJCE  thrpt    3    134455.033  ±      14.600  ops/s

(All within 1% except decrypt2 with dataSize=256; that might be noise, but it looks like it got better too.)

Re @ferakocz

> I think this is exactly the reason for the speedup. It takes quite a few
> calls before hotspot switches to the intrinsic.

My confusion was that, despite giving it a LOT more warmup, it still would not switch to the intrinsic! I spent some weeks digging. (Though to be fair, I am new to a lot of the codebase, so 'weeks' also includes learning.)

> Was this a problem in a real-world application, or just in the benchmark?

It was reported to me via a benchmark, but I am not sure if that was the 'cleaned up' report.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22086#issuecomment-2479676549
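
For readers who want to picture what "chunking" means for AES/CBC, here is a minimal caller-side sketch. It is not the code from the PR (the PR's change is inside the JDK and is not shown in this message); the class name `ChunkedCbcSketch`, the method names, and the chunk size are all hypothetical. It only illustrates that feeding a large buffer to `Cipher.update` in block-aligned pieces yields the same ciphertext as a single `doFinal` call, because the cipher carries the CBC chaining state across calls.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class ChunkedCbcSketch {
    // Hypothetical chunk size; kept a multiple of the 16-byte AES block.
    static final int CHUNK = 1 << 20;

    private static Cipher newEncryptor(byte[] key, byte[] iv) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/CBC/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return c;
    }

    // Baseline: encrypt the whole buffer with one doFinal call.
    static byte[] encryptOneShot(byte[] key, byte[] iv, byte[] data) {
        try {
            return newEncryptor(key, iv).doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same work, but fed to the cipher in pieces of at most `chunk` bytes.
    // With NoPadding, data.length must be a multiple of 16.
    static byte[] encryptChunked(byte[] key, byte[] iv, byte[] data, int chunk) {
        try {
            Cipher c = newEncryptor(key, iv);
            byte[] out = new byte[data.length];
            int outOff = 0;
            for (int off = 0; off < data.length; off += chunk) {
                int len = Math.min(chunk, data.length - off);
                outOff += c.update(data, off, len, out, outOff);
            }
            c.doFinal(out, outOff); // flushes nothing here, but completes the operation
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];
        byte[] iv = new byte[16];
        byte[] data = new byte[4 * CHUNK];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        byte[] a = encryptOneShot(key, iv, data);
        byte[] b = encryptChunked(key, iv, data, CHUNK);
        System.out.println(Arrays.equals(a, b));
    }
}
```

Whether a given chunk size lets HotSpot use the CBC intrinsic is exactly the open question from the quoted description, so this sketch says nothing about performance; it only shows that chunking does not change the result.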