Re: RFR: 8259498: Reduce overhead of MD5 and SHA digests [v2]
On Fri, 15 Jan 2021 23:36:35 GMT, Claes Redestad wrote:

>> - The MD5 intrinsics added by [JDK-8250902](https://bugs.openjdk.java.net/browse/JDK-8250902) shows that the `int[] x` isn't actually needed. This also applies to the SHA intrinsics from which the MD5 intrinsic takes inspiration
>> - Using VarHandles we can simplify the code in `ByteArrayAccess` enough to make it acceptable to use inline and replace the array in MD5 wholesale. This improves performance both in the presence and the absence of the intrinsic optimization.
>> - Doing the exact same thing in the SHA impls would be unwieldy (64+ element arrays), but allocating the array lazily gets most of the speed-up in the presence of an intrinsic while being neutral in its absence.
>>
>> Baseline:
>> Benchmark                    (digesterName)  (length)  Cnt     Score     Error  Units
>> MessageDigests.digest                   MD5        16   15  2714.307 ±  21.133  ops/ms
>> MessageDigests.digest                   MD5      1024   15   318.087 ±   0.637  ops/ms
>> MessageDigests.digest                 SHA-1        16   15  1387.266 ±  40.932  ops/ms
>> MessageDigests.digest                 SHA-1      1024   15   109.273 ±   0.149  ops/ms
>> MessageDigests.digest               SHA-256        16   15   995.566 ±  21.186  ops/ms
>> MessageDigests.digest               SHA-256      1024   15    89.104 ±   0.079  ops/ms
>> MessageDigests.digest               SHA-512        16   15   803.030 ±  15.722  ops/ms
>> MessageDigests.digest               SHA-512      1024   15   115.611 ±   0.234  ops/ms
>> MessageDigests.getAndDigest             MD5        16   15  2190.367 ±  97.037  ops/ms
>> MessageDigests.getAndDigest             MD5      1024   15   302.903 ±   1.809  ops/ms
>> MessageDigests.getAndDigest           SHA-1        16   15  1262.656 ±  43.751  ops/ms
>> MessageDigests.getAndDigest           SHA-1      1024   15   104.889 ±   3.554  ops/ms
>> MessageDigests.getAndDigest         SHA-256        16   15   914.541 ±  55.621  ops/ms
>> MessageDigests.getAndDigest         SHA-256      1024   15    85.708 ±   1.394  ops/ms
>> MessageDigests.getAndDigest         SHA-512        16   15   737.719 ±  53.671  ops/ms
>> MessageDigests.getAndDigest         SHA-512      1024   15   112.307 ±   1.950  ops/ms
>>
>> GC:
>> MessageDigests.getAndDigest:·gc.alloc.rate.norm      MD5   16   15   312.011 ± 0.005  B/op
>> MessageDigests.getAndDigest:·gc.alloc.rate.norm    SHA-1   16   15   584.020 ± 0.006  B/op
>> MessageDigests.getAndDigest:·gc.alloc.rate.norm  SHA-256   16   15   544.019 ± 0.016  B/op
>> MessageDigests.getAndDigest:·gc.alloc.rate.norm  SHA-512   16   15  1056.037 ± 0.003  B/op
>>
>> Target:
>> Benchmark                    (digesterName)  (length)  Cnt     Score     Error  Units
>> MessageDigests.digest                   MD5        16   15  3134.462 ±  43.685  ops/ms
>> MessageDigests.digest                   MD5      1024   15   323.667 ±   0.633  ops/ms
>> MessageDigests.digest                 SHA-1        16   15  1418.742 ±  38.223  ops/ms
>> MessageDigests.digest                 SHA-1      1024   15   110.178 ±   0.788  ops/ms
>> MessageDigests.digest               SHA-256        16   15  1037.949 ±  21.214  ops/ms
>> MessageDigests.digest               SHA-256      1024   15    89.671 ±   0.228  ops/ms
>> MessageDigests.digest               SHA-512        16   15   812.028 ±  39.489  ops/ms
>> MessageDigests.digest               SHA-512      1024   15   116.738 ±   0.249  ops/ms
>> MessageDigests.getAndDigest             MD5        16   15  2314.379 ± 229.294  ops/ms
>> MessageDigests.getAndDigest             MD5      1024   15   307.835 ±   5.730  ops/ms
>> MessageDigests.getAndDigest           SHA-1        16   15  1326.887 ±  63.263  ops/ms
>> MessageDigests.getAndDigest           SHA-1      1024   15   106.611 ±   2.292  ops/ms
>> MessageDigests.getAndDigest         SHA-256        16   15   961.589 ±  82.052  ops/ms
>> MessageDigests.getAndDigest         SHA-256      1024   15    88.646 ±   0.194  ops/ms
>> MessageDigests.getAndDigest         SHA-512        16   15   775.417 ±  56.775  ops/ms
>> MessageDigests.getAndDigest         SHA-512      1024   15   112.904 ±   2.014  ops/ms
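For readers unfamiliar with the VarHandle approach mentioned in the second bullet, here is a minimal sketch of reading little-endian ints straight out of a `byte[]`, which is the kind of access that makes an intermediate `int[]` unnecessary. The class and method names (`LittleEndianReads`, `intAt`, `intAtManual`) are hypothetical and are not taken from the JDK's `ByteArrayAccess`; only `MethodHandles.byteArrayViewVarHandle` is a real API.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical illustration only; this is not the JDK's ByteArrayAccess code.
final class LittleEndianReads {
    // Views a byte[] as little-endian ints. The index passed to get() is a BYTE offset.
    private static final VarHandle LE_INT =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    // Reads one little-endian int starting at byteOffset, with no temporary int[].
    static int intAt(byte[] buf, int byteOffset) {
        return (int) LE_INT.get(buf, byteOffset);
    }

    // The same read assembled by hand, shown for comparison.
    static int intAtManual(byte[] buf, int byteOffset) {
        return (buf[byteOffset] & 0xff)
                | (buf[byteOffset + 1] & 0xff) << 8
                | (buf[byteOffset + 2] & 0xff) << 16
                | (buf[byteOffset + 3] & 0xff) << 24;
    }

    public static void main(String[] args) {
        byte[] block = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};
        System.out.println(Integer.toHexString(intAt(block, 0)));       // 4030201
        System.out.println(Integer.toHexString(intAtManual(block, 4))); // 8070605
    }
}
```

A digest round function written this way can pull each message word from the input buffer at the point of use instead of decoding the whole block into a scratch array first.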
Re: RFR: 8259498: Reduce overhead of MD5 and SHA digests [v2]
> - The MD5 intrinsics added by [JDK-8250902](https://bugs.openjdk.java.net/browse/JDK-8250902) shows that the `int[] x` isn't actually needed. This also applies to the SHA intrinsics from which the MD5 intrinsic takes inspiration
> - Using VarHandles we can simplify the code in `ByteArrayAccess` enough to make it acceptable to use inline and replace the array in MD5 wholesale. This improves performance both in the presence and the absence of the intrinsic optimization.
> - Doing the exact same thing in the SHA impls would be unwieldy (64+ element arrays), but allocating the array lazily gets most of the speed-up in the presence of an intrinsic while being neutral in its absence.
>
> Baseline:
> Benchmark                    (digesterName)  (length)  Cnt     Score     Error  Units
> MessageDigests.digest                   MD5        16   15  2714.307 ±  21.133  ops/ms
> MessageDigests.digest                   MD5      1024   15   318.087 ±   0.637  ops/ms
> MessageDigests.digest                 SHA-1        16   15  1387.266 ±  40.932  ops/ms
> MessageDigests.digest                 SHA-1      1024   15   109.273 ±   0.149  ops/ms
> MessageDigests.digest               SHA-256        16   15   995.566 ±  21.186  ops/ms
> MessageDigests.digest               SHA-256      1024   15    89.104 ±   0.079  ops/ms
> MessageDigests.digest               SHA-512        16   15   803.030 ±  15.722  ops/ms
> MessageDigests.digest               SHA-512      1024   15   115.611 ±   0.234  ops/ms
> MessageDigests.getAndDigest             MD5        16   15  2190.367 ±  97.037  ops/ms
> MessageDigests.getAndDigest             MD5      1024   15   302.903 ±   1.809  ops/ms
> MessageDigests.getAndDigest           SHA-1        16   15  1262.656 ±  43.751  ops/ms
> MessageDigests.getAndDigest           SHA-1      1024   15   104.889 ±   3.554  ops/ms
> MessageDigests.getAndDigest         SHA-256        16   15   914.541 ±  55.621  ops/ms
> MessageDigests.getAndDigest         SHA-256      1024   15    85.708 ±   1.394  ops/ms
> MessageDigests.getAndDigest         SHA-512        16   15   737.719 ±  53.671  ops/ms
> MessageDigests.getAndDigest         SHA-512      1024   15   112.307 ±   1.950  ops/ms
>
> GC:
> MessageDigests.getAndDigest:·gc.alloc.rate.norm      MD5   16   15   312.011 ± 0.005  B/op
> MessageDigests.getAndDigest:·gc.alloc.rate.norm    SHA-1   16   15   584.020 ± 0.006  B/op
> MessageDigests.getAndDigest:·gc.alloc.rate.norm  SHA-256   16   15   544.019 ± 0.016  B/op
> MessageDigests.getAndDigest:·gc.alloc.rate.norm  SHA-512   16   15  1056.037 ± 0.003  B/op
>
> Target:
> Benchmark                    (digesterName)  (length)  Cnt     Score     Error  Units
> MessageDigests.digest                   MD5        16   15  3134.462 ±  43.685  ops/ms
> MessageDigests.digest                   MD5      1024   15   323.667 ±   0.633  ops/ms
> MessageDigests.digest                 SHA-1        16   15  1418.742 ±  38.223  ops/ms
> MessageDigests.digest                 SHA-1      1024   15   110.178 ±   0.788  ops/ms
> MessageDigests.digest               SHA-256        16   15  1037.949 ±  21.214  ops/ms
> MessageDigests.digest               SHA-256      1024   15    89.671 ±   0.228  ops/ms
> MessageDigests.digest               SHA-512        16   15   812.028 ±  39.489  ops/ms
> MessageDigests.digest               SHA-512      1024   15   116.738 ±   0.249  ops/ms
> MessageDigests.getAndDigest             MD5        16   15  2314.379 ± 229.294  ops/ms
> MessageDigests.getAndDigest             MD5      1024   15   307.835 ±   5.730  ops/ms
> MessageDigests.getAndDigest           SHA-1        16   15  1326.887 ±  63.263  ops/ms
> MessageDigests.getAndDigest           SHA-1      1024   15   106.611 ±   2.292  ops/ms
> MessageDigests.getAndDigest         SHA-256        16   15   961.589 ±  82.052  ops/ms
> MessageDigests.getAndDigest         SHA-256      1024   15    88.646 ±   0.194  ops/ms
> MessageDigests.getAndDigest         SHA-512        16   15   775.417 ±  56.775  ops/ms
> MessageDigests.getAndDigest         SHA-512      1024   15   112.904 ±   2.014  ops/ms
>
> GC:
> MessageDigests.getAndDigest:·gc.alloc.rate.norm      MD5   16   15   232.009 ± 0.006  B/op
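As a sanity check on the quoted numbers, the sketch below computes the relative throughput change for the 16-byte MD5 `digest` case and compares the MD5 allocation delta against the estimated size of a 16-element `int[]`. The helper names are hypothetical, and the array-size estimate is an assumption: it presumes a 64-bit HotSpot JVM with a 16-byte array header and 8-byte object alignment. Attributing the saved bytes to the removed `int[] x` is an inference from the description, not something the thread states outright.

```java
// Hypothetical helper for reading the benchmark tables; not part of the patch.
final class DigestDeltas {
    // Relative change from baseline to target, in percent.
    static double percentChange(double baseline, double target) {
        return (target - baseline) / baseline * 100.0;
    }

    // Estimated heap size of an int[length].
    // ASSUMPTION: 16-byte array header, 4 bytes per element, 8-byte alignment.
    static long intArrayBytes(int length) {
        long raw = 16L + 4L * length;
        return (raw + 7L) & ~7L;
    }

    public static void main(String[] args) {
        // MessageDigests.digest, MD5, length 16: 2714.307 -> 3134.462 ops/ms
        System.out.printf("MD5/16 digest throughput: %+.1f%%%n",
                percentChange(2714.307, 3134.462));
        // MessageDigests.getAndDigest gc.alloc.rate.norm, MD5: 312.011 -> 232.009 B/op
        System.out.printf("MD5 allocation saved: %.0f B/op%n", 312.011 - 232.009);
        System.out.printf("Estimated int[16] size: %d bytes%n", intArrayBytes(16));
    }
}
```

Under those assumptions the roughly 80 B/op reduction lines up with one 16-element `int[]` per `getAndDigest` call, which is the size of an MD5 message block expressed as ints.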
Re: RFR: 8259498: Reduce overhead of MD5 and SHA digests [v2]
On Fri, 15 Jan 2021 23:21:00 GMT, Valerie Peng wrote:

>> Claes Redestad has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 20 additional commits since the last revision:
>>
>>  - Copyrights
>>  - Merge branch 'master' into improve_md5
>>  - Remove unused Unsafe import
>>  - Harmonize MD4 impl, remove now-redundant checks from ByteArrayAccess (VHs do bounds checks, most of which will be optimized away)
>>  - Merge branch 'master' into improve_md5
>>  - Apply allocation avoiding optimizations to all SHA versions sharing structural similarities with MD5
>>  - Remove unused reverseBytes imports
>>  - Copyrights
>>  - Fix copy-paste error
>>  - Various fixes (IDE stopped IDEing..)
>>  - ... and 10 more: https://git.openjdk.java.net/jdk/compare/6e03c8d3...cafa3e49
>
> test/micro/org/openjdk/bench/java/util/UUIDBench.java line 2:
>
>> 1: /*
>> 2:  * Copyright (c) 2020, 2021, Oracle and/or its affiliates. All rights reserved.
>
> nit: other files should also have this 2021 update. It seems most of them are not updated and still uses 2020.

fixed

PR: https://git.openjdk.java.net/jdk/pull/1855
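The commit note above says explicit checks in `ByteArrayAccess` became redundant because VarHandles do their own bounds checks. A small sketch of that behavior follows, assuming a byte-array view VarHandle like the one the patch description refers to; the class and method names here are hypothetical. An out-of-range byte offset raises `IndexOutOfBoundsException` without any hand-written check.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical demonstration; not code from the pull request.
final class VarHandleBounds {
    private static final VarHandle LE_INT =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    // Returns true if a 4-byte read at byteOffset succeeds, false if the
    // VarHandle itself rejects the offset as out of bounds.
    static boolean readable(byte[] buf, int byteOffset) {
        try {
            int ignored = (int) LE_INT.get(buf, byteOffset);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        System.out.println(readable(buf, 4)); // last valid offset for a 4-byte read
        System.out.println(readable(buf, 5)); // would run past the end of the array
    }
}
```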