On Mon, 3 Feb 2025 23:56:18 GMT, Jamil Nimeh <jni...@openjdk.org> wrote:

>> This enhancement makes a change to the ChaCha20 block function intrinsic on
>> aarch64, moving away from the block parallel implementation and to the
>> quarter-round parallel implementation that was done on x86_64. Assembly
>> language profiling yielded an 11% improvement in throughput. When put
>> together as an intrinsic and hooked into the JCE ChaCha20 cipher, the gains
>> are more modest, somewhere in the 2-4% range depending on job size, but
>> still an improvement.
>
> Jamil Nimeh has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Add explanatory comment and reference for quarter round intrinsic

Marked as reviewed by aph (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/23397#pullrequestreview-2592206831
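
For context on the "quarter round" named in the quoted description: it is the basic ChaCha20 mixing step defined in RFC 8439, section 2.1. Below is a minimal scalar sketch in Java. It is not the aarch64 intrinsic from this PR and not the JDK's internal code; the class and method names are hypothetical, chosen only for illustration.

```java
// Illustrative scalar ChaCha20 quarter round (RFC 8439, section 2.1).
// Hypothetical names; this is not the intrinsic discussed in the PR.
public class ChaCha20QuarterRound {

    // Mixes four 32-bit words of the state in place, at indices a, b, c, d.
    static void quarterRound(int[] s, int a, int b, int c, int d) {
        s[a] += s[b]; s[d] = Integer.rotateLeft(s[d] ^ s[a], 16);
        s[c] += s[d]; s[b] = Integer.rotateLeft(s[b] ^ s[c], 12);
        s[a] += s[b]; s[d] = Integer.rotateLeft(s[d] ^ s[a], 8);
        s[c] += s[d]; s[b] = Integer.rotateLeft(s[b] ^ s[c], 7);
    }

    public static void main(String[] args) {
        // Input words from the RFC 8439 section 2.1.1 test vector.
        int[] s = {0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567};
        quarterRound(s, 0, 1, 2, 3);
        for (int w : s) {
            System.out.printf("%08x%n", w);
        }
    }
}
```

A full ChaCha20 block applies this step to the four columns and then the four diagonals of the 4x4 state, repeated for 20 rounds. Broadly, a quarter-round parallel implementation vectorizes across those four independent quarter rounds within one block, whereas a block parallel one computes several blocks side by side.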