On Wed, 11 Oct 2023 20:58:23 GMT, Srinivas Vamsi Parasa <spar...@openjdk.org> wrote:
>> The goal of this PR is to address the follow-up comments to the SIMD
>> accelerated sort PR (#14227) which implemented AVX512 intrinsics for
>> Arrays.sort() methods.
>> The proposed changes are:
>>
>> 1) Restriction of the AVX512 sort acceleration to only Intel CPUs. A
>> performance regression (due to micro-architectural differences) was reported
>> for AMD Zen4 CPUs in the comments section of PR.
>> 2) Addressing the build failure due to a bug in GCC 12 (which was fixed in
>> version 12.3.1). The details of the bug are at:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105593
>> 3) Minor changes in Javadoc strings
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one
> additional commit since the last revision:
>
>   Revert @ForceInline annotations for small array sort methods

The slow performance of the AVX512 version of x86-simd-sort on Zen 4 is most probably explained by the AMD manuals, which can be found at:
https://www.amd.com/en/search/documentation/hub.html#q=software%20optimization%20guide%20for%20the%20amd%20microarchitecture&f-amd_document_type=Software%20Optimization%20Guides

[Software Optimization Guide for the AMD Zen4 Microarchitecture](https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/software-optimization-guides/57647.zip) has the following remark in the "2.11.2 Code recommendations" chapter:

> Avoid the memory destination form of COMPRESS instructions. These forms are
> implemented using microcode and achieve a lower store bandwidth than their
> register destination forms which use fastpath macro ops.

[Software Optimization Guide for the AMD Zen5 Microarchitecture](https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/software-optimization-guides/58455.zip) has no such remark about COMPRESS instructions.

Could you add some code that disables the AVX512 version on Zen 4, but keeps it enabled on Zen 5 and future Zen architectures?
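
A minimal sketch of such a check, written as standalone C++ rather than against HotSpot's actual `VM_Version` code. It assumes that Zen 4 reports CPUID family 19h and Zen 5 reports family 1Ah, and that any family 19h part with AVX-512 is Zen 4 (Zen 3 shares the family but has no AVX-512). The names `Vendor`, `display_family` and `use_avx512_sort` are hypothetical, and the single `has_avx512` flag stands in for whatever feature bits the intrinsic really requires.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; this is not HotSpot's VM_Version API.
enum class Vendor { Intel, AMD, Other };

// Assumption: Zen 4 (and Zen 3) report family 19h, Zen 5 reports family 1Ah.
constexpr uint32_t kAmdFamily19h = 0x19;

// Decode the display family from EAX of CPUID leaf 1.
// The base family is bits 11:8; the extended family (bits 27:20) is added
// only when the base family is 0xF.
constexpr uint32_t display_family(uint32_t cpuid1_eax) {
    return ((cpuid1_eax >> 8) & 0xF) == 0xF
        ? ((cpuid1_eax >> 8) & 0xF) + ((cpuid1_eax >> 20) & 0xFF)
        : ((cpuid1_eax >> 8) & 0xF);
}

// Decide whether to use the AVX512 sort path.
// has_avx512 is a stand-in for the real feature checks (an assumption).
constexpr bool use_avx512_sort(Vendor vendor, uint32_t cpuid1_eax, bool has_avx512) {
    return has_avx512 &&
           (vendor == Vendor::Intel ||
            // AMD: skip family 19h (Zen 4), allow newer families (Zen 5 and later).
            (vendor == Vendor::AMD && display_family(cpuid1_eax) > kAmdFamily19h));
}
```

Comparing on the family alone, rather than on individual model numbers, keeps the check short and means future Zen generations are enabled by default, which may or may not be the desired policy.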
-------------

PR Comment: https://git.openjdk.org/jdk/pull/16124#issuecomment-2495483841