On Thu, 17 Feb 2022 17:43:43 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar
>>   IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>>
>> Following are the performance number of a JMH micro included with the patch
>>
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>>
>> TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | --
>> 1024.00 | 510.41 | 1811.66 | 3.55 | 510.40 | 502.65 | 0.98
>> 2048.00 | 293.52 | 984.37 | 3.35 | 304.96 | 177.88 | 0.58
>> 1024.00 | 825.94 | 3387.64 | 4.10 | 750.77 | 1925.15 | 2.56
>> 2048.00 | 411.91 | 1942.87 | 4.72 | 412.22 | 1034.13 | 2.51
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   8279508: Fixing for windows failure.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4146:

> 4144:   vaddpd(xtmp1, src , xtmp1, vec_enc);
> 4145:   vrndscalepd(dst, xtmp1, 0x4, vec_enc);
> 4146:   evcvtpd2qq(dst, dst, vec_enc);

Why do we need vrndscalepd in between, could we not directly use cvtpd2qq after vaddpd?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094
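
For context on the question above, here is a minimal scalar sketch in Java of what is at stake. It is not the HotSpot code. It assumes that xtmp1 holds 0.5 before the vaddpd, that the vrndscalepd step ends up rounding toward negative infinity, and that a direct conversion would run under the default round-to-nearest-even mode. None of these is confirmed by the quoted snippet, and the class and method names are hypothetical.

```java
// Hypothetical scalar model of the two candidate sequences; not HotSpot code.
final class RoundSketch {

    // Models add 0.5, round toward negative infinity, then convert to long.
    // ASSUMPTION: the intermediate rounding step behaves like Math.floor.
    static long addThenFloor(double x) {
        return (long) Math.floor(x + 0.5);
    }

    // Models add 0.5, then convert directly to long.
    // ASSUMPTION: the conversion rounds to nearest, ties to even (Math.rint).
    static long addThenNearestEven(double x) {
        return (long) Math.rint(x + 0.5);
    }

    public static void main(String[] args) {
        double[] samples = {1.0, 1.2, 2.0, 2.5, -2.5};
        for (double x : samples) {
            System.out.println(x
                    + " Math.round=" + Math.round(x)
                    + " addThenFloor=" + addThenFloor(x)
                    + " addThenNearestEven=" + addThenNearestEven(x));
        }
    }
}
```

Under these assumptions the two models disagree for inputs such as 1.0 and 1.2, where only the floor-based variant matches Math.round. Whether the explicit rounding instruction is still needed in the patch would depend on the rounding mode in effect when the conversion executes.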