On Wed, 19 Jan 2022 17:38:25 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:
>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>>
>> Following are the performance number of a JMH micro included with the patch
>>
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>>
>> | | | BASELINE AVX2 | WithOpt AVX2 | Gain (opt/baseline) | Baseline AVX3 | Withopt AVX3 | Gain (opt/baseline) |
>> | -- | -- | -- | -- | -- | -- | -- | -- |
>> | Benchmark | ARRAYLEN | Score (ops/ms) | Score (ops/ms) | | Score (ops/ms) | Score (ops/ms) | |
>> | FpRoundingBenchmark.test_round_double | 1024 | 518.532 | 1364.066 | 2.630630318 | 512.908 | 4292.11 | 8.368186887 |
>> | FpRoundingBenchmark.test_round_double | 2048 | 270.137 | 830.986 | 3.076165057 | 273.159 | 2459.116 | 9.002507697 |
>> | FpRoundingBenchmark.test_round_float | 1024 | 752.436 | 7780.905 | 10.34095259 | 752.49 | 9506.694 | 12.63364829 |
>> | FpRoundingBenchmark.test_round_float | 2048 | 389.499 | 4113.046 | 10.55983712 | 389.63 | 4863.673 | 12.48279907 |
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
>   8279508: Adding a test for scalar intrinsification.

The JVM currently initializes the x86 mxcsr to round to nearest even, see below in stubGenerator_x86_64.cpp:

    // Round to nearest (even), 64-bit mode, exceptions masked
    StubRoutines::x86::_mxcsr_std = 0x1F80;

The above works for Math.rint which is specified to be round to nearest even.
Please see: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html : section 4.8.4

The rounding mode needed for Math.round is round to positive infinity which needs a different x86 mxcsr initialization (0x5F80).

-------------

PR: https://git.openjdk.java.net/jdk/pull/7094
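
For reference, the difference between the two rounding behaviours discussed above can be seen at the Java level. The following is only a rough sketch: the class name RoundVsRint and the helpers roundHalfUp and rcField are made up for illustration and are not part of the patch or of HotSpot. The floor(x + 0.5) formulation is an assumption that matches Math.round for the tie values sampled here but not for every input (for example, the largest double below 0.5). rcField extracts the rounding-control field (bits 13-14) from an mxcsr value.

```java
public class RoundVsRint {
    // Hypothetical helper: approximates Math.round(double) as floor(x + 0.5),
    // so ties go toward positive infinity. Only valid for finite values of
    // modest magnitude; it is NOT an exact replacement for Math.round.
    static long roundHalfUp(double x) {
        return (long) Math.floor(x + 0.5);
    }

    // Hypothetical helper: extracts the rounding-control (RC) field,
    // bits 13-14 of an mxcsr value.
    // 0 = nearest even, 1 = down, 2 = up, 3 = toward zero.
    static int rcField(int mxcsr) {
        return (mxcsr >> 13) & 0x3;
    }

    public static void main(String[] args) {
        double[] ties = {0.5, 1.5, 2.5, -0.5, -1.5, -2.5};
        for (double x : ties) {
            // Math.rint resolves ties to the even neighbour,
            // Math.round resolves ties toward positive infinity.
            System.out.printf("x=%5.1f  rint=%5.1f  round=%3d  floor(x+0.5)=%3d%n",
                    x, Math.rint(x), Math.round(x), roundHalfUp(x));
        }
        System.out.println("RC(0x1F80) = " + rcField(0x1F80)); // nearest even
        System.out.println("RC(0x5F80) = " + rcField(0x5F80)); // round up
    }
}
```

On the tie inputs, Math.rint(2.5) yields 2.0 while Math.round(2.5) yields 3, and Math.rint(-1.5) yields -2.0 while Math.round(-1.5) yields -1. The two mxcsr constants differ only in the RC field.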