On Tue, 20 Jul 2021 09:57:07 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> The current Vector API Java-side implementation expresses the rotateLeft and
>> rotateRight operations using the following operations:
>>
>>     vec1 = lanewise(VectorOperators.LSHL, n)
>>     vec2 = lanewise(VectorOperators.LSHR, n)
>>     res = lanewise(VectorOperators.OR, vec1 , vec2)
>>
>> This patch moves the above handling from the Java side to the C2 compiler, which
>> facilitates dismantling the rotate operation if the target ISA does not support
>> a direct rotate instruction.
>>
>> AVX512 added vector rotate instructions vpro[rl][v][dq], which operate over
>> long and integer type vectors. For other cases (i.e. sub-word type vectors
>> or targets which do not support direct rotate operations), an instruction
>> sequence comprising vector SHIFT (LEFT/RIGHT) and vector OR is emitted.
>>
>> Please find below the performance data for the included JMH benchmark.
>> Machine: Cascade Lake Server (Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz)
>>
>> Benchmark | (bits) | (shift) | (size) | Baseline Score (ops/ms) | With Opts (ops/ms) | Gain
>> -- | -- | -- | -- | -- | -- | --
>> RotateBenchmark.testRotateLeftB | 128 | 7 | 256 | 3939.136 | 3836.133 | 0.973851372
>> RotateBenchmark.testRotateLeftB | 128 | 7 | 512 | 1984.231 | 1918.27 | 0.966757399
>> RotateBenchmark.testRotateLeftB | 128 | 15 | 256 | 3925.165 | 4043.842 | 1.030234907
>> RotateBenchmark.testRotateLeftB | 128 | 15 | 512 | 1962.723 | 1936.551 | 0.986665464
>> RotateBenchmark.testRotateLeftB | 128 | 31 | 256 | 3945.6 | 3817.883 | 0.967630525
>> RotateBenchmark.testRotateLeftB | 128 | 31 | 512 | 1944.458 | 1914.229 | 0.984453766
>> RotateBenchmark.testRotateLeftB | 256 | 7 | 256 | 4612.149 | 4514.874 | 0.978908964
>> RotateBenchmark.testRotateLeftB | 256 | 7 | 512 | 2296.252 | 2270.237 | 0.988670669
>> RotateBenchmark.testRotateLeftB | 256 | 15 | 256 | 4576.628 | 4515.53 | 0.986649996
>> RotateBenchmark.testRotateLeftB | 256 | 15 | 512 | 2288.278 | 2270.923 | 0.992415694
>> RotateBenchmark.testRotateLeftB | 256 | 31 | 256 | 4624.243 | 4511.46 | 0.975610495
>> RotateBenchmark.testRotateLeftB | 256 | 31 | 512 | 2305.459 | 2273.788 | 0.986262605
>> RotateBenchmark.testRotateLeftB | 512 | 7 | 256 | 7748.283 | 7777.105 | 1.003719792
>> RotateBenchmark.testRotateLeftB | 512 | 7 | 512 | 3906.214 | 3912.647 | 1.001646863
>> RotateBenchmark.testRotateLeftB | 512 | 15 | 256 | 7764.653 | 7763.482 | 0.999849188
>> RotateBenchmark.testRotateLeftB | 512 | 15 | 512 | 3916.061 | 3919.363 | 1.000843194
>> RotateBenchmark.testRotateLeftB | 512 | 31 | 256 | 7779.754 | 7770.239 | 0.998776954
>> RotateBenchmark.testRotateLeftB | 512 | 31 | 512 | 3916.471 | 3912.718 | 0.999041739
>> RotateBenchmark.testRotateLeftI | 128 | 7 | 256 | 4043.39 | 13461.814 | 3.329338501
>> RotateBenchmark.testRotateLeftI | 128 | 7 | 512 | 1996.217 | 6455.425 | 3.233829288
>> RotateBenchmark.testRotateLeftI | 128 | 15 | 256 | 4028.614 | 13077.277 | 3.246098286
>> RotateBenchmark.testRotateLeftI | 128 | 15 | 512 | 1997.612 | 6452.918 | 3.230315997
>> RotateBenchmark.testRotateLeftI | 128 | 31 | 256 | 4123.357 | 13079.045 | 3.171940969
>> RotateBenchmark.testRotateLeftI | 128 | 31 | 512 | 2003.356 | 6452.716 | 3.22095324
>> RotateBenchmark.testRotateLeftI | 256 | 7 | 256 | 7666.949 | 25658.625 | 3.34665393
>> RotateBenchmark.testRotateLeftI | 256 | 7 | 512 | 3855.826 | 12278.106 | 3.18429981
>> RotateBenchmark.testRotateLeftI | 256 | 15 | 256 | 7670.901 | 24625.466 | 3.210244272
>> RotateBenchmark.testRotateLeftI | 256 | 15 | 512 | 3765.786 | 12272.771 | 3.259019764
>> RotateBenchmark.testRotateLeftI | 256 | 31 | 256 | 7660.599 | 25678.864 | 3.352069988
>> RotateBenchmark.testRotateLeftI | 256 | 31 | 512 | 3773.401 | 12006.469 | 3.181869353
>> RotateBenchmark.testRotateLeftI | 512 | 7 | 256 | 11900.948 | 31242.989 | 2.625252123
>> RotateBenchmark.testRotateLeftI | 512 | 7 | 512 | 5830.878 | 15727.149 | 2.697217983
>> RotateBenchmark.testRotateLeftI | 512 | 15 | 256 | 12171.847 | 33180.067 | 2.72596813
>> RotateBenchmark.testRotateLeftI | 512 | 15 | 512 | 5830.544 | 16740.182 | 2.871118372
>> RotateBenchmark.testRotateLeftI | 512 | 31 | 256 | 11909.553 | 31250.882 | 2.624018047
>> RotateBenchmark.testRotateLeftI | 512 | 31 | 512 | 5846.747 | 15738.831 | 2.691895339
>> RotateBenchmark.testRotateLeftL | 128 | 7 | 256 | 2047.243 | 6888.484 | 3.364761291
>> RotateBenchmark.testRotateLeftL | 128 | 7 | 512 | 1005.029 | 3245.931 | 3.229688895
>> RotateBenchmark.testRotateLeftL | 128 | 15 | 256 | 1996.921 | 6985.256 | 3.498013191
>> RotateBenchmark.testRotateLeftL | 128 | 15 | 512 | 986.906 | 3217.778 | 3.260470602
>> RotateBenchmark.testRotateLeftL | 128 | 31 | 256 | 1999.06 | 6977.672 | 3.490476524
>> RotateBenchmark.testRotateLeftL | 128 | 31 | 512 | 987.258 | 3236.63 | 3.278403416
>> RotateBenchmark.testRotateLeftL | 256 | 7 | 256 | 3752.412 | 12995.954 | 3.4633601
>> RotateBenchmark.testRotateLeftL | 256 | 7 | 512 | 1824.093 | 5809.576 | 3.184912173
>> RotateBenchmark.testRotateLeftL | 256 | 15 | 256 | 3759.99 | 13262.631 | 3.52730486
>> RotateBenchmark.testRotateLeftL | 256 | 15 | 512 | 1823.393 | 5803.872 | 3.183006626
>> RotateBenchmark.testRotateLeftL | 256 | 31 | 256 | 3757.134 | 13284.633 | 3.535842214
>> RotateBenchmark.testRotateLeftL | 256 | 31 | 512 | 1822.192 | 5824.178 | 3.196248255
>> RotateBenchmark.testRotateLeftL | 512 | 7 | 256 | 5794.005 | 15567.753 | 2.686872552
>> RotateBenchmark.testRotateLeftL | 512 | 7 | 512 | 2969.393 | 7694.79 | 2.591368
>> RotateBenchmark.testRotateLeftL | 512 | 15 | 256 | 5817.292 | 15726.597 | 2.703422314
>> RotateBenchmark.testRotateLeftL | 512 | 15 | 512 | 2944.655 | 7664.954 | 2.603005785
>> RotateBenchmark.testRotateLeftL | 512 | 31 | 256 | 5822.131 | 16718.64 | 2.871567129
>> RotateBenchmark.testRotateLeftL | 512 | 31 | 512 | 2944.763 | 7657.814 | 2.600485676
>> RotateBenchmark.testRotateLeftS | 128 | 7 | 256 | 8006.155 | 7976.701 | 0.99632108
>> RotateBenchmark.testRotateLeftS | 128 | 7 | 512 | 4031.753 | 4003.43 | 0.992975016
>> RotateBenchmark.testRotateLeftS | 128 | 15 | 256 | 8003.879 | 7952.752 | 0.993612222
>> RotateBenchmark.testRotateLeftS | 128 | 15 | 512 | 4026.359 | 4014.757 | 0.997118488
>> RotateBenchmark.testRotateLeftS | 128 | 31 | 256 | 8000.842 | 7995.733 | 0.999361442
>> RotateBenchmark.testRotateLeftS | 128 | 31 | 512 | 4044.421 | 4007.426 | 0.990852832
>> RotateBenchmark.testRotateLeftS | 256 | 7 | 256 | 15078.471 | 15034.395 | 0.997076892
>> RotateBenchmark.testRotateLeftS | 256 | 7 | 512 | 7236.509 | 7620.334 | 1.053040078
>> RotateBenchmark.testRotateLeftS | 256 | 15 | 256 | 15093.661 | 15024.17 | 0.995396014
>> RotateBenchmark.testRotateLeftS | 256 | 15 | 512 | 7308.568 | 7724.381 | 1.056893909
>> RotateBenchmark.testRotateLeftS | 256 | 31 | 256 | 15332.233 | 15432.113 | 1.006514381
>> RotateBenchmark.testRotateLeftS | 256 | 31 | 512 | 7317.18 | 7626.679 | 1.042297579
>> RotateBenchmark.testRotateLeftS | 512 | 7 | 256 | 24079.012 | 23939.263 | 0.994196232
>> RotateBenchmark.testRotateLeftS | 512 | 7 | 512 | 11441.41 | 11921.21 | 1.041935391
>> RotateBenchmark.testRotateLeftS | 512 | 15 | 256 | 23563.675 | 23590.959 | 1.001157884
>> RotateBenchmark.testRotateLeftS | 512 | 15 | 512 | 11418.634 | 11949.391 | 1.046481654
>> RotateBenchmark.testRotateLeftS | 512 | 31 | 256 | 24035.69 | 23595.385 | 0.9816812
>> RotateBenchmark.testRotateLeftS | 512 | 31 | 512 | 11668.091 | 11899.536 | 1.019835721
>> RotateBenchmark.testRotateRightB | 128 | 7 | 256 | 3852.421 | 3816.521 | 0.990681185
>> RotateBenchmark.testRotateRightB | 128 | 7 | 512 | 1956.766 | 1923.638 | 0.983070025
>> RotateBenchmark.testRotateRightB | 128 | 15 | 256 | 3899.136 | 4038.945 | 1.035856405
>> RotateBenchmark.testRotateRightB | 128 | 15 | 512 | 1957.733 | 2030.973 | 1.037410617
>> RotateBenchmark.testRotateRightB | 128 | 31 | 256 | 3902.5 | 4043.736 | 1.03619116
>> RotateBenchmark.testRotateRightB | 128 | 31 | 512 | 1957.728 | 1920.434 | 0.980950367
>> RotateBenchmark.testRotateRightB | 256 | 7 | 256 | 4565.887 | 4515.083 | 0.988873137
>> RotateBenchmark.testRotateRightB | 256 | 7 | 512 | 2300.057 | 2278.065 | 0.990438498
>> RotateBenchmark.testRotateRightB | 256 | 15 | 256 | 4570.754 | 4527.692 | 0.990578797
>> RotateBenchmark.testRotateRightB | 256 | 15 | 512 | 2300.524 | 2268.659 | 0.986148808
>> RotateBenchmark.testRotateRightB | 256 | 31 | 256 | 4577.569 | 4513.29 | 0.98595783
>> RotateBenchmark.testRotateRightB | 256 | 31 | 512 | 2304.335 | 2273.178 | 0.986478962
>> RotateBenchmark.testRotateRightB | 512 | 7 | 256 | 7772.483 | 7842.671 | 1.009030319
>> RotateBenchmark.testRotateRightB | 512 | 7 | 512 | 3907.265 | 3917.325 | 1.002574691
>> RotateBenchmark.testRotateRightB | 512 | 15 | 256 | 7855.653 | 7865.25 | 1.001221668
>> RotateBenchmark.testRotateRightB | 512 | 15 | 512 | 3909.845 | 3976.813 | 1.017128045
>> RotateBenchmark.testRotateRightB | 512 | 31 | 256 | 7746.765 | 7870.159 | 1.015928455
>> RotateBenchmark.testRotateRightB | 512 | 31 | 512 | 3919.596 | 3981.934 | 1.01590419
>> RotateBenchmark.testRotateRightI | 128 | 7 | 256 | 4125.151 | 13056.878 | 3.165187893
>> RotateBenchmark.testRotateRightI | 128 | 7 | 512 | 2045.201 | 6501.447 | 3.17887924
>> RotateBenchmark.testRotateRightI | 128 | 15 | 256 | 4111.736 | 13318.124 | 3.23905134
>> RotateBenchmark.testRotateRightI | 128 | 15 | 512 | 2055.355 | 6497.289 | 3.161151723
>> RotateBenchmark.testRotateRightI | 128 | 31 | 256 | 4109.353 | 13073.3 | 3.181352393
>> RotateBenchmark.testRotateRightI | 128 | 31 | 512 | 2055.431 | 6463.902 | 3.14479153
>> RotateBenchmark.testRotateRightI | 256 | 7 | 256 | 7804.976 | 24585.962 | 3.150036848
>> RotateBenchmark.testRotateRightI | 256 | 7 | 512 | 3815.818 | 11985.145 | 3.140911071
>> RotateBenchmark.testRotateRightI | 256 | 15 | 256 | 7644.977 | 25863.841 | 3.383115606
>> RotateBenchmark.testRotateRightI | 256 | 15 | 512 | 3822.508 | 12280.58 | 3.212702236
>> RotateBenchmark.testRotateRightI | 256 | 31 | 256 | 7709.635 | 25655.108 | 3.327668301
>> RotateBenchmark.testRotateRightI | 256 | 31 | 512 | 3801.5 | 12271.65 | 3.228107326
>> RotateBenchmark.testRotateRightI | 512 | 7 | 256 | 12223.711 | 31239.788 | 2.555671351
>> RotateBenchmark.testRotateRightI | 512 | 7 | 512 | 5973.571 | 16740.852 | 2.802486486
>> RotateBenchmark.testRotateRightI | 512 | 15 | 256 | 12205.47 | 31248.025 | 2.560165647
>> RotateBenchmark.testRotateRightI | 512 | 15 | 512 | 5966.513 | 15728.168 | 2.6360737
>> RotateBenchmark.testRotateRightI | 512 | 31 | 256 | 12209.405 | 33181.105 | 2.71766765
>> RotateBenchmark.testRotateRightI | 512 | 31 | 512 | 5981.527 | 15727.496 | 2.629344647
>> RotateBenchmark.testRotateRightL | 128 | 7 | 256 | 2054.509 | 6980.849 | 3.397818652
>> RotateBenchmark.testRotateRightL | 128 | 7 | 512 | 997.375 | 3242.374 | 3.250907633
>> RotateBenchmark.testRotateRightL | 128 | 15 | 256 | 2051.459 | 6892.389 | 3.359749817
>> RotateBenchmark.testRotateRightL | 128 | 15 | 512 | 1002.906 | 3223.342 | 3.21400211
>> RotateBenchmark.testRotateRightL | 128 | 31 | 256 | 2044.749 | 6984.157 | 3.415654929
>> RotateBenchmark.testRotateRightL | 128 | 31 | 512 | 1004.273 | 3237.496 | 3.22372104
>> RotateBenchmark.testRotateRightL | 256 | 7 | 256 | 3811.551 | 13347.75 | 3.501920872
>> RotateBenchmark.testRotateRightL | 256 | 7 | 512 | 1892.883 | 5840.85 | 3.085689924
>> RotateBenchmark.testRotateRightL | 256 | 15 | 256 | 3821.705 | 14034.823 | 3.672398314
>> RotateBenchmark.testRotateRightL | 256 | 15 | 512 | 1799.193 | 5817.533 | 3.233412424
>> RotateBenchmark.testRotateRightL | 256 | 31 | 256 | 3816.666 | 14022.31 | 3.673968327
>> RotateBenchmark.testRotateRightL | 256 | 31 | 512 | 1796.649 | 5824.13 | 3.241662673
>> RotateBenchmark.testRotateRightL | 512 | 7 | 256 | 5943.986 | 15586.254 | 2.622188881
>> RotateBenchmark.testRotateRightL | 512 | 7 | 512 | 3022.686 | 7662.241 | 2.534911334
>> RotateBenchmark.testRotateRightL | 512 | 15 | 256 | 5958.008 | 15726.859 | 2.639616966
>> RotateBenchmark.testRotateRightL | 512 | 15 | 512 | 2998.469 | 7654.703 | 2.552870482
>> RotateBenchmark.testRotateRightL | 512 | 31 | 256 | 5937.491 | 15741.207 | 2.651154671
>> RotateBenchmark.testRotateRightL | 512 | 31 | 512 | 3014.699 | 7656.837 | 2.539834657
>> RotateBenchmark.testRotateRightS | 128 | 7 | 256 | 8172.896 | 8003.474 | 0.979270261
>> RotateBenchmark.testRotateRightS | 128 | 7 | 512 | 4111.074 | 4047.267 | 0.984479238
>> RotateBenchmark.testRotateRightS | 128 | 15 | 256 | 8225.79 | 8040.421 | 0.9774649
>> RotateBenchmark.testRotateRightS | 128 | 15 | 512 | 4129.801 | 4011.919 | 0.971455767
>> RotateBenchmark.testRotateRightS | 128 | 31 | 256 | 8176.102 | 8052.686 | 0.984905276
>> RotateBenchmark.testRotateRightS | 128 | 31 | 512 | 4117.735 | 4046.522 | 0.982705784
>> RotateBenchmark.testRotateRightS | 256 | 7 | 256 | 15213.617 | 15169.51 | 0.997100821
>> RotateBenchmark.testRotateRightS | 256 | 7 | 512 | 7530.289 | 7625.581 | 1.012654494
>> RotateBenchmark.testRotateRightS | 256 | 15 | 256 | 15238.384 | 15069.978 | 0.988948566
>> RotateBenchmark.testRotateRightS | 256 | 15 | 512 | 7275.098 | 7620.764 | 1.047513587
>> RotateBenchmark.testRotateRightS | 256 | 31 | 256 | 15299.821 | 15043.765 | 0.983264118
>> RotateBenchmark.testRotateRightS | 256 | 31 | 512 | 7273.028 | 7630.97 | 1.04921499
>> RotateBenchmark.testRotateRightS | 512 | 7 | 256 | 23998.152 | 23920.046 | 0.996745333
>> RotateBenchmark.testRotateRightS | 512 | 7 | 512 | 11582.679 | 11916.382 | 1.02881052
>> RotateBenchmark.testRotateRightS | 512 | 15 | 256 | 23982.797 | 23434.756 | 0.977148579
>> RotateBenchmark.testRotateRightS | 512 | 15 | 512 | 11629.806 | 11918.759 | 1.0248459
>> RotateBenchmark.testRotateRightS | 512 | 31 | 256 | 23988.549 | 23475.629 | 0.978618132
>> RotateBenchmark.testRotateRightS | 512 | 31 | 512 | 11650.146 | 11916.47 | 1.022860143
>
> Jatin Bhateja has updated the pull request with a new target base due to a
> merge or a rebase. The pull request now contains 19 commits:
>
>  - 8266054: Re-designing benchmark to remove noise.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8266054
>  - 8266054: Formal argument name change to be more appropriate.
>  - 8266054: Review comments resolution.
>  - 8266054: Incorporating styling changes based on reviews.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8266054
>  - Merge http://github.com/openjdk/jdk into JDK-8266054
>  - Merge http://github.com/openjdk/jdk into JDK-8266054
>  - Merge http://github.com/openjdk/jdk into JDK-8266054
>  - Merge branch 'JDK-8266054' of http://github.com/jatin-bhateja/jdk into
>    JDK-8266054
>  - ... and 9 more:
>    https://git.openjdk.java.net/jdk/compare/a8f15427...b20404e2

No comma after "2021" in `test/micro/org/openjdk/bench/jdk/incubator/vector/RotateBenchmark.java`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3720
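
For reference, the shift-and-OR decomposition mentioned in the quoted PR summary can be sketched in scalar Java. This is only an illustration of the general technique, not code from the patch: the class name `RotateSketch` and its methods are hypothetical, and reducing the rotate count modulo the lane width is an assumption made here so that counts such as 31 stay well defined for 8-bit lanes.

```java
// Hypothetical scalar sketch (not from the patch): a rotate expressed as
// two shifts plus an OR, the fallback shape used when no direct rotate
// instruction is available.
class RotateSketch {

    // 32-bit lane: rotate left by n bits.
    static int rotateLeftInt(int x, int n) {
        n &= 31;                              // assumed: count taken modulo lane width
        // For n == 0 Java masks the shift count 32 to 0, so the result is x | x == x.
        return (x << n) | (x >>> (32 - n));
    }

    // 32-bit lane: a right rotate is a left rotate by the negated count.
    static int rotateRightInt(int x, int n) {
        return rotateLeftInt(x, -n);
    }

    // 8-bit (sub-word) lane: widen to int, shift both ways, OR, narrow back.
    static byte rotateLeftByte(byte x, int n) {
        n &= 7;                               // assumed: count taken modulo lane width
        int u = x & 0xFF;                     // zero-extend so >>> brings in zeros
        return (byte) ((u << n) | (u >>> (8 - n)));
    }

    public static void main(String[] args) {
        // Cross-check the int version against the JDK's own rotate.
        int[] samples = {0, 1, -1, 0x80000001, 0x12345678};
        for (int x : samples) {
            for (int n = 0; n < 64; n++) {
                if (rotateLeftInt(x, n) != Integer.rotateLeft(x, n)
                        || rotateRightInt(x, n) != Integer.rotateRight(x, n)) {
                    throw new AssertionError("mismatch for x=" + x + ", n=" + n);
                }
            }
        }
        System.out.println("int rotates match Integer.rotateLeft/rotateRight");
    }
}
```

The byte variant shows why sub-word lanes need the explicit widening step: without zero-extension, the logical right shift would pull in sign bits from the promoted int.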