Hi Tony, I have updated my code to address the performance regression in smaller data sizes. Please find the updated webrev and JBS link attached.
JBS link: https://bugs.openjdk.java.net/browse/JDK-8214074 Webrev link: http://cr.openjdk.java.net/~svkamath/ghash/webrev02/ Thank you very much for your time and assistance. Please let me know if you have any questions. Regards, Smita -----Original Message----- From: Anthony Scarpino [mailto:anthony.scarp...@oracle.com] Sent: Wednesday, December 5, 2018 10:56 AM To: Kamath, Smita <smita.kam...@intel.com>; 'Vladimir Kozlov' <vladimir.koz...@oracle.com> Cc: Viswanathan, Sandhya <sandhya.viswanat...@intel.com>; OpenJDK Security <security-dev@openjdk.java.net>; hotspot compiler <hotspot-compiler-...@openjdk.java.net> Subject: Re: RFR(S)JDK-8214074: Ghash optimization using AVX instructions Smita, I did some performance testing on the changes and noticed that while the larger data sizes perform better, there is a drop in the smaller data sizes. 16byte data sizes saw a drop from 2.4m ops/sec to 1.7m ops/sec and 64bytes dropped from 2.1m ops/sec to 1.5m ops/sec. At 256 bytes it is roughly equal. At 16k is the best performance increase from 60k ops/sec to 130 ops/sec Are the AVX instructions slower to setup and therefore affect smaller data sizes? Or maybe the larger array allocation in GHASH.java? Tony On 11/29/18 12:08 PM, Anthony Scarpino wrote: > [removed core-libs, added security-dev] > > I'm fine with the jdk changes and solaris-sparc appears to work fine > with the existing security and intrinsic tests. I do not feel > comfortable reviewing the hotspot changes, so either Vladimir or > someone else from the hotspot team would need to look at it. > > Tony > > On 11/20/18 6:45 PM, Kamath, Smita wrote: >> Hi Tony, >> >> Thanks for your comments. >> >> 1) This intrinsic is also used with solaris-sparc, has there been a >> sanity check by anyone to make sure this does not break the sparc >> intrinsic? It may work as the code is now since the sparc intrinsic >> will only use the first two longs in the subkeyHtbl. >> Would it be possible to help with this sanity check? I don't have >> the required set-up to test this scenario. >> >> 2) I have changed the code so that subkeyH corresponds to the first >> two entries of subkeyHtbl. Please find the updated webrev link. >> >> Bug link: https://bugs.openjdk.java.net/browse/JDK-8214074 >> >> Webrev link: http://cr.openjdk.java.net/~svkamath/ghash/webrev01/ >> >> Thanks and Regards, >> Smita >> >> >> >> -----Original Message----- >> From: Anthony Scarpino [mailto:anthony.scarp...@oracle.com] >> Sent: Tuesday, November 20, 2018 2:05 PM >> To: Kamath, Smita <smita.kam...@intel.com>; 'Vladimir Kozlov' >> <vladimir.koz...@oracle.com> >> Cc: Viswanathan, Sandhya <sandhya.viswanat...@intel.com>; >> core-libs-...@openjdk.java.net; hotspot compiler >> <hotspot-compiler-...@openjdk.java.net> >> Subject: Re: RFR(S)JDK-8214074: Ghash optimization using AVX >> instructions >> >> On 11/19/18 12:50 PM, Kamath, Smita wrote: >>> Hi Vladimir, >>> >>> I'd like to contribute an optimization for GHASH Algorithm using AVX >>> Instructions. I have tested this optimization on SKX x86_64 platform >>> and it shows ~20-30% performance improvement for larger message >>> sizes (for example 8k). >>> >>> I, smita.kam...@intel.com <mailto:smita.kam...@intel.com> , Shay >>> Gueuron, (shay.gue...@intel.com <mailto:shay.gue...@intel.com>)and >>> Regev Shemy (regev.sh...@intel.com <mailto:regev.sh...@intel.com>) >>> are contributors to this code. >>> >>> Link to Bug: https://bugs.openjdk.java.net/browse/JDK-8214074 >>> >>> Link to webrev: http://cr.openjdk.java.net/~svkamath/ghash/webrev/ >>> >>> For testing the implementation, I have executed TestAESMain.java. I >>> have executed Jtreg tests and tested this code on 64 bit Windows and >>> Linux platforms. >>> >>> Please review and let me know if you have any comments. >>> >>> Thanks and Regards, >>> >>> Smita >>> >> >> Hi, >> >> This intrinsic is also used with solaris-sparc, has there been a >> sanity check by anyone to make sure this does not break the sparc >> intrinsic? >> It may work as the code is now since the sparc intrinsic will only >> use the first two longs in the subkeyHtbl. >> >> In that same idea, have you consider combining the subkeyH to be the >> first two of subkeyHtbl for the non-avx operations? It would >> eliminate an extra two longs per instance. >> >> Tony >> >