On Thu, 27 Oct 2022 05:10:59 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> vpaprotsk has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   extra whitespace character
>
> src/java.base/share/classes/com/sun/crypto/provider/Poly1305.java line 175:
> 
>> 173:             // Choice of 1024 is arbitrary, need enough data blocks to amortize conversion overhead
>> 174:             // and not affect platforms without intrinsic support
>> 175:             int blockMultipleLength = (len/BLOCK_LENGTH) * BLOCK_LENGTH;
> 
> Since Poly1305 processes 16-byte chunks, a strength-reduced version of the above 
> expression could be len & ~(BLOCK_LENGTH - 1)

I have no issue with either version; I was mostly thinking about code 
clarity. I think your version is 'more reliable', so I'll switch to it, 
thanks.
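
For reference, a minimal sketch of why the two expressions agree. It assumes `BLOCK_LENGTH` is 16 (a power of two) and that `len` is non-negative; the class and method names are hypothetical and not part of the patch.

```java
public class BlockMultipleDemo {
    // Assumption: Poly1305 block size of 16 bytes, a power of two.
    static final int BLOCK_LENGTH = 16;

    // Original form: integer division truncates, then scale back up.
    static int roundDownDivide(int len) {
        return (len / BLOCK_LENGTH) * BLOCK_LENGTH;
    }

    // Strength-reduced form: clear the low log2(BLOCK_LENGTH) bits.
    static int roundDownMask(int len) {
        return len & ~(BLOCK_LENGTH - 1);
    }

    public static void main(String[] args) {
        // Both forms agree for every non-negative length checked here.
        for (int len = 0; len <= 4096; len++) {
            if (roundDownDivide(len) != roundDownMask(len)) {
                throw new AssertionError("mismatch at len=" + len);
            }
        }
        System.out.println(roundDownMask(1000)); // prints 992
    }
}
```

The forms differ only for negative `len` (division truncates toward zero, the mask rounds toward negative infinity), which should not occur for a buffer length.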

> test/micro/org/openjdk/bench/javax/crypto/full/Poly1305DigestBench.java line 94:
> 
>> 92:             throw new RuntimeException(ex);
>> 93:         }
>> 94:     }
> 
> On CLX the patch shows a performance regression of about 10% for data sizes 
> 1024 and 2048.
> 
> CLX (Non-IFMA target)
> 
> Baseline (JDK-20):
> 
> Benchmark                   (dataSize)  (provider)   Mode  Cnt        Score   Error  Units
> Poly1305DigestBench.digest          64              thrpt    2  3128928.978          ops/s
> Poly1305DigestBench.digest         256              thrpt    2  1526452.083          ops/s
> Poly1305DigestBench.digest        1024              thrpt    2   509267.401          ops/s
> Poly1305DigestBench.digest        2048              thrpt    2   305784.922          ops/s
> Poly1305DigestBench.digest        4096              thrpt    2   142175.885          ops/s
> Poly1305DigestBench.digest        8192              thrpt    2    72142.906          ops/s
> Poly1305DigestBench.digest       16384              thrpt    2    36357.000          ops/s
> Poly1305DigestBench.digest     1048576              thrpt    2      676.142          ops/s
> 
> 
> With opt:
> Benchmark                   (dataSize)  (provider)   Mode  Cnt        Score   Error  Units
> Poly1305DigestBench.digest          64              thrpt    2  3136204.416          ops/s
> Poly1305DigestBench.digest         256              thrpt    2  1683221.124          ops/s
> Poly1305DigestBench.digest        1024              thrpt    2   457432.172          ops/s
> Poly1305DigestBench.digest        2048              thrpt    2   277563.817          ops/s
> Poly1305DigestBench.digest        4096              thrpt    2   149393.357          ops/s
> Poly1305DigestBench.digest        8192              thrpt    2    79463.734          ops/s
> Poly1305DigestBench.digest       16384              thrpt    2    41083.730          ops/s
> Poly1305DigestBench.digest     1048576              thrpt    2      705.419          ops/s

Odd, I measured it on `11th Gen Intel(R) Core(TM) i7-11700 @ 2.50GHz`; I will 
measure again.
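
To make the comparison easier to read, here is a small sketch that computes the per-size relative change from the two CLX tables quoted above. The scores are copied from those tables; the class and method names are hypothetical. With only two iterations and no error column, small differences are likely within noise.

```java
public class Poly1305BenchDelta {
    // Data sizes and throughput scores (ops/s) copied from the quoted CLX tables.
    static final int[] DATA_SIZES = {64, 256, 1024, 2048, 4096, 8192, 16384, 1048576};
    static final double[] BASELINE = {
        3128928.978, 1526452.083, 509267.401, 305784.922,
        142175.885, 72142.906, 36357.000, 676.142
    };
    static final double[] WITH_OPT = {
        3136204.416, 1683221.124, 457432.172, 277563.817,
        149393.357, 79463.734, 41083.730, 705.419
    };

    // Relative throughput change in percent; negative means a regression.
    static double percentChange(double baseline, double patched) {
        return (patched - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        for (int i = 0; i < DATA_SIZES.length; i++) {
            System.out.printf("%8d  %+7.2f%%%n",
                    DATA_SIZES[i], percentChange(BASELINE[i], WITH_OPT[i]));
        }
    }
}
```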

-------------

PR: https://git.openjdk.org/jdk/pull/10582
