Re: com.sun.crypto.provider.GHASH performance fix

Tim Whittington Thu, 20 Nov 2014 14:39:02 -0800

On 19/08/2014, at 12:32 am, Florian Weimer <fwei...@redhat.com> wrote:


> This change addresses a severe performance regression, first introduced in 
> JDK 8, triggered by the negotiation of a GCM cipher suite in the TLS 
> implementation.  This regression is a result of the poor performance of the 
> implementation of the GHASH function.
> 
> I first tried to eliminate just the allocations in blockMult while still 
> retaining the byte arrays.  This did not substantially increase performance 
> in my micro-benchmark.  I then replaced the 16-byte arrays with longs, 
> replaced the inner loops with direct bit fiddling on the longs, eliminated 
> data-dependent conditionals (which are generally frowned upon in 
> cryptographic algorithms due to the risk of timing attacks), and split the 
> main loop in two, one for each half of the hash state.  This is the result:
> 
>  <https://fweimer.fedorapeople.org/openjdk/ghash-performance/>
> 
> The new code is roughly ten times faster.  My test download over HTTPS is no 
> longer CPU-bound, and GHASH hardly shows up in profiles anymore. (That's why 
> I didn't consider further changes, lookup tables in particular.)  
> Micro-benchmarking shows roughly a ten-fold increase in throughput, but this 
> probably underestimates the gain because of the high allocation rate of the 
> old code.
> 
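
For readers following the thread, the approach described in the quoted message 
(two longs instead of a 16-byte array, masks instead of data-dependent 
branches, one loop per half of the input) can be sketched roughly as follows. 
This is an illustrative sketch only, not the code from the linked patch: the 
class name GhashSketch and the methods multiply and ghash are hypothetical, 
and the bit ordering and reduction constant are assumed to follow NIST 
SP 800-38D. The vector in main is Test Case 2 from the GCM specification.

```java
final class GhashSketch {
    // High 64 bits of R = 11100001 || 0^120 (assumed per NIST SP 800-38D).
    private static final long R_HIGH = 0xe100000000000000L;

    // Multiplies X = (x0, x1) by H = (h0, h1) in GF(2^128), GCM bit order.
    // x0/h0 hold the first 8 bytes (big-endian), x1/h1 the last 8 bytes.
    static long[] multiply(long x0, long x1, long h0, long h1) {
        long z0 = 0, z1 = 0;   // accumulator Z
        long v0 = h0, v1 = h1; // running value V = H * x^i

        // First half: bits of x0, most significant bit first.
        for (int i = 0; i < 64; i++) {
            long mask = x0 >> 63;          // all ones if the current bit is set
            z0 ^= v0 & mask;
            z1 ^= v1 & mask;
            long reduce = -(v1 & 1L);      // all ones if V's last bit is set
            v1 = (v1 >>> 1) | (v0 << 63);  // V >>= 1 across both halves
            v0 = (v0 >>> 1) ^ (reduce & R_HIGH);
            x0 <<= 1;
        }
        // Second half: same steps for the bits of x1.
        for (int i = 0; i < 64; i++) {
            long mask = x1 >> 63;
            z0 ^= v0 & mask;
            z1 ^= v1 & mask;
            long reduce = -(v1 & 1L);
            v1 = (v1 >>> 1) | (v0 << 63);
            v0 = (v0 >>> 1) ^ (reduce & R_HIGH);
            x1 <<= 1;
        }
        return new long[] { z0, z1 };
    }

    // GHASH over whole 16-byte blocks, each given as two consecutive longs.
    static long[] ghash(long h0, long h1, long[] blocks) {
        long s0 = 0, s1 = 0;
        for (int i = 0; i + 1 < blocks.length; i += 2) {
            long[] p = multiply(s0 ^ blocks[i], s1 ^ blocks[i + 1], h0, h1);
            s0 = p[0];
            s1 = p[1];
        }
        return new long[] { s0, s1 };
    }

    public static void main(String[] args) {
        // H, ciphertext block, and length block (0 AAD bits, 128 text bits).
        long[] r = ghash(0x66e94bd4ef8a2c3bL, 0x884cfa59ca342b2eL, new long[] {
            0x0388dace60b6a392L, 0xf328c2b971b2fe78L,
            0x0000000000000000L, 0x0000000000000080L });
        // Prints f38cbb1ad69223dcc3457ae5b6b0f885
        System.out.println(String.format("%016x%016x", r[0], r[1]));
    }
}
```

Unlike the change under discussion, this sketch still allocates a small array 
per block to return the result; keeping the state in two long fields would 
avoid that. Comparing the output against a published vector like the one in 
main is the same kind of check that TestGHASH.java performs.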

Hi Florian

It looks like your GHASH implementation, as posted, does not pass the tests in 
TestGHASH.java. The existing JDK implementation does pass them, and the Bouncy 
Castle GHASH produces the same results as the JDK one.

Can you reproduce that?

cheers
tim


> The performance improvement on 32-bit architectures is probably a bit less, 
> but I suspect that using four ints instead of two longs would penalize 64-bit 
> architectures.
> 
> -- 
> Florian Weimer / Red Hat Product Security
