Re: [openssl.org #3054] [PATCH] Efficient and side channel analysis resistant 1024-bit and 2048-bit modular exponentiation, optimizing RSA, DSA and DH of compatible sizes, for AVX2 capable x86_64 platforms

Andy Polyakov via RT Fri, 05 Jul 2013 15:17:19 -0700

Hi,

> This patch is a contribution to OpenSSL.
> 
> It offers an efficient and constant-time implementation of 1024-bit
> and 2048-bit Modular Exponentiation. When the patch is applied to the
> OpenSSL library, it accelerates RSA1024 (verify), RSA2048 (verify and
> sign), DSA1024 (verify and sign), DSA2048 (verify and sign), DH1024
> (GenKey, ComKey), DH2048 (GenKey, ComKey), SRP (server and client
> side)
> 
> This extends the patch offered in [1].
> 
> The implementation is based on the "Redundant Representation" method
> (see [2]), that can accelerate modular exponentiation on sufficiently
> wide SIMD architecture. It uses the soon-to-come AVX2 instructions,
> and is intended to run on the coming Intel(R) architecture Codename
> "Haswell".
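
For readers unfamiliar with the idea, here is a minimal Python sketch of
what a redundant representation looks like. It is an illustration only,
not the patch's code: the helper names (to_redundant, from_redundant,
mul_no_carry) and the choice of 36 digits of 29 bits for a 1024-bit
operand are assumptions made for the example. The point is that digits
are narrower than the 64-bit lane that holds them, so column sums can be
accumulated without propagating carries after every step.

```python
DIGIT_BITS = 29  # assumed digit width, for illustration only


def to_redundant(x, ndigits, bits=DIGIT_BITS):
    """Split a non-negative integer into ndigits digits of `bits` bits each."""
    mask = (1 << bits) - 1
    return [(x >> (bits * i)) & mask for i in range(ndigits)]


def from_redundant(digits, bits=DIGIT_BITS):
    """Recombine digits; also valid when digits are not normalized."""
    return sum(d << (bits * i) for i, d in enumerate(digits))


def mul_no_carry(a, b):
    """Schoolbook product with carries deferred.

    Each output column is a plain sum of digit products and may be far
    larger than 2**DIGIT_BITS; no carry is propagated here.
    """
    cols = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            cols[i + j] += ai * bj
    return cols


if __name__ == "__main__":
    x = (1 << 1024) - 1
    y = (1 << 1024) - 3
    a = to_redundant(x, 36)
    b = to_redundant(y, 36)
    cols = mul_no_carry(a, b)
    # The unnormalized columns still represent the exact product.
    print(from_redundant(cols) == x * y)
    # With 36 digits of 29 bits, every column of a plain product fits in 64 bits.
    print(max(cols) < (1 << 64))
```

Note that this covers a plain product only. A Montgomery multiplication
adds further products to each column, which is where the available
headroom becomes a constraint.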


There was no code attached, but that is of lesser relevance for the moment.
What I'd like to discuss is the following. Note that the multiplication
subroutine committed in RT#2850 uses only one loop, executed 9 times.
The original suggestion was to use two loops, 7+2, with a correction in
between to avoid overflow. In the committed code the overflow problem is
handled by correcting a smaller number of digits, but "in-line", i.e.
directly in the loop body. The rationale is that the loop underutilizes
computational resources, so the correction can be done without a negative
effect on performance. But then, since it's possible to correct for
overflow "in-line", it should be possible to implement even the 2048-bit
procedure with 29-bit digits. It would naturally take two corrections
per loop iteration, but the loop is twice as "heavy", so it should work
out just as well. The advantage is obviously a smaller number of digits
and consequently fewer loop revolutions, 71/74 to be specific.
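
To make the arithmetic concrete, here is a small Python sketch. It is
back-of-the-envelope only: the function names are made up for the
example, and the 28-bit alternative is my assumption (the figure 71/74
appears consistent with comparing 29-bit against 28-bit digits for a
2048-bit operand). It computes how many digits an operand needs and how
many digit products can be summed in a 64-bit lane before overflow
becomes possible.

```python
LANE_BITS = 64  # width of one accumulator lane (assumed: 64-bit SIMD lanes)


def digits_needed(operand_bits, digit_bits):
    """Number of digit_bits-wide digits needed to hold operand_bits bits."""
    return -(-operand_bits // digit_bits)  # ceiling division


def products_before_overflow(digit_bits, lane_bits=LANE_BITS):
    """How many digit products are guaranteed to fit in one lane.

    Each product of two digit_bits-wide digits is below 2**(2*digit_bits),
    so this many of them sum to less than 2**lane_bits.
    """
    return 1 << (lane_bits - 2 * digit_bits)


if __name__ == "__main__":
    for bits in (28, 29):
        print(bits,
              digits_needed(1024, bits),
              digits_needed(2048, bits),
              products_before_overflow(bits))
```

With 29-bit digits the headroom is 64 products per lane, against 256
with 28-bit digits, which is why the narrower margin has to be paid for
with extra corrections inside the loop.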

> References:
> [1] S. Gueron, V. Krasnov: "[PATCH] Efficient and side channel analysis
> resistant 512-bit and 1024-bit modular exponentiation for optimizing
> RSA1024 and RSA2048 on x86_64 platforms",
> http://rt.openssl.org/Ticket/Display.html?id=2582&user=guest&pass=guest
>
> [2] Shay Gueron, Vlad Krasnov, "Software Implementation of Modular
> Exponentiation, Using Advanced Vector Instructions Architectures",
> Proceedings of The International Workshop on the Arithmetic of Finite
> Fields (WAIFI 2012), LNCS 7369: 119-135 (2012).


