Re: [openssl.org #2045] [PATCH] Use Intel AES-NI automatically where available.

2010-07-20 Thread David Woodhouse via RT
On Mon, 2009-09-14 at 23:13 +0200, David Woodhouse via RT wrote:
> For now, let's at least address the major disadvantage of the engine,
> which is that it doesn't even get _used_ unless someone registers it.

Updated patch, following the discussion in PR#2305:

On Tue, 2010-07-20 at 00:59 +0200, Stephen Henson via RT wrote:
> The change suggested in PR#2045 has problems if the ENGINE_add() call
> fails: it ends up adding a reference to a freed up ENGINE which is
> likely to subsequently contain garbage and generally spoil your whole
> day.
> 
> This will happen if an ENGINE with the same name is added multiple
> times, for example different libraries, in your case curl and mod_ssl.
> ...
> 1. The patch in PR#2045 should check the return value of ENGINE_add()
> so you now have:
> 
> if (ENGINE_add(toadd))
>  ENGINE_register_complete(toadd);

-- 
David Woodhouse                            Open Source Technology Centre
david.woodho...@intel.com  Intel Corporation

Index: crypto/engine/eng_aesni.c
===
RCS file: /home/dwmw2/openssl-cvs/openssl/crypto/engine/eng_aesni.c,v
retrieving revision 1.7
diff -u -p -r1.7 eng_aesni.c
--- crypto/engine/eng_aesni.c	22 May 2010 00:20:42 -	1.7
+++ crypto/engine/eng_aesni.c	20 Jul 2010 08:11:06 -
@@ -104,7 +104,8 @@ void ENGINE_load_aesni (void)
 	ENGINE *toadd = ENGINE_aesni();
 	if (!toadd)
 		return;
-	ENGINE_add (toadd);
+	if (ENGINE_add (toadd))
+		ENGINE_register_complete (toadd);
 	ENGINE_free (toadd);
 	ERR_clear_error ();
 #endif
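
The failure mode described above can be sketched with a toy reference-counted registry. This is not the OpenSSL ENGINE API: every name below (toy_engine, toy_add, toy_free, toy_load) is hypothetical, and the only behaviour assumed is what the thread describes, namely that adding a second engine with the same id fails and that the subsequent free then destroys the object.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical stand-ins, NOT the OpenSSL ENGINE API. */
#define MAX_ENGINES 8

typedef struct toy_engine {
    const char *id;
    int refs;         /* reference count */
    int registered;   /* set once the engine is made a default provider */
} toy_engine;

toy_engine *toy_list[MAX_ENGINES];
int toy_count;

toy_engine *toy_new(const char *id)
{
    toy_engine *e = calloc(1, sizeof(*e));
    assert(e != NULL);
    e->id = id;
    e->refs = 1;      /* the caller's own reference */
    return e;
}

/* Returns 1 on success, 0 if the id is already listed or the list is full. */
int toy_add(toy_engine *e)
{
    for (int i = 0; i < toy_count; i++)
        if (strcmp(toy_list[i]->id, e->id) == 0)
            return 0;
    if (toy_count == MAX_ENGINES)
        return 0;
    toy_list[toy_count++] = e;
    e->refs++;        /* the list now holds a reference too */
    return 1;
}

void toy_free(toy_engine *e)
{
    if (--e->refs == 0)
        free(e);
}

/* Same shape as the patched loader: register only if the add succeeded,
 * and do so before the local reference is dropped. */
int toy_load(const char *id)
{
    toy_engine *toadd = toy_new(id);
    int ok = toy_add(toadd);
    if (ok)
        toadd->registered = 1;
    toy_free(toadd);  /* destroys the object when the add failed */
    return ok;
}
```

Calling toy_load("aesni") twice, as two libraries in one process might, registers the first object and lets the second be freed without ever being touched again. Registering unconditionally after the free, as the first version of the patch did, would write to the freed object on the second call.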


Re: [openssl.org #2045] [PATCH] Use Intel AES-NI automatically where available.

2010-03-28 Thread Jean-Marc Desperrier

 On 26/03/2010 18:31, Andy Polyakov wrote:

>> My patch (unapplied for 6 months now) would at least fix the problem of
>> the AESNI engine not being used automatically,
>
> The reason for low priority is that the code is in development, lack of
> hardware...

Hum? Maybe the OpenSSL team doesn't have the hardware to test the code
yet, but there is a good number of Arrandale and Clarkdale CPUs out now.

Asus has been selling notebooks with Arrandale since January.

__
OpenSSL Project http://www.openssl.org
Development Mailing List   openssl-dev@openssl.org
Automated List Manager   majord...@openssl.org


Re: [openssl.org #2045] [PATCH] Use Intel AES-NI automatically where available.

2010-03-27 Thread David Woodhouse
On Fri, 2010-03-26 at 18:31 +0100, Andy Polyakov wrote:
> > My question was about the inconsistency between, for example,
> > SSE-optimised and AESNI-optimised functions. Both are implemented as
> > perlasm; that's not relevant. What _is_ relevant, however, is that the
> > SSE optimisations end up in the 'core' AES_encrypt() function which is
> > tested by 'openssl speed aes', while the AESNI version is in an engine
> > and isn't even used by default unless the application explicitly asks
> > for it.
> > 
> > My patch (unapplied for 6 months now) would at least fix the problem of
> > the AESNI engine not being used automatically,
> 
> The reason for low priority is that the code is in development, 

The current state is actually fairly stable; it hasn't had a significant
change since last May. There's more development to come (for CCM, other
forms of combined MAC+ENC operation as cache-locality optimisations,
etc.) but all that depends on other changes in OpenSSL -- so what we
have at the moment is going into active use as it is.

Hence my backporting it to 0.9.8 and 1.0.0 and making it available to
users and Linux distributions. I'd be happy with our current code
getting into those branches (it's just a new engine; no ABI changes¹)
and letting the new stuff happen in later 1.0.x or 1.1.0.

In the meantime, it does kind of suck that I can't even say 'just use
OpenSSL HEAD; it works there'.

> lack of hardware...

The hardware is coming, and it's going to be common. I'm in the
unenviable position that I need to be enabling this _early_ in
slow-moving OS distributions such as RHEL6, and other products with a
lifetime measured in years, so that it'll work properly on the hardware
when it _does_ arrive. That's why I'm working on the backports and
getting them into distributions right now, rather than waiting.

If it's an issue that nobody in the core OpenSSL team has appropriate
hardware RIGHT NOW, then we may perhaps be able to do something about
that... let me know if you want me to follow that up. No promises,
though.

> > but I still don't quite understand why it should be an engine while
> > SSE support is not. I'd like to understand the logic.

 

Thank you very much for the explanation. I'd read through the previous
threads from the time it was originally submitted, but had failed to
pick out the important details. It makes a lot more sense to me now.

-- 
David Woodhouse                            Open Source Technology Centre
david.woodho...@intel.com  Intel Corporation

¹ That is, as long as I fudge around the 64-bit OPENSSL_ia32_cap issue,
  there are no ABI changes :) 
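
The footnote does not spell out the issue. One plausible reading, offered here as an assumption, is that the AES-NI feature flag is reported in CPUID leaf 1, ECX bit 25, so a capability word that only carries the 32 EDX bits has no room for it. The sketch below assumes a 64-bit word with EDX in the low half and ECX in the high half; the function names are hypothetical and do not come from OpenSSL.

```c
#include <stdint.h>

/* CPUID leaf 1 reports AES-NI support in ECX bit 25. */
#define CPUID1_ECX_AESNI (1u << 25)

/* Assumed layout for this sketch: EDX in bits 0..31, ECX in bits 32..63. */
uint64_t toy_pack_cap(uint32_t ecx, uint32_t edx)
{
    return ((uint64_t)ecx << 32) | edx;
}

/* Under the assumed layout, AES-NI lands at bit 32 + 25 = 57. */
int toy_has_aesni(uint64_t cap)
{
    return (int)((cap >> (32 + 25)) & 1);
}
```

Truncating such a word to 32 bits silently drops the flag, which is why widening the variable would be visible to anything that already reads it.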



Re: [openssl.org #2045] [PATCH] Use Intel AES-NI automatically where available.

2010-03-26 Thread Andy Polyakov
PM,

> Though I am not a member of the OpenSSL team, I totally agree with you. 
> As for the AES, the Westmere CPUs have also a new instruction for the 
> GHASH (pclmulqdq / _mm_clmulepi64_si128).

It's not an instruction *for* GHASH, it's an instruction that among
other things can be used for GHASH.
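
To make the distinction concrete: pclmulqdq computes a carry-less product of two 64-bit operands, giving a 128-bit result, and GHASH is only one consumer of that operation. A minimal portable sketch of the same arithmetic follows. The names clmul128 and clmul64 are made up for illustration, and the loop is neither fast nor constant-time.

```c
#include <stdint.h>

/* 128-bit result of a 64x64-bit carry-less multiplication. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} clmul128;

/* Schoolbook multiplication in which partial products are combined with
 * XOR instead of addition, so no carries propagate between bit positions. */
clmul128 clmul64(uint64_t a, uint64_t b)
{
    clmul128 r = { 0, 0 };

    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
            r.lo ^= a << i;
            if (i > 0)
                r.hi ^= a >> (64 - i);  /* bits shifted out of the low half */
        }
    }
    return r;
}
```

For example, clmul64(3, 3) yields 5 rather than 9, because (x + 1)^2 = x^2 + 1 when coefficients are taken modulo 2.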

> This as well is only available 
> as intrinsic or in native assembler.
> 
> So, when I offered some weeks ago a contribution regarding the GHASH for 
> the GCM, (now with a fallback from pclmulqdq to SSE2 to native C), I was 
> instructed that (at least inline) assembler or intrinsics are not an 
> option for OpenSSL. 
> 
>> Inline assembler (or exotic intrinsics) is not considered
>> as viable option for MMX/SSE (or any code bigger than couple of
>> instructions), perlasm code is.

Do you think it's groundless and totally unreasonable? Well, it's more
of a rhetorical question...

> As all major compilers for Intel CPUs support intrinsics and,

Well, I have none of them. Of course that's not *completely* true, but
OpenSSL does not assume that the availability of such compilers is
universal. And I know that's appreciated.

> if used 
> correctly, optimize to the same instructions as direct assembler,

It's just that we have seen one too many compiler bugs, have observed
how performance varies among compiler versions, how compilers do a poor
job allocating registers (and just a poor job in general)...

> IMHO 
> these policies should be reconsidered to keep OpenSSL competitive.

??? Are OpenSSL assembler modules slower than compiler generated code?
Is perlasm less portable than intrinsics?

> For good reasons perlasm is not an option for a company like Intel.

??? OpenSSL is not Intel...

> To get 
> a solution, I now use a self-patched version of OpenSSL with intrinsics 
> which fulfills my and my customer's requirements.

Then fight for it (but not in this thread!). Your SSE2 code was
effectively dismissed, because it was slower. At least it was my
conclusion and as you didn't refute it, I assume it was correct. Well, I
still wouldn't actually accept intrinsic-based code, but ideas could
have been used in assembler. Cheers. A.


Re: [openssl.org #2045] [PATCH] Use Intel AES-NI automatically where available.

2010-03-26 Thread Andy Polyakov
> My question was about the inconsistency between, for example,
> SSE-optimised and AESNI-optimised functions. Both are implemented as
> perlasm; that's not relevant. What _is_ relevant, however, is that the
> SSE optimisations end up in the 'core' AES_encrypt() function which is
> tested by 'openssl speed aes', while the AESNI version is in an engine
> and isn't even used by default unless the application explicitly asks
> for it.
> 
> My patch (unapplied for 6 months now) would at least fix the problem of
> the AESNI engine not being used automatically,

The reason for low priority is that the code is in development, lack of
hardware...

> but I still don't quite
> understand why it should be an engine while SSE support is not. I'd like
> to understand the logic.

The original reason for suggesting the engine was the alignment requirement
for the key schedule imposed by the submitter [from Intel]. The requirement
couldn't (and can't) be tolerated in common code. Another reason for
favoring an engine is the option to implement algorithms that otherwise
wouldn't make sense to implement. The simplest example in this particular
case is ECB. There's no point implementing a dedicated ECB subroutine in
general code (it won't be any faster), while a dedicated ECB subroutine
specifically for AESNI delivers a 3x performance improvement (over a
non-dedicated procedure using a single-block AESNI subroutine). Another
such example is [Galois] counter mode (which is under development).
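
As an illustration of the alignment point: code that needs an aligned key schedule but cannot dictate where its caller places the key structure can over-allocate and round the pointer up inside its own storage. The sketch below assumes a 16-byte requirement and a 240-byte schedule (room for fifteen 16-byte round keys). The type and function names are hypothetical, not taken from the engine source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ALIGN    16u   /* assumed alignment requirement, in bytes */
#define TOY_SCHEDULE 240u  /* fifteen 16-byte round keys */

/* Hypothetical container: TOY_ALIGN - 1 spare bytes guarantee that an
 * aligned TOY_SCHEDULE-byte region fits wherever the struct happens to live. */
typedef struct {
    unsigned char raw[TOY_SCHEDULE + TOY_ALIGN - 1];
} toy_key_storage;

/* Round a pointer up to the next TOY_ALIGN boundary. */
void *toy_align_up(void *p)
{
    uintptr_t u = (uintptr_t)p;

    u = (u + (TOY_ALIGN - 1)) & ~(uintptr_t)(TOY_ALIGN - 1);
    return (void *)u;
}

/* Return the aligned schedule region inside the storage. */
unsigned char *toy_key_schedule(toy_key_storage *s)
{
    unsigned char *k = toy_align_up(s->raw);

    assert((size_t)(k - s->raw) < TOY_ALIGN);
    return k;
}
```

An engine can hide this kind of fix-up behind its own context, whereas a key structure that applications already allocate themselves offers no such guarantee.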

> Should we be moving the SSE optimisations out into their own engine too?

It's an option... If there was an inter-procedural or "super-procedural"
(something similar to the above-mentioned ECB) optimization specific to SSE,
it would definitely be the case. But as of now, SSE code is limited to the
lowest-level leaf functions, and lifting it into an engine is hard
to justify. Not to mention that it would result in code duplication
(meaning more maintenance, as if there wasn't enough already). A.
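
The structural difference between a generic ECB loop and a dedicated one can be sketched as follows. The thread does not say where the 3x comes from; a plausible explanation, stated here as an assumption, is that ECB blocks are independent, so a dedicated routine can keep several of them in flight at once instead of finishing one before starting the next. The block transform below is a placeholder XOR, not AES, all names are hypothetical, and the four-way grouping is an arbitrary choice.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BLOCK 16

/* Placeholder single-block primitive; NOT a real cipher. */
void toy_encrypt_block(const uint8_t *in, uint8_t *out, const uint8_t *key)
{
    for (int i = 0; i < TOY_BLOCK; i++)
        out[i] = in[i] ^ key[i];
}

/* Generic ECB: one call to the single-block primitive per block. */
void toy_ecb_generic(const uint8_t *in, uint8_t *out, size_t nblocks,
                     const uint8_t *key)
{
    for (size_t b = 0; b < nblocks; b++)
        toy_encrypt_block(in + b * TOY_BLOCK, out + b * TOY_BLOCK, key);
}

/* Dedicated ECB: four independent blocks advance through each step together. */
void toy_ecb_dedicated(const uint8_t *in, uint8_t *out, size_t nblocks,
                       const uint8_t *key)
{
    size_t b = 0;

    for (; b + 4 <= nblocks; b += 4) {
        const uint8_t *p = in + b * TOY_BLOCK;
        uint8_t *q = out + b * TOY_BLOCK;

        for (int i = 0; i < TOY_BLOCK; i++) {
            q[i]                 = p[i]                 ^ key[i];
            q[i + TOY_BLOCK]     = p[i + TOY_BLOCK]     ^ key[i];
            q[i + 2 * TOY_BLOCK] = p[i + 2 * TOY_BLOCK] ^ key[i];
            q[i + 3 * TOY_BLOCK] = p[i + 3 * TOY_BLOCK] ^ key[i];
        }
    }
    /* Tail: fewer than four blocks remain, so process them one at a time. */
    for (; b < nblocks; b++)
        toy_encrypt_block(in + b * TOY_BLOCK, out + b * TOY_BLOCK, key);
}
```

Both routines produce identical output. Only the second shape gives an assembler implementation room to interleave the per-round instructions of several blocks.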


RE: [openssl.org #2045] [PATCH] Use Intel AES-NI automatically where available.

2010-03-26 Thread David Woodhouse
On Thu, 2010-03-25 at 17:57 +0100, PMHager wrote:
> 
> As all major compilers for Intel CPUs support intrinsics and, if used 
> correctly, optimize to the same instructions as direct assembler, IMHO 
> these policies should be reconsidered to keep OpenSSL competitive.
> 
> For good reasons perlasm is not an option for a company like Intel. To get 
> a solution, I now use a self-patched version of OpenSSL with intrinsics 
> which fulfills my and my customer's requirements. 

I'm not sure I understand you. You seem to be talking about the merits
of using inline assembler ('__asm__()' statements inside C code) vs.
external assembler-only files which are processed by perl and then
assembled (and which by necessity contain whole functions which are
called from the C code).

I have no interest in that debate. I'm quite happy using the perlasm
approach. It's a PITA sometimes, but I see the portability advantages of
it, and OpenSSL is a highly portable project.

My question was about the inconsistency between, for example,
SSE-optimised and AESNI-optimised functions. Both are implemented as
perlasm; that's not relevant. What _is_ relevant, however, is that the
SSE optimisations end up in the 'core' AES_encrypt() function which is
tested by 'openssl speed aes', while the AESNI version is in an engine
and isn't even used by default unless the application explicitly asks
for it.

My patch (unapplied for 6 months now) would at least fix the problem of
the AESNI engine not being used automatically, but I still don't quite
understand why it should be an engine while SSE support is not. I'd like
to understand the logic.

Should we be moving the SSE optimisations out into their own engine too?

-- 
David Woodhouse                            Open Source Technology Centre
david.woodho...@intel.com  Intel Corporation



RE: [openssl.org #2045] [PATCH] Use Intel AES-NI automatically where available.

2010-03-25 Thread PMHager
Though I am not a member of the OpenSSL team, I totally agree with you. 
As for the AES, the Westmere CPUs have also a new instruction for the 
GHASH (pclmulqdq / _mm_clmulepi64_si128). This as well is only available 
as intrinsic or in native assembler. 

So, when I offered some weeks ago a contribution regarding the GHASH for 
the GCM, (now with a fallback from pclmulqdq to SSE2 to native C), I was 
instructed that (at least inline) assembler or intrinsics are not an 
option for OpenSSL. 

> Inline assembler (or exotic intrinsics) is not considered
> as viable option for MMX/SSE (or any code bigger than couple of
> instructions), perlasm code is.

As all major compilers for Intel CPUs support intrinsics and, if used 
correctly, optimize to the same instructions as direct assembler, IMHO 
these policies should be reconsidered to keep OpenSSL competitive.

For good reasons perlasm is not an option for a company like Intel. To get 
a solution, I now use a self-patched version of OpenSSL with intrinsics 
which fulfills my and my customer's requirements.

Peter-Michael

--

Peter-Michael Hager - acm senior - HAGER-ELECTRONICS GmbH - Germany


On Mon, 2009-09-14 and Thu, 2010-03-25 David Woodhouse via RT wrote:
> I'm a little confused about the way Intel AES-NI is supported in OpenSSL
> HEAD.
> 
> This is just a feature of new CPUs, like SSE is. Yet SSE support is
> directly included in the normal assembly routines for x86, while AES-NI
> is implemented separately as an engine. Why is that?
> 
> Are we slowly moving _all_ the 'special' implementations to engines, and
> uncluttering the core implementations? Or are we just being
> inconsistent? Or is there some distinction between the two (SSE/AESNI)
> that I'm missing, which makes it sensible to treat them differently?
> 
> For now, let's at least address the major disadvantage of the engine,
> which is that it doesn't even get _used_ unless someone registers it.
> 
> diff --git a/crypto/engine/eng_aesni.c b/crypto/engine/eng_aesni.c
> index 2a997ca..91fb5b8 100644
> --- a/crypto/engine/eng_aesni.c
> +++ b/crypto/engine/eng_aesni.c
> @@ -106,6 +106,7 @@ void ENGINE_load_aesni (void)
> return;
> ENGINE_add (toadd);
> ENGINE_free (toadd);
> +   ENGINE_register_complete (toadd);
> ERR_clear_error ();
>  #endif
>  }




Re: [openssl.org #2045] [PATCH] Use Intel AES-NI automatically where available.

2010-03-25 Thread David Woodhouse
On Mon, 2009-09-14 at 23:13 +0200, David Woodhouse via RT wrote:
> I'm a little confused about the way Intel AES-NI is supported in OpenSSL
> HEAD.
> 
> This is just a feature of new CPUs, like SSE is. Yet SSE support is
> directly included in the normal assembly routines for x86, while AES-NI
> is implemented separately as an engine. Why is that?
> 
> Are we slowly moving _all_ the 'special' implementations to engines, and
> uncluttering the core implementations? Or are we just being
> inconsistent? Or is there some distinction between the two (SSE/AESNI)
> that I'm missing, which makes it sensible to treat them differently?
> 
> For now, let's at least address the major disadvantage of the engine,
> which is that it doesn't even get _used_ unless someone registers it.
> 
> diff --git a/crypto/engine/eng_aesni.c b/crypto/engine/eng_aesni.c
> index 2a997ca..91fb5b8 100644
> --- a/crypto/engine/eng_aesni.c
> +++ b/crypto/engine/eng_aesni.c
> @@ -106,6 +106,7 @@ void ENGINE_load_aesni (void)
> return;
> ENGINE_add (toadd);
> ENGINE_free (toadd);
> +   ENGINE_register_complete (toadd);
> ERR_clear_error ();
>  #endif
>  }

Ping?

-- 
dwmw2
