Okay, I've finally come up with a patch that enables the OpenSSL asm
code in a way which is generically controllable and extends to other
code which may want to enable CPU-specific optimizations (e.g. libgmp).

The patch is at


It's based on a patch originally by Mike Silbersack <[EMAIL PROTECTED]>,
but the MACHINE_CPU stuff is mine and he shouldn't be blamed for it.
Speedups are on the order of 3x-5x depending on the algorithm, for
those which have asm cores available.  Even the 386 should benefit
from significant speed improvements, although I haven't tested this
patch on a 386 or 486.

It looks like the OpenSSL alpha asm code is broken (using the vendor
build process doesn't build it either) - sorry, folks.

The patch is fairly self-explanatory, and introduces a new variable
called MACHINE_CPU which contains an unordered list of the CPU
generations which we would like optimizations for, if
present. Basically, this should be set to your CPU type plus all
backwards-compatible revisions: e.g. MACHINE_CPU=i686 i586 i486 i386.
I prefer doing it this way (MACHINE_CPU being a list) since it greatly
simplifies the makefiles:

For example, OpenSSL has Pentium ASM code for several algorithm cores
in libcrypto.  This code is what we want to compile on all "pentium
class and above" CPUs (Pentium, PPro, Pentium II/III, AMD, ...), but
there is also code for 686-class CPUs, and someday there may be
AMD-optimized asm code, etc.  If MACHINE_CPU were only a single word
containing the exact CPU generation we intend to run on (e.g. "k6"),
then the makefile tests for whether to use the pentium code would need
to check for the names of all pentium-compatible CPUs and above, and if
we miss one, or the user uses a name we don't support, then they
won't get any optimization.  Doing it as a list (i.e. MACHINE_CPU is a
list of preferences or features we'd like) means that we can easily
pick the best code to use based on what is available, and makes it
more robust against mistakes.
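
To make the difference concrete, here is a rough sketch of the kind of
test a makefile could do with the list form.  This is not lifted from
the patch: ASM_FLAVOR is a made-up variable name used only for
illustration, and the default value of MACHINE_CPU is just an example.
It relies only on the standard make(1) :M modifier, which keeps the
words of a variable that match a pattern, and on empty().

```make
# Illustrative fragment only; ASM_FLAVOR is a hypothetical name and
# does not appear in the patch.
MACHINE_CPU?=	i686 i586 i486 i386

# Each test only asks "is this word in the list?", so the makefile
# never needs to know the name of every compatible CPU.  The list is
# unordered, so the preference order is decided here, best first.
.if !empty(MACHINE_CPU:Mi686)
ASM_FLAVOR=	i686
.elif !empty(MACHINE_CPU:Mi586)
ASM_FLAVOR=	i586
.else
ASM_FLAVOR=	none
.endif
```

With a single-word variable the i586 test above would instead have to
be a chain of string comparisons against every CPU name that happens
to be pentium-compatible, which is exactly the part that is easy to
get wrong.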

I'm not sure whether the way I've introduced MACHINE_CPU into sys.mk
is the best way to do it, and whether make(1) also needs to be taught
about it.

I'd like to get this committed ASAP, please review.

