[note -- i changed the cc to rt because there's something preventing me
from posting to openssl-dev... and rt seems to be one way for me to get my
messages through.]

On Mon, 8 Dec 2003, Andy Polyakov wrote:

> > details and a patch are available at
> > <http://arctic.org/~dean/crypto/rsa.html>
>
> Being located in U.S. you have to comply with their export
> regulations:-( See "HOW TO CONTRIBUTE TO OpenSSL" paragraph in README.

right -- i'll take care of that when i update the patch.


> "- Transition from x87 FPU to MMX technology instructions or to SSE or
> SSE2 instructions that operate on MMX registers should be preceded by
> saving the state of the x87 FPU.

this depends on the ABI -- if there's callee-saved data in the FPU then it
needs to be saved.  but the x86 ELF ABI defines the FPU stack to be empty
on entry to functions, so there's nothing live to be saved.  see page 38
of <http://www.linuxbase.org/spec/refspecs/elf/abi386-4.pdf> for example.

unfortunately that ABI is pretty old... i wonder where xmm/mmx are
explicitly handled.  i'll ask around at the office (transmeta).


> Intel won't be able to guarantee sane results? I mean wouldn't it be
> more appropriate to use XMM anyway? As fulfilling first requirement
> would have undesirable impact on performance...

i'll go re-implement with xmm and see what happens to the perf.

the trick with xmm regs is that i'm only using 64 bits of each register,
and opteron, pentium-m and efficeon implement MMX/SSE2 with a pair of
64-bit units, which generally means issuing two 64-bit ops for every
128-bit SSE2 op.  it's possible throughput is compromised by having to
issue pairs of ops, one of which is always dead.  (efficeon is generally
smart enough to kill an op when it can prove it's dead, but i'm not sure
the others can do it.)
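
to make the "one half is always dead" point concrete, here's a minimal
sketch in C intrinsics (not the perlasm from the patch).  the function name
mul32x32_low64 is made up for illustration, and i'm assuming a compiler that
ships <emmintrin.h>; without SSE2 it falls back to plain C.  the 32x32->64
multiply is done in an xmm register, but only the low 64-bit lane carries
data, so the upper lane's multiply is wasted work:

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* hypothetical helper: 32x32->64 unsigned multiply done in an xmm register */
static uint64_t mul32x32_low64(uint32_t a, uint32_t b)
{
#if defined(__SSE2__)
    /* load each operand into bits 0..31; the rest of the register is zero */
    __m128i va = _mm_cvtsi32_si128((int)a);
    __m128i vb = _mm_cvtsi32_si128((int)b);

    /* pmuludq: multiplies the low 32 bits of BOTH 64-bit lanes.
     * lane 0 holds a*b; lane 1 computes 0*0 and is never read (dead). */
    __m128i prod = _mm_mul_epu32(va, vb);

    /* keep only the live low 64 bits */
    uint64_t out;
    _mm_storel_epi64((__m128i *)&out, prod);
    return out;
#else
    /* no SSE2 at compile time: plain C fallback */
    return (uint64_t)a * b;
#endif
}
```

whether the dead upper half actually costs anything depends on the
implementation, so this only shows where the unused lane comes from.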


>       &picmeup("eax","CRYPTO_cpuid_value");
>       &test(&DWP(0,"eax","",0),$cpuid_sse2);

cool -- i'll make that change.


> I haven't made up my mind about cpuid.c yet... At least it fails to
> compile with -fPIC... A.

yeah i'm sure the SIGILL and pthread_once stuff is playing havoc here.
(curse intel for requiring a faulting test to determine if SSE is
enabled.)

do you know if there's any method i can rely on to be called when PIC code
is loaded?

thanks
-dean

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]