> For public reference. To a certain degree it's apparent from the context,
> but the report is about the difference in RSA sign performance between
> the OpenSSL SPARC T4 Montgomery multiplication module and the
> corresponding Solaris T4 module, with OpenSSL being significantly slower.
> The least one can say [at this point] is that the problem appears to be
> "multi-layer", in the sense that different factors are in play. The first
> question in line is how come the same code performs so differently on
> Solaris and Linux. OpenSSL on Linux delivers ~70% more RSA1024 signs than
> on Solaris (if we assume that both systems operate at the same frequency,
> which is supported by the fact that verify results were virtually
> identical).

Another question is about the suitability of the floating-point fcmps and
fmovd instructions. These are used to pick a vector from the powers table
in a cache-timing neutral manner. I have to admit I haven't done due
research into whether they are the optimal choice in this context, or
whether we would be better off using the VIS fand and for (logical AND and
OR) instructions for this purpose. As the instructions in question are
floating-point, they might be executed by a *shared* FPU and not by the
individual core [which might be disruptive for the pipeline?]...
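
To make the "cache-timing neutral" pick concrete, here is a minimal sketch
in portable C of the mask-and-accumulate approach, which is the integer
analogue of what fand/for would do. It is not the code from the T4 module;
the name ct_select and the flat table layout are assumptions made purely
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the OpenSSL source): copy entry `idx`
 * of a powers table into `out`. The table is assumed to hold `entries`
 * vectors of `words` 64-bit words each, stored back to back. Every
 * entry is read regardless of `idx`, so the memory access pattern does
 * not depend on the secret index.
 */
static void ct_select(uint64_t *out, const uint64_t *table,
                      size_t entries, size_t words, size_t idx)
{
    size_t i, j;

    for (j = 0; j < words; j++)
        out[j] = 0;

    for (i = 0; i < entries; i++) {
        /* diff is zero only when i == idx */
        uint64_t diff = (uint64_t)(i ^ idx);
        /* mask is all-ones when diff == 0, all-zeros otherwise */
        uint64_t mask = ((diff | (0 - diff)) >> 63) - 1;

        /* AND with the mask, OR into the result: no data-dependent branch */
        for (j = 0; j < words; j++)
            out[j] |= table[i * words + j] & mask;
    }
}
```

The compare-and-conditional-move variant computes the same result; the
open question above is only which instruction mix the T4 pipeline handles
better, and that would have to be measured.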
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [email protected]
