> I agree with most of that. However, based on benchmarks on my desktop (a
> Core 2 Duo E6400) the 32-bit x86 assembler mont exp implementation in
> OpenSSL seems a _lot_ slower than my GPU.

Of course your CPU is a lot slower at performing 2048 signs, but it's a lot faster at performing one. I mean, if you simply don't get more than one sign request within 240 ms and you insist on always using the GPU, you'd have to ask it to perform 1 real and 2047 bogus signs. You'd then have the GPU spending 240 ms on one sign instead of the CPU spending a fraction of a millisecond, resulting in lower throughput.
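To put rough numbers on that trade-off, here is a minimal sketch in Python. The 240 ms batch time and the batch size of 2048 come from the figures in this thread; the 0.5 ms CPU cost per sign is purely an assumption standing in for "a fraction of a millisecond", and the function names are made up for illustration.

```python
GPU_BATCH_MS = 240.0    # from the thread: one GPU batch takes about 240 ms
GPU_BATCH_SIZE = 2048   # from the thread: signs per GPU batch
CPU_SIGN_MS = 0.5       # ASSUMPTION: stand-in for "a fraction of a millisecond"


def gpu_time_ms(requests):
    """Time to serve `requests` signs on the GPU; partial batches are padded."""
    batches = -(-requests // GPU_BATCH_SIZE)  # ceiling division
    return batches * GPU_BATCH_MS


def cpu_time_ms(requests):
    """Time to serve `requests` signs one after another on the CPU."""
    return requests * CPU_SIGN_MS


def break_even_requests():
    """Smallest number of queued requests at which one GPU batch beats the CPU.

    Returns None if the CPU wins for every size up to a full batch.
    """
    for n in range(1, GPU_BATCH_SIZE + 1):
        if cpu_time_ms(n) > gpu_time_ms(n):
            return n
    return None


if __name__ == "__main__":
    print(gpu_time_ms(1), cpu_time_ms(1))   # 240.0 0.5
    print(break_even_requests())            # 481
```

Under these assumed numbers the GPU only pays off once more than 480 requests are queued within one 240 ms window; with a different CPU cost per sign the threshold moves accordingly.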

> I suppose my question is whether anyone has considered providing an
> alternative, asynchronous, interface to the OpenSSL crypto libraries. The
> current API is not a suitable abstraction for a crypto device which
> exhibits latency comparable to or greater than its execution time, unless
> you're prepared to fire off a lot of threads in the client. Either batched
> submission or callbacks would address this.
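For illustration only, batched submission with completion callbacks might look roughly like the sketch below. This is not an OpenSSL interface: `BatchSigner`, `submit`, `flush` and `toy_batch_modexp` are hypothetical names, and the toy modular exponentiation with a tiny textbook modulus merely stands in for whatever the device would compute.

```python
class BatchSigner:
    """Hypothetical front end that queues requests and runs them in batches."""

    def __init__(self, batch_fn, batch_size):
        self._batch_fn = batch_fn      # processes a list of inputs in one go
        self._batch_size = batch_size  # submit to the device once this many queue up
        self._pending = []             # (message, callback) pairs awaiting a batch

    def submit(self, message, callback):
        """Queue one request; `callback(result)` fires when its batch completes."""
        self._pending.append((message, callback))
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        """Run whatever is queued, even if the batch is not full."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        results = self._batch_fn([message for message, _ in batch])
        for (_, callback), result in zip(batch, results):
            callback(result)


def toy_batch_modexp(messages, d=17, n=3233):
    """Stand-in for the device: one modular exponentiation per message."""
    return [pow(m, d, n) for m in messages]


if __name__ == "__main__":
    done = []
    signer = BatchSigner(toy_batch_modexp, batch_size=2)
    signer.submit(65, done.append)   # queued, nothing computed yet
    signer.submit(66, done.append)   # batch is full, both callbacks fire
    signer.submit(67, done.append)   # queued again
    signer.flush()                   # e.g. on a timer, to bound latency
    print(len(done))                 # 3
```

A real design would also need a timeout-driven flush and some thread or event-loop integration, which is where much of the complexity would likely sit.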

I understand the question, and once again I don't mean to discourage anybody from looking for the answer. I'm just saying that the solution is likely to be more complex than anticipated, and that the outcome might be limited to specific kinds of applications (e.g. only multi-threaded ones, as opposed to "fork-on-accept" ones).

A.
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [EMAIL PROTECTED]
