Chris Zimman wrote:
>
> After my first round of adding nCipher support to SSLeay, I'm going to start
> working on a new hardware API for OpenSSL.
>
Just a quick check, you aren't in the US are you?
> I already have some ideas about what kinds of things I want to do, but I'd like
> some feedback from others.
>
> Some potential issues I see with OpenSSL:
>
> (1) Completely lacks any sort of hardware API
>
> When I added the nCipher support, it basically amounted to some stub code
> interfaces from the guys at nCipher, and then providing callbacks for each
> library function that I wanted done in hardware (RNG, mod_exp/mod_exp_crt,
> etc.). While this was fine for a once off type of deal, it wouldn't work
> very well at all as support for different or new hardware is desired. Plus
> it means that the codebase becomes littered with platform-specific calls.
>
Depends. RSA can be handled fairly cleanly. You can replace the high
level PKCS#1 RSA functions or the lower level math operations. You can
then install this as the default RSA method and RSA calls from then on
get redirected to the other code.
If you handle this right the application needs to call some function to
initialise the default RSA method (hardware_init() say) and is otherwise
unchanged.
This can be done without changing the library at all.
Ideally it should have things like rsa_sign/rsa_verify callbacks too but
the current system is usable.
Unfortunately analogous stuff can't currently be done with DSA or DH.
What we should have is a DSA_METHOD and a DH_METHOD as well.
If this thing is to be handled totally transparently then you'd probably
need to have some way for OpenSSL to transparently load a shared lib
cross platform.
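
The method-replacement idea described above for RSA can be sketched roughly
as follows. This is only an illustration under stated assumptions: every name
here (TOY_RSA_METHOD, toy_set_default_method, toy_rsa_public_op, hw_mod_exp)
is hypothetical and is not the OpenSSL API, the "hardware" is simulated by a
counter, and the arithmetic only works for toy-sized operands. hardware_init()
is the one call the application makes, as suggested above.

```c
#include <stddef.h>

/* Hypothetical method table: NOT the real OpenSSL RSA method structure. */
typedef struct toy_rsa_method {
    const char *name;
    /* Low-level math operation: returns base^exp mod m. */
    unsigned long (*mod_exp)(unsigned long base, unsigned long exp,
                             unsigned long m);
} TOY_RSA_METHOD;

/* Software square-and-multiply; toy-sized operands only (no overflow guard). */
static unsigned long sw_mod_exp(unsigned long base, unsigned long exp,
                                unsigned long m)
{
    unsigned long result;

    if (m == 0)
        return 0;
    result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Counts how often the simulated hardware path was taken. */
int hw_calls = 0;

/* Stand-in for a call out to an accelerator (assumption: simulated). */
static unsigned long hw_mod_exp(unsigned long base, unsigned long exp,
                                unsigned long m)
{
    hw_calls++;
    return sw_mod_exp(base, exp, m);
}

static const TOY_RSA_METHOD sw_method = { "software", sw_mod_exp };
static const TOY_RSA_METHOD hw_method = { "hardware (simulated)", hw_mod_exp };

/* The library-wide default; all RSA calls are dispatched through it. */
static const TOY_RSA_METHOD *default_method = &sw_method;

void toy_set_default_method(const TOY_RSA_METHOD *method)
{
    default_method = method;
}

/* The only call the application adds; the rest of its code is unchanged. */
void hardware_init(void)
{
    toy_set_default_method(&hw_method);
}

/* A "high level" operation that never needs to know who does the math. */
unsigned long toy_rsa_public_op(unsigned long msg, unsigned long e,
                                unsigned long n)
{
    return default_method->mod_exp(msg, e, n);
}
```

An application would call hardware_init() once at startup and then keep
calling toy_rsa_public_op() exactly as before. A DSA or DH equivalent would
presumably need the same kind of table.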
> (2) Doesn't have the capability internally to support calls out to hardware in a
> nonblocking fashion
>
Things like PKCS#11 don't allow this either.
> The hardware from nCipher, at least, is very parallel internally, and is
> designed to efficiently support a large number of simultaneous transactions.
> The initial latency from submission to receipt can be high in relative
> terms. With my rev1 for SSLeay, the only way to really see the benefit of
> hardware is to use a large thread pool, and have a thread per
> SSL_connect()/accept().
>
> As an example metric, a (we'll say non-optimized for the purpose of
> discussion) server running on an UltraSPARC will quickly get the CPU pegged
> if it's being hammered with SSL connections. When the nCipher hardware is
> employed, the machine will be able to handle just about as many connections
> as you could reasonably throw at it, and remain at between 25%-30% CPU
> utilization, with most of that time spent in I/O wait. The problem with
> this is that the actual connection latency is higher than without hardware,
> because of the time it takes to get data in and out of the accelerator. But
> by submitting several requests simultaneously, the hardware can be taken
> good advantage of.
>
> The hardware itself can support asynchronous submits via an interface
> that's roughly analogous to select(), but OpenSSL currently doesn't support
> a return for functions that's analogous to EAGAIN. That's something that
> would need to be added. Even though I didn't think it was so terrible when
> I first did it, I think the idea of forcing people to use threads to obtain
> decent performance is unacceptable.
>
> I've been looking over the codebase to see how this would need to be
> accomplished. It's definitely not a trivial task, but a manageable one.
>
Quite a few applications already follow the "one SSL connection per
thread or process" model.
If I understand your suggestion you are saying that a more effective
model would be to handle multiple SSL connections per thread or even
just have a single thread with a select() loop.
My initial thought is that this could be tricky to handle. Would you be
able to simultaneously select() on a group of fds and the hardware for
example?
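
To make the question concrete, here is a minimal sketch of what such a
combined wait could look like. It rests on a big assumption that nothing
above confirms: that the hardware driver exposes a file descriptor which
becomes readable when a job completes. The function name wait_for_events()
and the bitmask convention are made up for illustration, and in a test the
"hardware" descriptor can simply be the read end of a pipe().

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

#define EV_NET 1 /* the network descriptor is readable */
#define EV_HW  2 /* the (assumed) hardware completion descriptor is readable */

/*
 * Wait up to timeout_ms for either descriptor to become readable.
 * Returns a bitmask of EV_NET / EV_HW, 0 on timeout, or -1 on error.
 * Assumption: hw_fd is a descriptor the accelerator driver signals on
 * completion; a real device may offer nothing of the kind.
 */
int wait_for_events(int net_fd, int hw_fd, int timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int maxfd = (net_fd > hw_fd) ? net_fd : hw_fd;
    int ready;
    int mask = 0;

    FD_ZERO(&readable);
    FD_SET(net_fd, &readable);
    FD_SET(hw_fd, &readable);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* One select() call covers both the socket and the hardware. */
    ready = select(maxfd + 1, &readable, NULL, NULL, &tv);
    if (ready < 0)
        return -1;

    if (FD_ISSET(net_fd, &readable))
        mask |= EV_NET;
    if (FD_ISSET(hw_fd, &readable))
        mask |= EV_HW;
    return mask;
}
```

If the hardware only offers its own select()-like call and no descriptor,
the loop would have to poll one side with a short timeout, which is where
this gets tricky.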
> (3) Doesn't understand the notion of a key that it cannot see
>
> Hardware key management is (arguably) much more insecure if the keys are
> taken down out of the "secure" hardware for operations. Ideally, OpenSSL
> would perform the encrypt/decrypt operations, and be totally oblivious to
> whether *it* actually did them or not. If that statement confuses anyone,
> just think of the callback interfaces. The library has no way of knowing
> what the callback functions are actually doing; it just expects that when
> it makes the call, it gets back what it was expecting. I am thinking of
> trying something similar in terms of how keys are dealt with.
>
Well it does have a notion of hardware keys. Again this is only handled
with RSA. RSA keys can have separate methods. A typical smart card
application might load the public key components into an RSA structure
and redirect rsa_mod_exp to the card while keeping the rest in software.
This needed a few kludges before OpenSSL 0.9.4 but it is a bit cleaner
now.
What it doesn't have a notion of is how to "load" the keys. This needs
some kind of key database API. If this is handled properly then an
application should be able to "load" a key from, say PKCS#11, a PKCS#12
file, the current stuff like PEM files or anything else and not notice
any difference except (possibly) it can't access the private key
components.
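
As a rough illustration of that last point, the sketch below loads the same
toy key from two sources. All names (TOY_KEY, toy_load_software_key,
toy_load_token_key, toy_private_op) are hypothetical and no such key database
API exists in the library as described here. The "token" is simulated by a
file-scope variable standing in for storage the application cannot read, and
the key is the usual textbook toy RSA key (n = 3233, e = 17, d = 2753).

```c
#include <stddef.h>

typedef struct toy_key TOY_KEY;

/* Hypothetical key handle: NOT an OpenSSL structure. */
struct toy_key {
    const char *source;   /* where the key was loaded from */
    unsigned long n;      /* public modulus */
    unsigned long e;      /* public exponent */
    int private_visible;  /* nonzero only if d below is populated */
    unsigned long d;      /* private exponent, 0 when hidden */
    unsigned long (*private_op)(const TOY_KEY *key, unsigned long in);
};

/* Toy modular exponentiation; fine for the tiny values used here. */
static unsigned long toy_mod_exp(unsigned long b, unsigned long x,
                                 unsigned long m)
{
    unsigned long r = 1 % m;

    b %= m;
    while (x > 0) {
        if (x & 1)
            r = (r * b) % m;
        b = (b * b) % m;
        x >>= 1;
    }
    return r;
}

/* Software path: the private exponent lives in the key structure. */
static unsigned long sw_private_op(const TOY_KEY *key, unsigned long in)
{
    return toy_mod_exp(in, key->d, key->n);
}

/* Simulated token: the exponent stays "inside the device". */
static const unsigned long token_secret_d = 2753;

static unsigned long token_private_op(const TOY_KEY *key, unsigned long in)
{
    return toy_mod_exp(in, token_secret_d, key->n);
}

/* Loader standing in for a PEM or PKCS#12 file source. */
void toy_load_software_key(TOY_KEY *key)
{
    key->source = "file (simulated)";
    key->n = 3233;
    key->e = 17;
    key->private_visible = 1;
    key->d = 2753;
    key->private_op = sw_private_op;
}

/* Loader standing in for a PKCS#11 token source. */
void toy_load_token_key(TOY_KEY *key)
{
    key->source = "token (simulated)";
    key->n = 3233;
    key->e = 17;
    key->private_visible = 0; /* the application cannot see d */
    key->d = 0;
    key->private_op = token_private_op;
}

/* Application-facing call: identical regardless of where the key lives. */
unsigned long toy_private_op(const TOY_KEY *key, unsigned long in)
{
    return key->private_op(key, in);
}
```

The application calls toy_private_op() the same way in both cases; the only
difference it can observe is that private_visible is zero for the token key.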
Steve.
--
Dr Stephen N. Henson. http://www.drh-consultancy.demon.co.uk/
Personal Email: [EMAIL PROTECTED]
Senior crypto engineer, Celo Communications: http://www.celocom.com/
Core developer of the OpenSSL project: http://www.openssl.org/
Business Email: [EMAIL PROTECTED] PGP key: via homepage.
______________________________________________________________________
OpenSSL Project http://www.openssl.org
Development Mailing List [EMAIL PROTECTED]
Automated List Manager [EMAIL PROTECTED]