For authenticated encryption speed on a typical general-purpose processor
(such as an Atom), I would suggest AES-128 in GCM (Galois/Counter Mode).
This does one 10-round AES block encryption per 16 bytes of data, plus one
extra block per message, and the authentication (GHASH) is part of the mode,
so no separate hashing algorithm is needed.
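As a rough sketch (not production code), AES-128-GCM through the OpenSSL
EVP interface looks something like this, assuming a build with EVP GCM
support; the function name, the 12-byte IV length and the minimal error
handling are my own choices:

    #include <openssl/evp.h>

    /* Encrypt one message with AES-128-GCM.  Returns ciphertext length
     * or -1 on error.  key is 16 bytes, iv is 12 bytes (a per-message
     * counter works well here), tag receives the 16-byte auth tag. */
    int gcm_encrypt(const unsigned char *key, const unsigned char *iv,
                    const unsigned char *pt, int pt_len,
                    unsigned char *ct, unsigned char *tag)
    {
        EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
        int len, ct_len = -1;

        if (ctx == NULL)
            return -1;
        if (EVP_EncryptInit_ex(ctx, EVP_aes_128_gcm(), NULL, NULL, NULL) == 1 &&
            EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_SET_IVLEN, 12, NULL) == 1 &&
            EVP_EncryptInit_ex(ctx, NULL, NULL, key, iv) == 1 &&
            EVP_EncryptUpdate(ctx, ct, &len, pt, pt_len) == 1) {
            ct_len = len;
            if (EVP_EncryptFinal_ex(ctx, ct + len, &len) == 1 &&
                EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_GCM_GET_TAG, 16, tag) == 1)
                ct_len += len;
            else
                ct_len = -1;
        }
        EVP_CIPHER_CTX_free(ctx);
        return ct_len;
    }

The receiving side does the same with EVP_DecryptUpdate/EVP_DecryptFinal_ex,
setting the expected tag with EVP_CTRL_GCM_SET_TAG before the final call.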
I don't know whether that mode is available in TLS, or whether you will
have to run it independently of TLS. Running outside the TLS context also
lets you manage key selection independently of socket-level connection
setup. If you maintain a key and a message-number counter for each
direction with each peer (64 bytes per peer including the IP address, so
320,000 bytes for 5000 peers), you can even avoid keeping 5000 connections
open, which uses far more RAM just for the TCP state in the kernel.
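For concreteness, the per-peer table could look something like this; the
field layout is just my assumption, sized to land on 64 bytes per peer:

    #include <stdint.h>

    /* Per-peer state: one key and one message counter per direction.
     * sizeof(struct peer_state) == 64, so 5000 peers ~ 320,000 bytes. */
    struct peer_state {
        uint32_t ip;            /* peer IPv4 address */
        uint16_t port;          /* peer port */
        uint16_t flags;         /* e.g. "rekey needed" */
        uint8_t  send_key[16];  /* AES-128 key for traffic we send */
        uint8_t  recv_key[16];  /* AES-128 key for traffic we receive */
        uint64_t send_counter;  /* next message number we send; feeds the GCM IV */
        uint64_t recv_counter;  /* highest message number accepted from peer */
        uint8_t  reserved[8];   /* rounds the record up to 64 bytes */
    };

    static struct peer_state peers[5000];  /* ~320 kB, no kernel TCP state */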
Keys can be exchanged over occasional TLS connections, with both server and
client certificates checked against your own private CA only (so a
Comodo-style incident will not affect it).
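For those key-exchange connections, the OpenSSL side of "trust only our
private CA and require a client certificate" is roughly the following
sketch; the file names and the server-side method are placeholders:

    #include <openssl/ssl.h>

    /* Call SSL_library_init() once at startup before using this. */
    SSL_CTX *make_keyexchange_ctx(void)
    {
        SSL_CTX *ctx = SSL_CTX_new(SSLv23_server_method());
        if (ctx == NULL)
            return NULL;

        /* Trust only our own CA, not the system bundle. */
        if (SSL_CTX_load_verify_locations(ctx, "my-ca.pem", NULL) != 1 ||
            SSL_CTX_use_certificate_file(ctx, "host-cert.pem", SSL_FILETYPE_PEM) != 1 ||
            SSL_CTX_use_PrivateKey_file(ctx, "host-key.pem", SSL_FILETYPE_PEM) != 1) {
            SSL_CTX_free(ctx);
            return NULL;
        }

        /* Refuse peers that do not present a certificate signed by that CA. */
        SSL_CTX_set_verify(ctx,
                           SSL_VERIFY_PEER | SSL_VERIFY_FAIL_IF_NO_PEER_CERT,
                           NULL);
        return ctx;
    }

The client side is the same apart from the method; SSL_VERIFY_FAIL_IF_NO_PEER_CERT
is only meaningful on the server.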
On 14-11-2011 02:16, Curt Sampson wrote:
I'm dealing with some of the exact same issues; I am trying to satisfy
a requirement that a rather lightweight host (an embedded system with
an Atom processor) handle conversations with about 5,000 other hosts
with strict latency requirements. I'm currently attempting this as
peer-to-peer TLS communication, but I'm not sure if it's actually going
to meet the latency requirements.
Basically, you're going to have to keep all of your TLS connections up
and running at all times; in my quick-and-dirty benchmarks connection
setup uses about a thousand times more CPU than sending a single message
on an existing connection.
If you really, really need to keep the number of sockets down, you can look
at using DTLS/UDP instead of TLS/TCP and hacking OpenSSL to let you use
an unconnected socket for multiple DTLS connections (standard OpenSSL
will not do this). I had a reference somewhere to a page discussing
various ways of doing this, but I seem to have lost it, so you'll have
to dig around and see if you can find it. (Please post it here if you do.)
But if you're only dealing with something on the order of 10,000 other
hosts, Linux at least will handle that just fine once you bump up the
per-process socket limits. (Do make sure you're using epoll or whatever;
libevent can be helpful here.)
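For reference, a bare-bones epoll loop for that many sockets looks roughly
like this (no TLS and no error handling, just the accept-and-read skeleton):

    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Accept new peers on listen_fd and drain whatever they send.
     * Real handling (TLS records, framing) goes where the read() is. */
    void event_loop(int listen_fd)
    {
        struct epoll_event ev, events[64];
        char buf[4096];
        int epfd = epoll_create1(0);

        ev.events = EPOLLIN;
        ev.data.fd = listen_fd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

        for (;;) {
            int n = epoll_wait(epfd, events, 64, -1);
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == listen_fd) {
                    int peer = accept(listen_fd, NULL, NULL);
                    ev.events = EPOLLIN;
                    ev.data.fd = peer;
                    epoll_ctl(epfd, EPOLL_CTL_ADD, peer, &ev);
                } else if (read(fd, buf, sizeof(buf)) <= 0) {
                    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                    close(fd);
                }
            }
        }
    }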
In my situation, unfortunately, even using RC4-SHA (which is only about
10% slower than NULL-MD5) may be too slow to handle the traffic volume
on my system, so we may just be dropping authentication of our data
entirely. I'd be interested in ideas about how to avoid doing this.
cjs