Hi all,
today is a great day (could say night considering the time I'm
posting) !
After several months of efforts by the Exceliance team, we managed to
rework all the buffer and connection layers in order to get SSL
working on both sides of HAProxy.
The code is still in preview. We can't manage to break it anymore, but
considering that we've fixed some bugs today, I'm sure that some
still remain in the 100+ patches and 16000 lines this work required
(not counting the many patches that were abandoned or re-merged
multiple times).
The code is still going to change because we're getting closer to
something which will allow outgoing connections to be reused,
resulting in keep-alive on both sides. But not yet, be patient.
What's done right now ?
1) connections
Connections are independent entities which can be instantiated without
allocating a full session and its buffers. Connections are
responsible for handshakes and control, and pass data to buffers.
Connection-level TCP-request rules, the PROXY protocol and SSL
handshakes are processed at the connection level.
2) buffers
buffers have been split in three: channel (the tube where the data
flows), buffer (where data is temporarily stored for analysis or
forwarding) and optionally the pipe (stored in kernel area for
forwarding only). New buffers only handle data without consideration
for what it's used for. Health checks are currently being migrated to
use this with connections.
3) data I/O
data I/O is now performed between a connection and a buffer. We have
two data-layer operations : raw and ssl. It is very easy to add
new ones now, we're even wondering whether it would make sense to
write one dedicated to yassl in native mode (without the openssl
API).
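To make the "data layer" idea more concrete, here is a minimal Python sketch of pluggable operations sitting between a connection and a buffer. It is only an illustration and not HAProxy's code: the names RawLayer, XorLayer, buffer_to_conn and conn_to_buffer are hypothetical, and the XOR layer is a toy stand-in that only shows where a second layer such as ssl would plug in.

```python
import socket


class RawLayer:
    """'raw' data layer: bytes cross the connection unchanged."""

    def encode(self, data: bytes) -> bytes:
        return data

    def decode(self, data: bytes) -> bytes:
        return data


class XorLayer(RawLayer):
    """Toy stand-in for another layer. This is NOT SSL, it only shows
    that a new layer needs nothing more than its own encode/decode."""

    def __init__(self, key: int):
        self.key = key

    def encode(self, data: bytes) -> bytes:
        return bytes(b ^ self.key for b in data)

    decode = encode  # XOR with the same key is its own inverse


def buffer_to_conn(layer, buf: bytearray, conn: socket.socket) -> int:
    """Send the buffer's content through the layer, drop what was sent."""
    sent = conn.send(layer.encode(bytes(buf)))
    del buf[:sent]
    return sent


def conn_to_buffer(layer, conn: socket.socket, buf: bytearray,
                   count: int = 4096) -> int:
    """Read from the connection through the layer into the buffer."""
    data = layer.decode(conn.recv(count))
    buf += data
    return len(data)


def demo(layer) -> bytes:
    """Push one request through a local socket pair using the given layer."""
    left, right = socket.socketpair()
    try:
        out_buf = bytearray(b"GET / HTTP/1.0\r\n\r\n")
        in_buf = bytearray()
        buffer_to_conn(layer, out_buf, left)
        conn_to_buffer(layer, right, in_buf)
        return bytes(in_buf)
    finally:
        left.close()
        right.close()
```

The buffer helpers never look at which layer they were given, which is the property that makes adding a new layer cheap.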
4) socket I/O
at the moment we only support normal sockets, but the design
considered "remote sockets" so that we could off-load heavy
processing to external processes (eg: HTTP on one process, SSL on two
others). Remote sockets have not been started yet but surely will be.
SHMs have also been considered to emulate sockets.
5) configuration
Configuration has been extended to support the "ssl" keyword on "bind"
lines and on "server" lines. For both, the syntax is :
... ssl <cert.pem> [ciphers <suite>] [nosslv3] [notlsv1]
<cert.pem> is a PEM file made by concatenating the .crt and the
.key of a certificate.
eg: bind :443 ssl /etc/haproxy/pub.pem
server local 192.168.0.1:443 ssl ciphers EXPORT40 notlsv1
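For those who prefer to script the PEM creation, here is a minimal Python sketch of the concatenation described above. The function name build_pem and the file names in the usage line are hypothetical, pick whatever matches your setup.

```python
from pathlib import Path


def build_pem(crt_path: str, key_path: str, out_path: str) -> None:
    """Concatenate a certificate and its private key into one PEM file."""
    parts = []
    for path in (crt_path, key_path):
        text = Path(path).read_text()
        if not text.endswith("\n"):
            text += "\n"  # keep the PEM blocks on separate lines
        parts.append(text)
    out = Path(out_path)
    out.write_text("".join(parts))
    out.chmod(0o600)  # the result contains the private key
```

Usage would look like build_pem("site.crt", "site.key", "/etc/haproxy/pub.pem"), after which the resulting file can be passed to the "ssl" keyword.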
6) session management
SSL sessions are stored in a shared memory cache, allowing haproxy to
run with nbproc > 1 and still work correctly. This is the session
cache we developed for stunnel then stud, it was time to adopt it in
haproxy. It's so fast that we don't use openssl's cache at all, since
even at one single process, it's at least as fast.
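To show why a shared memory cache matters with nbproc > 1, here is a minimal Python sketch (Unix only) in which one process stores a session and another one finds it. It is an assumption-laden toy and not the actual cache: the slot layout, the names store/load/slot_of and the sizes are hypothetical, and there is no locking, no expiration and no collision handling.

```python
import mmap
import os

SLOT_SIZE = 64  # hypothetical slot: 1 length byte + up to 63 payload bytes
SLOTS = 16      # hypothetical table size


def slot_of(session_id: bytes) -> int:
    """Map a session ID to a slot index (toy hash, collisions overwrite)."""
    return sum(session_id) % SLOTS


def store(mem, session_id: bytes, payload: bytes) -> None:
    """Write a session payload into its slot."""
    if len(payload) >= SLOT_SIZE:
        raise ValueError("payload too large for one slot")
    off = slot_of(session_id) * SLOT_SIZE
    mem[off] = len(payload)
    mem[off + 1:off + 1 + len(payload)] = payload


def load(mem, session_id: bytes) -> bytes:
    """Read back whatever payload sits in the session's slot."""
    off = slot_of(session_id) * SLOT_SIZE
    size = mem[off]
    return bytes(mem[off + 1:off + 1 + size])


def demo() -> bytes:
    """A child process stores a session, the parent then reads it."""
    # Anonymous mapping, shared between parent and child after fork().
    mem = mmap.mmap(-1, SLOT_SIZE * SLOTS)
    pid = os.fork()
    if pid == 0:
        store(mem, b"sess-1", b"negotiated-state")
        os._exit(0)
    os.waitpid(pid, 0)
    return load(mem, b"sess-1")
```

With a per-process cache, the parent would find nothing here and the client would have to go through a full handshake again.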
7) other
A lot remains to be done, mainly some of the aforementioned structures
are still included in other ones, which simplified the split. Once all
the work is over, we should end up with less memory used per
connection. This is important to better handle DDoS.
At the moment, everything we could try seems to work fine. The SSL
stacks well on top of the PROXY protocol, which is very important to
build SSL offload farms (I'm sure Baptiste will want to write a blog
article on the subject of using sub-$1000 machines to build large
100k+tps farms). Stats work over https too. Right now we're missing
ACLs to match whether the traffic was SSL or clear, as well as logs.
Both can be worked around by using distinct "bind" lines or even
frontends. The doc is still clearly lacking, but we think that the
config will change a little bit anyway.
Only the GNU makefile was updated, neither the BSD nor the OSX ones
were, they're a little trickier. If someone with one of these systems wants
to update them, I'll happily accept the patches.
What else ? Ah yes, 4k. You're there wondering about the results. 4000
SSL connections per second and 300 Mbps is what we got out of a
dual-core Atom D510 at 1.66 GHz, in SSLv3 running over 4 processes
(hyperthreading was enabled) :-) This is a bit more than stud and
obviously much better than stunnel (which doesn't scale to more than
a few hundred connections before the performance quickly drops).
And older tests seem to indicate that with YaSSL we can get 30-40%
more, maybe even more. We need to work with the YaSSL guys to
slightly improve their cache management before this can become a
default build option.
Enough speaking, for those who want to test or even have the hardware
to run more interesting benchmarks, the code was merged into the
master branch and is in today's snapshot (20120904) here :
http://haproxy.1wt.eu/download/1.5/src/snapshot/
Build it by passing "USE_OPENSSL=1" on the make command line. You
should also include support for linux-2.6 options for better results :
make TARGET=linux2628 USE_OPENSSL=1
If all goes well by the end of the week, I'll issue -dev12, but I
expect that we'll have some bugs to fix till then.
BTW, be very careful, openssl is a memory monster. We counted about
80kB per connection for haproxy+ssl, this is 800 MB for only 10k
connections! And remember, this is still beta-quality code. Don't
blindly put this in production (even though I did it on 1wt.eu :
https://demo.1wt.eu/). You have been warned!
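The memory figure is simple arithmetic, and a small Python sketch helps when sizing a machine. The 80 kB default is the number measured above, the function name ssl_memory_mb is hypothetical, and decimal units (1 MB = 1000 kB) are assumed.

```python
KB_PER_CONN = 80  # measured above for haproxy+ssl


def ssl_memory_mb(connections: int, kb_per_conn: int = KB_PER_CONN) -> float:
    """Estimate memory in MB for a number of concurrent SSL connections."""
    return connections * kb_per_conn / 1000


print(ssl_memory_mb(10_000))  # 10k connections -> 800.0 MB
```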
Please use the links below :
site index : http://haproxy.1wt.eu/
sources : http://haproxy.1wt.eu/download/1.5/src/snapshot/
changelog : http://haproxy.1wt.eu/download/1.5/src/snapshot/CHANGELOG
Exceliance : http://www.exceliance.fr/en/
Have a lot of fun and please report your success/failures,
Willy