Hi all,

today is a great day (or rather night, considering the time I'm posting)!
After several months of effort by the Exceliance team, we managed to rework all the buffer and connection layers in order to get SSL working on both sides of HAProxy.

The code is still in preview. We can't break it anymore, but considering that we fixed some bugs today, I'm sure that some still remain in the 100+ patches and 16000 lines of patches this work required (not counting the many that were abandoned or re-merged multiple times).

The code is still going to change because we're getting closer to something which will allow outgoing connections to be reused, resulting in keep-alive on both sides. But not yet, be patient.

What's done right now?

1) connections

   Connections are independent entities which can be instantiated without allocating a full session and its buffers. Connections are responsible for handshakes and control, and pass data to buffers. Connection-level TCP-request rules, the PROXY protocol and SSL handshakes are processed at the connection level.

2) buffers

   Buffers have been split in three: the channel (the tube where the data flows), the buffer (where data is temporarily stored for analysis or forwarding) and optionally the pipe (stored in kernel area, for forwarding only). New buffers only handle data, without consideration for what it's used for. Health checks are currently being migrated to use this with connections.

3) data I/O

   Data I/O is now performed between a connection and a buffer. We have two data-layer operations now: raw and ssl. It is very easy to add new ones now, and we're even wondering whether it would make sense to write one dedicated to yassl in native mode (without the openssl API).

4) socket I/O

   At the moment we only support normal sockets, but the design considered "remote sockets" so that we could off-load heavy processing to external processes (eg: HTTP on one process, SSL on two others). Remote sockets have not been started yet but surely will be. SHMs have also been considered to emulate sockets.
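To give an idea of what the data-layer split in item 3 means, here is a minimal sketch. It is written in Python for brevity and is not the actual C code; every name in it (RawLayer, ToyCipherLayer, Connection, roundtrip) is made up for the example, and the "cipher" is a plain XOR standing in for SSL, not real cryptography. The only point is that a connection moves data to and from a buffer through whichever data layer it was given, so adding a layer does not touch the connection itself.

```python
# Illustrative only: all names are hypothetical, not HAProxy's C structures.
from dataclasses import dataclass, field


class RawLayer:
    """Data layer that moves bytes unchanged between socket and buffer."""

    def rcv(self, wire: bytes) -> bytes:
        return wire

    def snd(self, data: bytes) -> bytes:
        return data


class ToyCipherLayer:
    """Stand-in for an SSL layer: XOR with a fixed byte, NOT real crypto."""

    def __init__(self, key: int = 0x5A):
        self.key = key

    def rcv(self, wire: bytes) -> bytes:
        return bytes(b ^ self.key for b in wire)

    def snd(self, data: bytes) -> bytes:
        return bytes(b ^ self.key for b in data)


@dataclass
class Connection:
    """Owns a data layer and a simulated socket; knows nothing of sessions."""

    layer: object
    wire: bytearray = field(default_factory=bytearray)  # simulated socket

    def send_from(self, buf: bytearray) -> int:
        """Drain the buffer through the data layer onto the socket."""
        sent = len(buf)
        self.wire += self.layer.snd(bytes(buf))
        buf.clear()
        return sent

    def recv_into(self, buf: bytearray) -> int:
        """Read everything pending on the socket into the buffer."""
        data = self.layer.rcv(bytes(self.wire))
        self.wire.clear()
        buf += data
        return len(data)


def roundtrip(layer, payload: bytes):
    """Send payload through a connection, then read it back.

    Returns (bytes seen on the simulated socket, bytes read back)."""
    conn = Connection(layer)
    conn.send_from(bytearray(payload))
    on_wire = bytes(conn.wire)
    back = bytearray()
    conn.recv_into(back)
    return on_wire, bytes(back)
```

Calling roundtrip(RawLayer(), b"GET /") and roundtrip(ToyCipherLayer(), b"GET /") exercises the same Connection code with two different layers; only what appears on the simulated socket differs.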
5) configuration

   Configuration has been extended to support the "ssl" keyword on "bind" lines and on "server" lines. For both, the syntax is:

       ... ssl <cert.pem> [ciphers <suite>] [nosslv3] [notlsv1]

   <cert.pem> is a PEM file made by concatenating the .crt and the .key of a certificate. eg:

       bind :443 ssl /etc/haproxy/pub.pem
       server local 192.168.0.1:443 ssl ciphers EXPORT40 notlsv1

6) session management

   SSL sessions are stored in a shared memory cache, allowing haproxy to run with nbproc > 1 and still work correctly. This is the session cache we developed for stunnel then stud; it was time to adopt it in haproxy. It's so fast that we don't use openssl's cache at all, since even with a single process it's at least as fast.

7) other

   A lot remains to be done. Mainly, some of the aforementioned structures are still included in other ones, which simplified the split. Once all the work is over, we should end up with less memory used per connection. This is important to better handle DDoS.

At the moment, everything we could try seems to work fine. SSL stacks well on top of the PROXY protocol, which is very important to build SSL offload farms (I'm sure Baptiste will want to write a blog article on the subject of using sub-$1000 machines to build large 100k+tps farms). Stats work over https too.

Right now we're missing ACLs to match whether the traffic was SSL or clear, as well as logs. Both can be worked around by using distinct "bind" lines or even frontends.

The doc is still clearly lacking, but we think that the config will change a little bit.

Only the GNU makefile was updated; neither the BSD nor the OSX one was, as they're a little trickier. If someone with one of these systems wants to update them, I'll happily accept the patches.

What else? Ah yes, 4k. You're there wondering about the results.
4000 SSL connections per second and 300 Mbps is what we got out of a dual-core Atom D510 at 1.66 GHz, in SSLv3 running over 4 processes (hyperthreading was enabled) :-)

This is a bit more than stud and obviously much better than stunnel (which doesn't scale to more than a few hundred connections before the performance quickly drops). And older tests seem to indicate that with YaSSL we can get 30-40% more, maybe even more. We need to work with the YaSSL guys to slightly improve their cache management before this can become a default build option.

Enough talking. For those who want to test or even have the hardware to run more interesting benchmarks, the code was merged into the master branch and is in today's snapshot (20120904) here:

    http://haproxy.1wt.eu/download/1.5/src/snapshot/

Build it by passing "USE_OPENSSL=1" on the make command line. You should also include support for linux-2.6 options for better results:

    make TARGET=linux2628 USE_OPENSSL=1

If all goes well by the end of the week, I'll issue -dev12, but I expect that we'll have some bugs to fix till then.

BTW, be very careful, openssl is a memory monster. We counted about 80kB per connection for haproxy+ssl, which is 800 MB for only 10k connections!

And remember, this is still beta-quality code. Don't blindly put this in production (even though I did it on 1wt.eu: https://demo.1wt.eu/). You have been warned!

Please use the links below:

    site index : http://haproxy.1wt.eu/
    sources    : http://haproxy.1wt.eu/download/1.5/src/snapshot/
    changelog  : http://haproxy.1wt.eu/download/1.5/src/snapshot/CHANGELOG
    Exceliance : http://www.exceliance.fr/en/

Have a lot of fun and please report your successes/failures,
Willy