Re: General SSL vs. non-SSL Performance

2016-03-20 Thread Aleksandar Lazic

Hi.

On 17-03-2016 16:55, Pavlos Parissis wrote:

On 17/03/2016 04:49 μμ, Nenad Merdanovic wrote:

Hello Pavlos,

On 3/17/2016 4:45 PM, Pavlos Parissis wrote:

I am working (not very actively) on a solution which utilizes this.
It will use www.vaultproject.io as a central store, a generating engine
and a pull/push mechanism in place.

But, the current version of HAProxy doesn't support different TLS
tickets per frontend, which I would like to use.


What do you mean? You can specify tls-ticket-keys per bind line.




I *was* wrong, as I had completely forgotten that, and also that the socket
command accepts IDs:

set ssl tls-key <id> <tlskey>

I am sorry for spreading wrong information.


Okay, I'm now lost 8-O

Please, can anyone help me understand how the flow works.

1st Request
client -> ssl handshake -> haproxy server 1 (tls ticket?!)

2nd Request
Same client -> ssl handshake -> haproxy server 2 (tls ticket?!)

How does server 2 receive server 1's TLS ticket?

Thanks for the help.

BR Aleks






RE: General SSL vs. non-SSL Performance

2016-03-19 Thread Lukas Tribus
>> Hm, I haven't tried Apache yet but would that be a huge benefit compared
>> to a setup using nbproc > 1?
>
> I haven't tried it either, but yes, I would assume so.

To be more specific: the number of TLS handshakes would probably be
similar, especially in an nbproc > 1 configuration, but when you hit that
limit, haproxy or nginx would likely struggle with the load because of
all the blocking.

So while your handshakes per second number may be similar, you would
probably have a performance drop for the actual traffic, I suspect.



Lukas 


Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Dennis Jacobfeuerborn
On 18.03.2016 11:46, Willy Tarreau wrote:
> Hi Christian,
> 
> On Fri, Mar 18, 2016 at 11:31:57AM +0100, Christian Ruppert wrote:
>> I also just stumbled over this:
>> https://software.intel.com/en-us/articles/accelerating-ssl-load-balancers-with-intel-xeon-v3-processors
>> Might be interesting for others as well. So ECC and multi-threaded/process
>> is the way to go it seems.
> 
> Thanks for this one, I had seen it once and lost it! I'll add it to
> the list of articles on the haproxy page not to lose it anymore!
> 
>>> But note that people who have to deal with heavy SSL traffic actually
>>> deal with this in haproxy by using two levels of processing, one for
>>> HTTP and one for TLS. It means that only TLS traffic can be hurt by
>>> handshakes:
>>>
>>>   listen secure
>>>       bind :443 ssl crt foo.pem process 2-32
>>>       mode tcp
>>>       server clear 127.0.0.1:80
>>>
>>>   frontend clear
>>>       bind :80 process 1
>>>       mode http
>>>       use_backend my_wonderful_server_farm
>>>
>>>   ...
>>>
>>
>> Your example would be better and easier, but we need the client IP for ACLs
>> and so forth, which wouldn't work in TCP mode, and there would be no XFF
>> header. So we're duplicating stuff in the frontends but using one backend.
> 
> You don't need to, just use the proxy protocol:
> 
>    listen secure
>        bind :443 ssl crt foo.pem process 2-32
>        mode tcp
>        server clear 127.0.0.1:81 send-proxy-v2
> 
>    frontend clear
>        bind 127.0.0.1:81 accept-proxy process 1
>        bind :80 process 1
>        mode http
>        use_backend my_wonderful_server_farm
> 
> Also, if you have one backend with all frontends bound to many processes,
> then all your backends run on these processes, which makes it harder to
> enforce maxconn limitations or to share stick-tables. That's why it's
> much better to only move SSL out of the regular path. Of course if you
> need to pass extra info, you'll have to enable HTTP in the frontend.

Doesn't this setup use a lot of additional sockets for the
send/accept-proxy communication?
I'm using abstract namespace sockets to handle this kind of forwarding:

listen secure
 bind :443 ssl crt foo.pem process 2-32
 mode tcp
 server clear abns@fclear send-proxy-v2

frontend clear
 bind abns@fclear accept-proxy process 1
 bind :80 process 1
 mode http
 use_backend my_wonderful_server_farm

Is there any downside to using abstract namespace sockets like this?

Regards,
  Dennis




RE: General SSL vs. non-SSL Performance

2016-03-19 Thread Lukas Tribus
> The "option httpclose" was on purpose. Also the client could (during a
> attack) simply do the same and achieve the same result. I don't think
> that will help in such cases.

So what you are actually and purposely benchmarking are SSL/TLS
handshakes, because that's the bottleneck you are trying to improve.

First of all, the selected cipher is very important, as are the certificate
and the RSA key size.

For optimal performance, you would drop your RSA certificate
and get an ECC cert. If that's not a possibility, then use 2048-bit
RSA certificates.


Your ab output suggests that the negotiated cipher is
ECDHE-RSA-AES128-GCM-SHA256 - which is fine for RSA certificates,
but your RSA certificate is 4096 bits long, which is where the performance
penalty comes from - use 2048-bit certificates or, better yet, ECC
certificates.

read: DO NOT USE RSA certificates longer than 2048 bits.
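As a minimal sketch (file names and the subject are illustrative; a production
certificate would of course come from your CA), an ECC key and a self-signed
test certificate can be generated with OpenSSL like this:

   openssl ecparam -genkey -name prime256v1 -noout -out site.key
   openssl req -new -x509 -key site.key -out site.crt -days 365 -subj "/CN=www.example.com"
   # haproxy expects the certificate and key concatenated into one PEM file
   cat site.crt site.key > site.pem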


Both nginx [1] and haproxy currently do not support offloading TLS
handshakes to another thread or dedicating a thread to a TLS session.

That's why Apache will currently scale better, because of its threading.



Hope this helps,

Lukas



[1] https://twitter.com/ngx_vbart/status/611956593324916736

  


Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Nenad Merdanovic
Hello Pavlos,

On 3/17/2016 4:45 PM, Pavlos Parissis wrote:
> I am working (not very actively) on a solution which utilizes this.
> It will use www.vaultproject.io as a central store, a generating engine
> and a pull/push mechanism in place.
> 
> But, the current version of HAProxy doesn't support different TLS
> tickets per frontend, which I would like to use.

What do you mean? You can specify tls-ticket-keys per bind line.
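For instance, a minimal sketch of that (the file paths are illustrative, not
from this thread); each key file holds its own set of base64-encoded ticket keys:

   frontend fe1
       bind :443 ssl crt /etc/haproxy/fe1.pem tls-ticket-keys /etc/haproxy/fe1.keys

   frontend fe2
       bind :8443 ssl crt /etc/haproxy/fe2.pem tls-ticket-keys /etc/haproxy/fe2.keys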
> 
> Cheers,
> Pavlos
> 

Regards,
Nenad



RE: General SSL vs. non-SSL Performance

2016-03-19 Thread Lukas Tribus
> Some customers may require 4096 bit keys as it seems to be much more
> decent than 2048 nowadays.

I've not come across any recommendations pointing in that direction; in
fact, 2048-bit RSA keys are supposed to be safe for commercial use until 2030.

I don't think this is a real requirement from knowledgeable people, to
be frank.

In any case it doesn't make much sense, because if your customer really has
such strict requirements you may as well switch to ECC, since you won't
be able to support old clients anyway.



> That's still more than 96% difference compared to non-SSL

Well, you are basically benchmarking your stack with a TLS-specific
denial-of-service attack. Of course the same attack without TLS won't
have a noticeable effect on the stack. So that number is quite obviously
high.



>> That's why Apache will currently scale better, because of its threading.
>
> Hm, I haven't tried Apache yet but would that be a huge benefit compared
> to a setup using nbproc> 1?

I haven't tried it either, but yes, I would assume so. It also doesn't block
other connections while handshaking new ones.




Regards,

Lukas 


Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Willy Tarreau
Hi Christian,

On Wed, Mar 16, 2016 at 05:25:53PM +0100, Christian Ruppert wrote:
> Hi Lukas,
> 
> On 2016-03-16 16:53, Lukas Tribus wrote:
> >>The "option httpclose" was on purpose. Also the client could (during a
> >>attack) simply do the same and achieve the same result. I don't think
> >>that will help in such cases.
> >
> >So what you are actually and purposely benchmarking are SSL/TLS
> >handshakes, because thats the bottleneck you are trying to improve.
> 
> You're right, yes.

You also found the hard way why it's important to share TLS secrets
between multiple front nodes, or to properly distribute the load to
avoid handshakes as much as possible.

> >Both nginx [1] and haproxy currently do not support offloading TLS
> >handshakes to another thread or dedicating a thread to a TLS session.
> >
> >Thats why Apache will scale better currently, because its threading.
> 
> Hm, I haven't tried Apache yet but would that be a huge benefit compared to
> a setup using nbproc > 1?

Here I don't know. TLS handshakes are one large part of what made me think
that we must go multi-threaded instead of multi-process over the long term,
just because I want to be able to pin some tasks to some CPUs. I.e. when TLS
says "handshake needed", we want to be able to migrate the task to another
CPU to avoid the huge latency imposed on all other processing (e.g. 7ms in
your case).

But note that people who have to deal with heavy SSL traffic actually
deal with this in haproxy by using two levels of processing, one for
HTTP and one for TLS. It means that only TLS traffic can be hurt by
handshakes:

   listen secure
       bind :443 ssl crt foo.pem process 2-32
       mode tcp
       server clear 127.0.0.1:80

   frontend clear
       bind :80 process 1
       mode http
       use_backend my_wonderful_server_farm

   ...

And before the Linux kernel reintroduced support for SO_REUSEPORT (in
3.9), it was common to have a single process load-balance incoming
TCP connections to all the other TLS processes. It then makes it possible
to choose the LB algorithm you want, including a source hash so that the same
attacker can only affect one process, for example.
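As an illustration only (ports, process numbers and the backend name are
assumptions, not taken from this thread), that pre-3.9 dispatch pattern could
look roughly like this: process 1 accepts on :443 in TCP mode and balances by
source hash to SSL listeners pinned to the other processes:

   listen ssl-dispatch
       bind :443 process 1
       mode tcp
       balance source                 # the same attacker always hits one TLS process
       server tls1 127.0.0.1:8443
       server tls2 127.0.0.1:8444

   frontend tls1
       bind 127.0.0.1:8443 ssl crt foo.pem process 2
       mode http
       use_backend my_wonderful_server_farm

   frontend tls2
       bind 127.0.0.1:8444 ssl crt foo.pem process 3
       mode http
       use_backend my_wonderful_server_farm

   # send-proxy / accept-proxy could be added on the internal hop to preserve
   # the client IP, as discussed later in the thread.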

Willy




Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Christian Ruppert

Hi Cyril,

On 2016-03-16 16:14, Cyril Bonté wrote:

Hi all,

replying really quickly from a webmail, sorry for the lack of details


[...]
I also ran 2 parallel "ab" runs on two separate machines against a third
one. The requests per second were around ~70 r/s per host instead of ~140,
so I doubt it's an entropy problem.


The issue is in your haproxy configuration: you disabled HTTP
keep-alive by using "option httpclose", so you are benchmarking SSL
handshakes, and your values are not unusual in that case.
Please try with something else, like "option http-server-close".


The "option httpclose" was on purpose. Also the client could (during a 
attack) simply do the same and achieve the same result. I don't think 
that will help in such cases.


--
Regards,
Christian Ruppert



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Christian Ruppert

Hi Willy,

On 2016-03-17 06:05, Willy Tarreau wrote:

Hi Christian,

On Wed, Mar 16, 2016 at 05:25:53PM +0100, Christian Ruppert wrote:

Hi Lukas,

On 2016-03-16 16:53, Lukas Tribus wrote:
>>The "option httpclose" was on purpose. Also the client could (during a
>>attack) simply do the same and achieve the same result. I don't think
>>that will help in such cases.
>
>So what you are actually and purposely benchmarking are SSL/TLS
>handshakes, because thats the bottleneck you are trying to improve.

You're right, yes.


You also found the hard way why it's important to share TLS secrets
between multiple front nodes, or to properly distribute the load to
avoid handshakes as much as possible.


I also just stumbled over this:
https://software.intel.com/en-us/articles/accelerating-ssl-load-balancers-with-intel-xeon-v3-processors
Might be interesting for others as well. So ECC and
multi-threaded/process is the way to go, it seems.





>Both nginx [1] and haproxy currently do not support offloading TLS
>handshakes to another thread or dedicating a thread to a TLS session.
>
>Thats why Apache will scale better currently, because its threading.

Hm, I haven't tried Apache yet but would that be a huge benefit
compared to a setup using nbproc > 1?


Here I don't know. TLS handshakes are one large part of what made me think
that we must go multi-threaded instead of multi-process over the long term,
just because I want to be able to pin some tasks to some CPUs. I.e. when TLS
says "handshake needed", we want to be able to migrate the task to another
CPU to avoid the huge latency imposed on all other processing (e.g. 7ms in
your case).

But note that people who have to deal with heavy SSL traffic actually
deal with this in haproxy by using two levels of processing, one for
HTTP and one for TLS. It means that only TLS traffic can be hurt by
handshakes:

   listen secure
       bind :443 ssl crt foo.pem process 2-32
       mode tcp
       server clear 127.0.0.1:80

   frontend clear
       bind :80 process 1
       mode http
       use_backend my_wonderful_server_farm

   ...



Your example would be better and easier, but we need the client IP for
ACLs and so forth, which wouldn't work in TCP mode, and there would be no
XFF header. So we're duplicating stuff in the frontends but using one
backend.



And before the Linux kernel reintroduced support for SO_REUSEPORT (in
3.9), it was common to have a single process load-balance incoming
TCP connections to all the other TLS processes. It then makes it possible
to choose the LB algorithm you want, including a source hash so that the same
attacker can only affect one process, for example.

Willy


--
Regards,
Christian Ruppert



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Nenad Merdanovic
Hello,

On 3/16/2016 6:25 PM, Christian Ruppert wrote:
> 
> Some customers may require 4096-bit keys as they seem to be considered much
> more decent than 2048 nowadays. So you may be limited here. A test with a
> 2048-bit cert gives me around ~770 requests per second, a test with a
> 256-bit ECC cert around 1600 requests per second. That's still more than
> 96% difference compared to non-SSL, way better than the 4096-bit RSA one
> though. I also have to make sure that even some older clients can
> connect to the site, so I have to take a closer look at the ECC certs
> and ciphers then. ECC is definitely an enhancement, if there's no
> compatibility problem.

HAproxy can, in its latest versions, serve both ECC and RSA certificates
depending on client support. In a fairly large environment I have found
that about 85% of clients are ECC-capable. Also, look at configuring TLS
ticket keys and rotating them properly, as well as using keep-alive.
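As a sketch of such a dual-certificate setup (assuming a haproxy build with
multi-certificate bundle support and OpenSSL >= 1.0.2; file names are
illustrative), haproxy then picks the ECC or RSA chain based on what the
client advertises:

   # /etc/haproxy/certs/site.pem.rsa   - RSA certificate + key
   # /etc/haproxy/certs/site.pem.ecdsa - ECC certificate + key
   frontend https-in
       bind :443 ssl crt /etc/haproxy/certs/site.pem
       mode http
       default_backend my_wonderful_server_farm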

The difference in performance you are observing is fairly normal. You
can measure the raw SSL performance of your CPU using 'openssl speed' to see
how many operations/s you get without the HAproxy overhead; the numbers
should be very close.
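For example (a sketch; these are standard 'openssl speed' algorithm names),
to compare RSA and P-256 signing rates on the local CPU:

   openssl speed rsa2048 rsa4096 ecdsap256
   # use several cores to approximate an nbproc > 1 setup:
   openssl speed -multi 4 rsa2048 ecdsap256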

Another thing you might consider is switching to OpenSSL 1.0.2 because
you have a v3 Intel Xeon which has AVX2 instruction support and will
benefit from improvements done in 1.0.2.

In an SSL-heavy environment, we use servers with a lot of cores, albeit
slower per core, and with a good DDoS-protection ruleset we haven't
experienced any attacks that weren't effectively mitigated.

With a properly configured SSL stack in HAproxy (all of the things
mentioned above), the CPU usage difference is almost negligible. And to
be honest, there are not that many SSL-exhaustion attacks.

> 
>>
>>
>> Both nginx [1] and haproxy currently do not support offloading TLS
>> handshakes to another thread or dedicating a thread to a TLS session.
>>
>> Thats why Apache will scale better currently, because its threading.
> 
> Hm, I haven't tried Apache yet but would that be a huge benefit compared
> to a setup using nbproc > 1?

No :) Your CPU can only give so much.

Regards,
Nenad



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Aleksandar Lazic

Hi.

On 17-03-2016 11:51, Gary Barrueto wrote:

Hi.

On Mar 16, 2016 10:06 PM, "Willy Tarreau" wrote:

Here I don't know. TLS handshakes are one large part of what made me think
that we must go multi-threaded instead of multi-process over the long term,
just because I want to be able to pin some tasks to some CPUs. I.e. when TLS
says "handshake needed", we want to be able to migrate the task to another
CPU to avoid the huge latency imposed on all other processing (e.g. 7ms in
your case).


While that would help a single server, how about when dealing with multi
servers + anycast: has there been any thought about sharing the ssl/tls
session cache between servers? Like how apache can use memcache to store
its cache, or how cloudflare used/patched openresty to do the same
recently.


I have asked this as well, but I think the answer is not that easy.

As Willy has written above, there would need to be a hook in the ssl-handshake
flow to be able to fetch external data.

Maybe something similar to the peers handling for sessions.

Even nginx only has a shared session store between processes, not servers:

http://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_session_cache

Maybe the session tickets concept could help, but I haven't dug too deep
into this topic.

http://tools.ietf.org/html/rfc5077
http://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_session_tickets

BR Aleks



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Christian Ruppert

Hi Aleks,

On 2016-03-16 15:57, Aleksandar Lazic wrote:

Hi.

On 16-03-2016 15:17, Christian Ruppert wrote:

Hi,

this is rather HAProxy-unrelated, so more of a general problem, but anyway...
I did some tests with SSL vs. non-SSL performance and I wanted to share my
results with you guys, but I am also trying to solve the actual problem.

So here is what I did:


[snipp]


A test without SSL, using "ab":
# ab -k -n 5000 -c 250 http://127.0.0.1:65410/


[snipp]

That's much worse than I expected it to be. ~144 requests per second instead
of 42*k*. That's more than a 99% performance drop. The cipher is moderate but
secure (for now), and I doubt that changing the cipher will help a lot here.
nginx and HAProxy performance is almost equal, so it's not a problem with the
server software. One could increase nbproc (at least in my case it only helped
up to nbproc 4, Xeon E3-1281 v3), but that's just a rather minor enhancement.
With those ~144 r/s you're basically lost when being under attack. How did you
guys solve this problem? External SSL offloading, using hardware crypto foo,
special cipher/settings tuning, simply *much* more hardware, or not at all yet?


You run both client & server on the same machine.

Maybe you are running out of entropy?
Are you able to run the client on a different machine?

BR Aleks


I also ran 2 parallel "ab" runs on two separate machines against a third one.
The requests per second were around ~70 r/s per host instead of ~140, so
I doubt it's an entropy problem.


--
Regards,
Christian Ruppert



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Christian Ruppert

On 2016-03-18 11:31, Christian Ruppert wrote:

Hi Willy,

On 2016-03-17 06:05, Willy Tarreau wrote:

Hi Christian,

On Wed, Mar 16, 2016 at 05:25:53PM +0100, Christian Ruppert wrote:

Hi Lukas,

On 2016-03-16 16:53, Lukas Tribus wrote:
>>The "option httpclose" was on purpose. Also the client could (during a
>>attack) simply do the same and achieve the same result. I don't think
>>that will help in such cases.
>
>So what you are actually and purposely benchmarking are SSL/TLS
>handshakes, because thats the bottleneck you are trying to improve.

You're right, yes.


You also found the hard way why it's important to share TLS secrets
between multiple front nodes, or to properly distribute the load to
avoid handshakes as much as possible.


I also just stumbled over this:
https://software.intel.com/en-us/articles/accelerating-ssl-load-balancers-with-intel-xeon-v3-processors
Might be interesting for others as well. So ECC and
multi-threaded/process is the way to go, it seems.




>Both nginx [1] and haproxy currently do not support offloading TLS
>handshakes to another thread or dedicating a thread to a TLS session.
>
>Thats why Apache will scale better currently, because its threading.

Hm, I haven't tried Apache yet but would that be a huge benefit
compared to a setup using nbproc > 1?


Here I don't know. TLS handshakes are one large part of what made me think
that we must go multi-threaded instead of multi-process over the long term,
just because I want to be able to pin some tasks to some CPUs. I.e. when TLS
says "handshake needed", we want to be able to migrate the task to another
CPU to avoid the huge latency imposed on all other processing (e.g. 7ms in
your case).

But note that people who have to deal with heavy SSL traffic actually
deal with this in haproxy by using two levels of processing, one for
HTTP and one for TLS. It means that only TLS traffic can be hurt by
handshakes:

   listen secure
       bind :443 ssl crt foo.pem process 2-32
       mode tcp
       server clear 127.0.0.1:80

   frontend clear
       bind :80 process 1
       mode http
       use_backend my_wonderful_server_farm

   ...



Your example would be better and easier, but we need the client IP for
ACLs and so forth, which wouldn't work in TCP mode, and there would be
no XFF header. So we're duplicating stuff in the frontends but using one
backend.



Hm, I'm not sure how that would perform with "server ... send-proxy[-v2]" in
the listen block and "bind :anotherport accept-proxy" in the frontend
block, additionally. That is, duplicating a lot of ACLs and so forth versus
using your example (simplified) with the PROXY protocol.



And before the Linux kernel reintroduced support for SO_REUSEPORT (in
3.9), it was common to have a single process load-balance incoming
TCP connections to all the other TLS processes. It then makes it possible
to choose the LB algorithm you want, including a source hash so that the same
attacker can only affect one process, for example.

Willy


--
Regards,
Christian Ruppert



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Aleksandar Lazic

Hi Nenad

On 17-03-2016 19:27, Nenad Merdanovic wrote:

Hello Aleksandar

On 3/17/2016 6:00 PM, Aleksandar Lazic wrote:

Okay, I'm now lost 8-O

Please, can anyone help me understand how the flow works.

1st Request
client -> ssl handshake -> haproxy server 1 (tls ticket?!)

2nd Request
Same client -> ssl handshake -> haproxy server 2 (tls ticket?!)



I'll just oversimplify everything :) The TLS ticket is maintained on the
client side and contains an encrypted session state which can be used to
resume a TLS session. The keys for decrypting this information are
distributed to all HAproxy servers so that any server might resume the
session. What you are specifying in the tls-ticket-keys file are the
encryption (and decryption) keys.


Hm, I'm not sure if I understand this right.
I will try to repeat it just to check whether I have understood it right.

http://cbonte.github.io/haproxy-dconv/configuration-1.6.html#5.1-tls-ticket-keys

#
frontend ssl
  bind :443 ssl tls-ticket-keys /myramdisk/ticket-file  <= this is a local file, right?
  stick-table type binary len ?? 10m expire 12h store ??? if { req.ssl_st_ext 1 }
##

Could this pseudo-conf snippet work?
What I don't understand is HOW the TLS ticket keys are 'distributed to all
HAproxy servers' with the current haproxy options.


Thanks for the patience.

BR Aleks



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Cyril Bonté
Hi all,

replying really quickly from a webmail, sorry for the lack of details

> [...]
> I also ran 2 parallel "ab" runs on two separate machines against a third
> one. The requests per second were around ~70 r/s per host instead of ~140,
> so I doubt it's an entropy problem.

The issue is in your haproxy configuration: you disabled HTTP keep-alive by
using "option httpclose", so you are benchmarking SSL handshakes, and your
values are not unusual in that case.
Please try with something else, like "option http-server-close".



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Pavlos Parissis


On 17/03/2016 12:26 μμ, Nenad Merdanovic wrote:
> Hello Gary,
> 
> On 3/17/2016 11:51 AM, Gary Barrueto wrote:
>>
>> While that would help a single server, how about when dealing with multi
>> servers + anycast: has there been any thought about sharing the ssl/tls
>> session cache between servers? Like how apache can use memcache to store
>> its cache or how cloudflare used/patched openresty to do the same recently.
>>
> 
> HAproxy can load TLS ticket keys from file, which can be distributed by
> a central server. That way the information is kept on the client side
> and can be reused by any server in the anycasted pool.
> 
> https://cbonte.github.io/haproxy-dconv/configuration-1.6.html#5.1-tls-ticket-keys
> 

I am working (not very actively) on a solution which utilizes this.
It will use www.vaultproject.io as a central store, a generating engine
and a pull/push mechanism in place.

But, the current version of HAProxy doesn't support different TLS
tickets per frontend, which I would like to use.

Cheers,
Pavlos





Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Gary Barrueto
Hi.

On Mar 16, 2016 10:06 PM, "Willy Tarreau" wrote:
>
> Here I don't know. TLS handshakes are one large part of what made me think
> that we must go multi-threaded instead of multi-process over the long term,
> just because I want to be able to pin some tasks to some CPUs. I.e. when TLS
> says "handshake needed", we want to be able to migrate the task to another
> CPU to avoid the huge latency imposed on all other processing (e.g. 7ms in
> your case).
>

While that would help a single server, how about when dealing with multi
servers + anycast: has there been any thought about sharing the ssl/tls
session cache between servers? Like how apache can use memcache to store
its cache or how cloudflare used/patched openresty to do the same recently.

> But note that people who have to deal with heavy SSL traffic actually
> deal with this in haproxy by using two levels of processing, one for
> HTTP and one for TLS. It means that only TLS traffic can be hurt by
> handshakes:
>
>    listen secure
>        bind :443 ssl crt foo.pem process 2-32
>        mode tcp
>        server clear 127.0.0.1:80
>
>    frontend clear
>        bind :80 process 1
>        mode http
>        use_backend my_wonderful_server_farm
>
>    ...
>
> And before the Linux kernel reintroduced support for SO_REUSEPORT (in
> 3.9), it was common to have a single process load-balance incoming
> TCP connections to all the other TLS processes. It then makes it possible
> to choose the LB algorithm you want, including a source hash so that the same
> attacker can only affect one process, for example.
>
> Willy
>
>


Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Janusz Dziemidowicz
2016-03-17 20:48 GMT+01:00 Aleksandar Lazic :
> Hm, I'm not sure if I understand this right.
> I will try to repeat it just to check whether I have understood it right.
>
> http://cbonte.github.io/haproxy-dconv/configuration-1.6.html#5.1-tls-ticket-keys
>
> #
> frontend ssl
>   bind :443 ssl tls-ticket-keys /myramdisk/ticket-file  <= this is a local file, right?
>   stick-table type binary len ?? 10m expire 12h store ??? if { req.ssl_st_ext 1 }
> ##
>
> Could this pseudo-conf snippet work?
> What I don't understand is HOW the TLS ticket keys are 'distributed to all
> HAproxy servers' with the current haproxy options.

If this local file is the same on two servers, then those two servers
can both resume the same session. Session state is stored on the
client (encrypted with the contents of "this local file"). There is no
need to distribute anything apart from this local file. The downside is
that not all clients support this.
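A minimal sketch of how such a key file could be produced and shared (the file
path is illustrative; each line is a base64-encoded 48-byte secret):

   # the file must hold at least 3 keys; older ones remain valid for decryption
   for i in 1 2 3; do openssl rand -base64 48; done > tls-ticket-keys
   # ship the identical file to every HAProxy node and reference it per bind line:
   #   bind :443 ssl crt site.pem tls-ticket-keys /etc/haproxy/tls-ticket-keys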

-- 
Janusz Dziemidowicz



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Christian Ruppert

On 2016-03-17 00:14, Nenad Merdanovic wrote:

Hello,

On 3/16/2016 6:25 PM, Christian Ruppert wrote:


Some customers may require 4096-bit keys as they seem to be considered much
more decent than 2048 nowadays. So you may be limited here. A test with a
2048-bit cert gives me around ~770 requests per second, a test with a
256-bit ECC cert around 1600 requests per second. That's still more than
96% difference compared to non-SSL, way better than the 4096-bit RSA one
though. I also have to make sure that even some older clients can
connect to the site, so I have to take a closer look at the ECC certs
and ciphers then. ECC is definitely an enhancement, if there's no
compatibility problem.


HAproxy can, in its latest versions, serve both ECC and RSA certificates
depending on client support. In a fairly large environment I have found
that about 85% of clients are ECC-capable. Also, look at configuring TLS
ticket keys and rotating them properly, as well as using keep-alive.

The difference in performance you are observing is fairly normal. You
can measure the raw SSL performance of your CPU using 'openssl speed' to see
how many operations/s you get without the HAproxy overhead; the numbers
should be very close.

Another thing you might consider is switching to OpenSSL 1.0.2 because
you have a v3 Intel Xeon which has AVX2 instruction support and will
benefit from improvements done in 1.0.2.



That's indeed a noticeable performance increase on RSA but I couldn't 
notice any difference for ECC.



In an SSL-heavy environment, we use servers with a lot of cores, albeit
slower per core, and with a good DDoS-protection ruleset we haven't
experienced any attacks that weren't effectively mitigated.

With a properly configured SSL stack in HAproxy (all of the things
mentioned above), the CPU usage difference is almost negligible. And to
be honest, there are not that many SSL-exhaustion attacks.



For now perhaps, but more and more sites/customers want 100% HTTPS; whether
it's just cool or indeed useful doesn't matter. And I am somewhat scared
that one can take down the site with very few requests just by disabling
keep-alive and other features on the client side.







Both nginx [1] and haproxy currently do not support offloading TLS
handshakes to another thread or dedicating a thread to a TLS session.

That's why Apache will currently scale better, because of its threading.


Hm, I haven't tried Apache yet but would that be a huge benefit compared
to a setup using nbproc > 1?


No :) Your CPU can only give so much.

Regards,
Nenad


--
Regards,
Christian Ruppert



RE: General SSL vs. non-SSL Performance

2016-03-19 Thread Christian Ruppert

On 2016-03-16 17:56, Lukas Tribus wrote:

Some customers may require 4096-bit keys as they seem to be considered much
more decent than 2048 nowadays.


I've not come across any recommendations pointing in that direction; in
fact, 2048-bit RSA keys are supposed to be safe for commercial use until
2030.


I don't think this is a real requirement from knowledgeable people, to
be frank.


That's almost always the case when talking about requirements.



In any case it doesn't make much sense, because if your customer really has
such strict requirements you may as well switch to ECC, since you won't
be able to support old clients anyway.



I just compared the RSA one against ECC on ssllabs and it seems there's 
no difference on the browser/device compatibility topic. So we should 
indeed consider ECC keys.






That's still more than 96% difference compared to non-SSL


Well, you are basically benchmarking your stack with a TLS-specific
denial-of-service attack. Of course the same attack without TLS won't
have a noticeable effect on the stack. So that number is quite obviously
high.



Yeah, but to me it looks like almost everybody else will be affected as
well when migrating to 100% HTTPS. A few hosts could easily take down
the site by disabling keep-alive and so on on the client while doing
some "valid" requests. So it's harder to notice compared to HTTP only,
because they can use far fewer requests, connections, etc.






That's why Apache will currently scale better, because of its threading.


Hm, I haven't tried Apache yet but would that be a huge benefit compared
to a setup using nbproc > 1?


I haven't tried it either, but yes, I would assume so. It also doesn't block
other connections while handshaking new ones.




Regards,

Lukas


--
Regards,
Christian Ruppert



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Nenad Merdanovic
Hello Aleksandar

On 3/17/2016 6:00 PM, Aleksandar Lazic wrote:
> Okay, I'm now lost 8-O
> 
> Please, can anyone help me understand how the flow works.
> 
> 1st Request
> client -> ssl handshake -> haproxy server 1 (tls ticket?!)
> 
> 2nd Request
> Same client -> ssl handshake -> haproxy server 2 (tls ticket?!)
> 

I'll just oversimplify everything :) The TLS ticket is maintained on the
client side and contains an encrypted session state which can be used to
resume a TLS session. The keys for decrypting this information are
distributed to all HAproxy servers so that any server might resume the
session. What you are specifying in the tls-ticket-keys file are the
encryption (and decryption) keys.

Regards,
Nenad



Re: General SSL vs. non-SSL Performance

2016-03-19 Thread Willy Tarreau
On Fri, Mar 18, 2016 at 03:04:43PM +0100, Dennis Jacobfeuerborn wrote:
> > You don't need to, just use the proxy protocol:
> > 
> >    listen secure
> >        bind :443 ssl crt foo.pem process 2-32
> >        mode tcp
> >        server clear 127.0.0.1:81 send-proxy-v2
> > 
> >    frontend clear
> >        bind 127.0.0.1:81 accept-proxy process 1
> >        bind :80 process 1
> >        mode http
> >        use_backend my_wonderful_server_farm
> > 
> > Also, if you have one backend with all frontends bound to many processes,
> > then all your backends run on these processes, which makes it harder to
> > enforce maxconn limitations or to share stick-tables. That's why it's
> > much better to only move SSL out of the regular path. Of course if you
> > need to pass extra info, you'll have to enable HTTP in the frontend.
> 
> Doesn't this setup use a lot of additional sockets for the
> send/accept-proxy communication?

It only doubles the total number of sockets on the system, but the
main process still has the same number. In fact, even slightly less
since its client is very close and consumes data much faster.

> I'm using abstract namespace sockets to handle this kind of forwarding:
> 
> listen secure
>  bind :443 ssl crt foo.pem process 2-32
>  mode tcp
>  server clear abns@fclear send-proxy-v2
> 
> frontend clear
>  bind abns@fclear accept-proxy process 1
>  bind :80 process 1
>  mode http
>  use_backend my_wonderful_server_farm
> 
> Is there any downside to use abstract namespace sockets like this?

In general it's even better than TCP and significantly faster. The only
case I'm aware of where abstract sockets could be problematic is when
multiple processes are bound to them, because they are not resumable.
Upon a soft reload, if the new process fails to bind and tells the old
ones "ok I changed my mind, continue what you were doing", only one of
the old ones will be able to rebind. In the case above there's a single
listener so that's fine.

Willy




Re: General SSL vs. non-SSL Performance

2016-03-18 Thread Pavlos Parissis


On 17/03/2016 04:49 μμ, Nenad Merdanovic wrote:
> Hello Pavlos,
> 
> On 3/17/2016 4:45 PM, Pavlos Parissis wrote:
>> I am working (not very actively) on a solution which utilizes this.
>> It will use www.vaultproject.io as a central store, a generating engine
>> and a pull/push mechanism in place.
>>
>> But, the current version of HAProxy doesn't support different TLS
>> tickets per frontend, which I would like to use.
> 
> What do you mean? You can specify tls-ticket-keys per bind line.
>>

I *was* wrong, as I had completely forgotten that, and also that the socket
command accepts IDs:

set ssl tls-key <id> <tlskey>

I am sorry for spreading wrong information.
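For reference, a sketch of how a new key could be pushed at runtime over the
stats socket (assuming a socket configured with "stats socket
/var/run/haproxy.sock level admin"; the paths are illustrative):

   # generate a fresh 48-byte ticket key and append it to the running key set
   NEWKEY=$(openssl rand -base64 48)
   echo "set ssl tls-key /etc/haproxy/tls-ticket-keys $NEWKEY" | \
       socat stdio UNIX-CONNECT:/var/run/haproxy.sock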

Cheers,
Pavlos





Re: General SSL vs. non-SSL Performance

2016-03-18 Thread Nenad Merdanovic
Hello Gary,

On 3/17/2016 11:51 AM, Gary Barrueto wrote:
> 
> While that would help a single server, how about when dealing with multi
> servers + anycast: has there been any thought about sharing the ssl/tls
> session cache between servers? Like how apache can use memcache to store
> its cache or how cloudflare used/patched openresty to do the same recently.
> 

HAproxy can load TLS ticket keys from a file, which can be distributed by
a central server. That way the information is kept on the client side
and can be reused by any server in the anycasted pool.

https://cbonte.github.io/haproxy-dconv/configuration-1.6.html#5.1-tls-ticket-keys

Is there any reason why you'd still want to use a session cache?

Regards,
Nenad



RE: General SSL vs. non-SSL Performance

2016-03-18 Thread Christian Ruppert

Hi Lukas,

On 2016-03-16 16:53, Lukas Tribus wrote:

The "option httpclose" was on purpose. Also the client could (during a
attack) simply do the same and achieve the same result. I don't think
that will help in such cases.


So what you are actually and purposely benchmarking are SSL/TLS
handshakes, because thats the bottleneck you are trying to improve.


You're right, yes.



First of all, the selected cipher is very important, as are the certificate
and the RSA key size.

For optimal performance, you would drop your RSA certificate
and get an ECC cert. If that's not a possibility, then use 2048-bit
RSA certificates.


Your ab output suggests that the negotiated cipher is
ECDHE-RSA-AES128-GCM-SHA256 - which is fine for RSA certificates,
but your RSA certificate is 4096 bits long, which is where the performance
penalty comes from - use 2048-bit certificates or, better yet, ECC
certificates.

read: DO NOT USE RSA certificates longer than 2048 bits.


Some customers may require 4096-bit keys as they seem to be considered much
more decent than 2048 nowadays. So you may be limited here. A test with a
2048-bit cert gives me around ~770 requests per second, a test with a
256-bit ECC cert around 1600 requests per second. That's still more than
96% difference compared to non-SSL, way better than the 4096-bit RSA one
though. I also have to make sure that even some older clients can
connect to the site, so I have to take a closer look at the ECC certs
and ciphers then. ECC is definitely an enhancement, if there's no
compatibility problem.





Both nginx [1] and haproxy currently do not support offloading TLS
handshakes to another thread or dedicating a thread to a TLS session.

That's why Apache will currently scale better, because of its threading.


Hm, I haven't tried Apache yet but would that be a huge benefit compared 
to a setup using nbproc > 1?






Hope this helps,

Lukas



[1] https://twitter.com/ngx_vbart/status/611956593324916736


--
Regards,
Christian Ruppert



Re: General SSL vs. non-SSL Performance

2016-03-18 Thread Aleksandar Lazic

Hi.

On 16-03-2016 15:17, Christian Ruppert wrote:

Hi,

this is rather HAProxy-unrelated, so more of a general problem, but anyway...
I did some tests with SSL vs. non-SSL performance and I wanted to share my
results with you guys, but I am also trying to solve the actual problem.

So here is what I did:


[snipp]


A test without SSL, using "ab":
# ab -k -n 5000 -c 250 http://127.0.0.1:65410/


[snipp]

That's much worse than I expected it to be. ~144 requests per second instead
of 42*k*. That's more than a 99% performance drop. The cipher is moderate but
secure (for now), and I doubt that changing the cipher will help a lot here.
nginx and HAProxy performance is almost equal, so it's not a problem with the
server software. One could increase nbproc (at least in my case it only helped
up to nbproc 4, Xeon E3-1281 v3), but that's just a rather minor enhancement.
With those ~144 r/s you're basically lost when being under attack. How did you
guys solve this problem? External SSL offloading, using hardware crypto foo,
special cipher/settings tuning, simply *much* more hardware, or not at all yet?


You run both client & server on the same machine.

Maybe you are running out of entropy?
Are you able to run the client on a different machine?

BR Aleks



Re: General SSL vs. non-SSL Performance

2016-03-18 Thread Willy Tarreau
Hi Christian,

On Fri, Mar 18, 2016 at 11:31:57AM +0100, Christian Ruppert wrote:
> I also just stumbled over this:
> https://software.intel.com/en-us/articles/accelerating-ssl-load-balancers-with-intel-xeon-v3-processors
> Might be interesting for others as well. So ECC and multi-threaded/process
> is the way to go it seems.

Thanks for this one, I had seen it once and lost it! I'll add it to
the list of articles on the haproxy page not to lose it anymore!

> > But note that people who have to deal with heavy SSL traffic actually
> > deal with this in haproxy by using two levels of processing, one for
> > HTTP and one for TLS. It means that only TLS traffic can be hurt by
> > handshakes:
> >
> >   listen secure
> >       bind :443 ssl crt foo.pem process 2-32
> >       mode tcp
> >       server clear 127.0.0.1:80
> >
> >   frontend clear
> >       bind :80 process 1
> >       mode http
> >       use_backend my_wonderful_server_farm
> >
> >   ...
> >
> 
> Your example would be better and easier, but we need the client IP for ACLs
> and so forth, which wouldn't work in TCP mode, and there would be no XFF
> header. So we're duplicating stuff in the frontends but using one backend.

You don't need to, just use the proxy protocol:

   listen secure
       bind :443 ssl crt foo.pem process 2-32
       mode tcp
       server clear 127.0.0.1:81 send-proxy-v2

   frontend clear
       bind 127.0.0.1:81 accept-proxy process 1
       bind :80 process 1
       mode http
       use_backend my_wonderful_server_farm

Also, if you have one backend with all frontends bound to many processes,
then all your backends run on these processes, which makes it harder to
enforce maxconn limitations or to share stick-tables. That's why it's
much better to only move SSL out of the regular path. Of course if you
need to pass extra info, you'll have to enable HTTP in the frontend.
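As a small sketch extending the clear frontend above (the ACL name and network
are illustrative): since accept-proxy restores the original client address as
'src', both the XFF header and src-based ACLs keep working:

   frontend clear
       bind 127.0.0.1:81 accept-proxy process 1
       bind :80 process 1
       mode http
       option forwardfor                    # XFF now carries the real client IP
       acl internal_net src 10.0.0.0/8      # src ACLs also see the real client IP
       use_backend my_wonderful_server_farm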

Regards,
Willy