Re: Product Info

2019-11-06 Thread Lucas Rolff
I think the point Willy tried to make is that it should be handled the same way 
regardless of being a security patch or not. All fixes are important - so see 
them as "security" fixes for bugs if you like.

On 06/11/2019, 10.04, "apcoeproductnotificati...@wellsfargo.com" 
 wrote:

Hi Willy,

Thanks for the info, but honestly I am not focusing only on security fixes. I 
need confirmation whether 2.0.8 is a security patch or a discretionary patch, 
so that I can work on it accordingly on the basis of the patch type.

Regards,
Anurag


-Original Message-
From: Willy Tarreau  
Sent: Wednesday, November 6, 2019 2:15 PM
To: APCoE Product Notifications 
Cc: xro...@gmail.com; haproxy@formilux.org; Na, Anurag 

Subject: Re: Product Info

On Wed, Nov 06, 2019 at 08:09:55AM +, 
apcoeproductnotificati...@wellsfargo.com wrote:
> Hi Rob/Thomas,
> Good day!!
> 
> Thanks for the update, so as per the link the current patch is 2.0.8 
> released on 23-10-2019, request you to please confirm whether this 
> patch is also a security patch and fixing any vulnerability (please 
> provide CVE if available) or not as it has one major bug fix in the 
> release notes.

Well, it was marked as a security fix since it was considered as such by the 
reporter, even though it requires you to use a vulnerable server and to 
purposely write a bogus configuration. So my personal opinion is that it's 
very minor compared to all the issues we fix on a daily basis.

In addition, please note that ALL FIXES ARE IMPORTANT and that if you're 
trying to only pick fixes explicitly marked as security, you'll end up with the 
most bogus load balancer on earth, and you'd rather not do this at all if you 
care for your site's availability.

Focusing on CVEs only is part of what Linus Torvalds calls the "security 
circus", and I fully agree with him on that: people who focus on CVEs only 
drop bugs that can be very harmful for production, and instead pick irrelevant 
stuff just because it carries a "security" sticker. Also please have a look at 
this presentation by GregKH explaining the ridiculous situation we've reached 
with CVEs nowadays:


https://kernel-recipes.org/en/2019/talks/cves-are-dead-long-live-the-cve/

In short, if you're wondering what patch to pick, you WILL eventually cause 
some disaster on your production that only YOU will be responsible for, by 
having deliberately rejected important fixes. You'd rather rely on up-to-date 
releases, either built from sources yourself, or from distro maintainers if 
you prefer pre-built packages. The project maintainers devote a lot of time to 
maintaining stable branches containing only fixes, precisely so that nobody 
has to duplicate this boring and dangerous job.
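For illustration, staying current can be as simple as tracking one of the 
stable branches published on haproxy.org from source; a minimal sketch, 
assuming a typical Linux build host (adjust the TARGET/USE_* make flags to 
your platform):

    $ git clone https://git.haproxy.org/git/haproxy-2.0.git
    $ cd haproxy-2.0
    $ make -j$(nproc) TARGET=linux-glibc USE_OPENSSL=1
    $ ./haproxy -v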

Note that if you fear regressions, that's normal; nobody likes to face them. 
In this case, just wait one week or even one month for others to deploy a new 
version before you do, and you'll know whether you're taking any risk. 
Everyone does this depending on the criticality. What is certain is that by 
not updating you're taking the risk of hitting any of the hundreds of bugs 
that are known and fixed upstream.

Willy





Re: Using haproxy together with NFS

2018-08-02 Thread Lucas Rolff
I indeed removed the send-proxy - then I had to put the IP of haproxy in the 
NFS exports file instead to be able to mount the share (which makes sense from 
an NFS perspective).

Making the NFS server support the proxy protocol isn't something I think will 
happen - I rely on the upstream packages (CentOS 7 packages in this case).

And as for transparency mode - relying on all traffic being routed via haproxy 
won't be a possibility in this case - so I guess I have to drop my wish for 
haproxy + NFS here. I'd like something fairly standard, without too many 
modifications to the current NFS infrastructure (since that would introduce 
more complexity).

Thanks for your replies both of you!

Best Regards,

On 02/08/2018, 18.09, "Willy Tarreau"  wrote:

On Thu, Aug 02, 2018 at 04:05:24AM +0000, Lucas Rolff wrote:
> Hi michael,
> 
> Without the send-proxy, the client IP in the export would have to be the
> haproxy server in that case right?

That's it. But Michael is absolutely right, your NFS server doesn't support
the proxy protocol, and the lines it emits below indicate it :

  Aug 01 21:44:44 nfs-server-f8209dc4-a1a6-4baf-86fa-eba0b0254bc9 kernel: RPC: fragment too large: 1347571544
  Aug 01 21:44:44 nfs-server-f8209dc4-a1a6-4baf-86fa-eba0b0254bc9 kernel: RPC: fragment too large: 1347571544
  Aug 01 21:44:44 nfs-server-f8209dc4-a1a6-4baf-86fa-eba0b0254bc9 kernel: RPC: fragment too large: 1347571544
  Aug 01 21:44:45 nfs-server-f8209dc4-a1a6-4baf-86fa-eba0b0254bc9 kernel: RPC: fragment too large: 1347571544

This fragment size (1347571544) is "PROX" encoded in big endian - the first
4 chars of the proxy protocol header :-)
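This is easy to verify from a shell - the four ASCII bytes, read as a single
32-bit big-endian integer, give exactly that number:

    $ printf 'PROX' | od -An -tx1
     50 52 4f 58
    $ printf '%d\n' 0x50524F58
    1347571544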

> The issue then is that all clients with access to haproxy can suddenly
> mount all shares in NFS, which I would like to prevent

Maybe you can modify your NFS server to support the proxy protocol, that
could possibly make sense for your use case ? Otherwise on Linux you may
be able to configure haproxy to work in transparent mode using "source
0.0.0.0 usesrc clientip" but beware that it requires some specific iptables
rules to divert the traffic and send it back to haproxy. It will also
require that all your NFS servers route the clients via haproxy for the
response traffic. This is not always very convenient.
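For reference, a minimal sketch of that transparent setup, assuming haproxy
is built with TPROXY support and runs with the needed privileges:

    backend nfs_backend1
        source 0.0.0.0 usesrc clientip
        server nfs1 217.xx.xx.xx:2049

    # on the haproxy host, divert the return traffic back to haproxy:
    iptables -t mangle -N DIVERT
    iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
    iptables -t mangle -A DIVERT -j MARK --set-mark 1
    iptables -t mangle -A DIVERT -j ACCEPT
    ip rule add fwmark 1 lookup 100
    ip route add local 0.0.0.0/0 dev lo table 100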

Regards,
Willy




Re: Using haproxy together with NFS

2018-08-01 Thread Lucas Rolff
Hi michael,

Without the send-proxy, the client IP in the export would have to be the 
haproxy server in that case right?

The issue then is that all clients with access to haproxy can suddenly mount 
all shares in NFS, which I would like to prevent.

There are still different shares that different servers need access to.

I'll now try the sample config from the link above! Thanks!


From: Michael Ezzell 
Sent: Thursday, August 2, 2018 2:38:06 AM
To: Lucas Rolff
Cc: HAproxy Mailing Lists
Subject: Re: Using haproxy together with NFS



On Wed, Aug 1, 2018, 16:00 Lucas Rolff <lu...@lucasrolff.com> wrote:

I use the “send-proxy” to let the NFS Server see the actual source IP, instead 
of the haproxy machine IP.
You'll probably need to remove that.  Unless the destination service 
explicitly supports the Proxy Protocol (in which case, by definition, it must 
not process connections where the protocol's preamble is *absent* from the 
stream), this would just look like corrupt data.  This option doesn't actually 
change the source address.
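For illustration, a PROXY protocol v1 preamble is a single text line that gets 
prepended ahead of the application data, e.g. (addresses hypothetical):

    PROXY TCP4 203.0.113.7 203.0.113.10 56324 2049\r\n

A server that doesn't understand it sees those bytes as part of the first 
request - which is exactly what the NFS "fragment too large" messages show.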

HAProxy in TCP mode should work fine with NFS -- at least, it does with NFS4.1 
as implemented in Amazon Elastic File System -- which is the only version I've 
tested against.

https://serverfault.com/a/799213/153161




Using haproxy together with NFS

2018-08-01 Thread Lucas Rolff
Hi guys,

I’ve been playing around today with two NFS servers (each on their own storage 
array), synced by Unison to provide a bit higher uptime.

To allow NFS clients to use a single IP, I’ve configured an haproxy install 
(one now, two when in prod), where I want to talk to the NFS servers in TCP 
mode. My idea is that all traffic is directed based on source-IP balancing, so 
the traffic will be split roughly 50/50 between the NFS servers.

My question is whether anyone has actually ever got a setup like this to work. 
I’m using NFS 4.0, and whenever I try to mount the NFS share on the client, it 
does communicate with haproxy, and I do see traffic on the NFS server itself, 
meaning the communication seems to work.

The issue I’m facing, is that the mounting will actually never complete due to 
some weird behavior when I go through haproxy:

Aug 01 21:44:44 nfs-server-f8209dc4-a1a6-4baf-86fa-eba0b0254bc9 kernel: RPC: fragment too large: 1347571544
Aug 01 21:44:44 nfs-server-f8209dc4-a1a6-4baf-86fa-eba0b0254bc9 kernel: RPC: fragment too large: 1347571544
Aug 01 21:44:44 nfs-server-f8209dc4-a1a6-4baf-86fa-eba0b0254bc9 kernel: RPC: fragment too large: 1347571544
Aug 01 21:44:45 nfs-server-f8209dc4-a1a6-4baf-86fa-eba0b0254bc9 kernel: RPC: fragment too large: 1347571544

It will just keep giving this “fragment too large” error.
If I bypass haproxy it works completely fine, so I know the NFS Server is 
configured correctly for the client to connect.

My haproxy configuration looks like this:

global
  log 127.0.0.1 local1 debug
  nbproc 4
  user haproxy
  group haproxy
  daemon
  chroot /var/lib/haproxy
  pidfile /var/run/haproxy.pid

defaults
mode tcp
log global
option tcplog
timeout client 1m
timeout server 1m
timeout connect 10s
balance source

frontend nfs-in1
bind *:2049
use_backend nfs_backend1
frontend nfs-in2
bind *:111
use_backend nfs_backend2
frontend nfs-in3
bind *:46716
use_backend nfs_backend3
frontend nfs-in4
bind *:36856
use_backend nfs_backend4

backend nfs_backend1
  server nfs1 217.xx.xx.xx:2049 send-proxy
backend nfs_backend2
  server nfs1 217.xx.xx.xx:111 send-proxy
backend nfs_backend3
  server nfs1 217.xx.xx.xx:46716 send-proxy
backend nfs_backend4
  server nfs1 217.xx.xx.xx:36856 send-proxy

I use the “send-proxy” to let the NFS Server see the actual source IP, instead 
of the haproxy machine IP.

If anyone has any idea what can cause the “fragment too large” when going via 
haproxy, or has an actual working haproxy config for NFS 4.0 or 4.1 traffic – 
then please let me know!

Best Regards,
Lucas Rolff


Re: RHEL distribution still uses HAProxy 1.5

2018-05-01 Thread Lucas Rolff
Well, RHEL is set to provide non-breaking software for the lifetime of a major 
release; that's something they've decided as an OS vendor. You're free to run 
your own version, just be aware that it's unsupported by RHEL.

RHEL isn't the solution if you want cutting-edge versions of software; they 
opt for shipping stable releases and maintaining them even if the official 
vendor decides to stop supporting them.
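If you do want 1.8 on RHEL, the software collections route mentioned in the 
ticket below would look roughly like this (a sketch; the collection name 
rh-haproxy18 is an assumption on my part):

    $ sudo yum install rh-haproxy18
    $ scl enable rh-haproxy18 -- haproxy -v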


From: Norman Branitsky 
Sent: Tuesday, May 1, 2018 4:26:52 PM
To: haproxy
Subject: RHEL distribution still uses HAProxy 1.5

We opened a ticket with RHEL Support to ask when they would upgrade to at least 
HAProxy 1.7.
This was their reply:

Most recent comment: On 2018-05-01 10:22:28, Patil, Ravindra commented:
"Hello

The reason 1.7 (as well as 1.6 and 1.8) is not in RHEL is due to backward 
compatibility. We can't simply rebase haproxy in RHEL to the latest release -- 
we would break existing deployments. This is a non-starter.

We have added haproxy 1.8 to RHSCL 3.1, which should be released soon. But it 
will never be in base RHEL.

There are no defects in 1.5. It is extremely stable. Far more stable than 1.7 
or 1.8.

Regards
Ravindra Patil
Red Hat Global Support

Comments?


Re: -Ws argument isn't document?

2018-02-03 Thread Lucas Rolff
haproxy --help:
-W master-worker mode.
-Ws master-worker mode with systemd notify support.
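In practice, -Ws pairs with a Type=notify unit so systemd knows when the 
master is ready; a minimal sketch, with assumed paths:

    [Unit]
    Description=HAProxy Load Balancer
    After=network.target

    [Service]
    Type=notify
    ExecStart=/usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
    ExecReload=/bin/kill -USR2 $MAINPID

    [Install]
    WantedBy=multi-user.target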

On 03/02/2018, 15.44, "Pavlos Parissis"  wrote:

Hi,

In contrib/systemd/haproxy.service.in we see -Ws used in ExecStart as it is 
the recommended way to
start haproxy under systemd:

 ExecStart=@SBINDIR@/haproxy -Ws -f $CONFIG -p $PIDFILE

But it isn't documented in doc/management.txt; only -W is mentioned, and I 
failed to find any references for '-s'.
I assume that '-s' adds certain functionality, as I find it in other 
argument combinations (-st and -sf).

Is my understanding wrong?

Cheers,
Pavlos





Re: Poll: haproxy 1.4 support ?

2018-01-02 Thread Lucas Rolff
I also vote for both 1.4 and 1.5 being marked as unmaintained; if vendors do 
rely on old versions due to LTS, they tend to backport critical security and 
bug fixes anyway.


From: Pavlos Parissis 
Sent: Tuesday, January 2, 2018 4:34:39 PM
To: Jonathan Matthews; haproxy
Subject: Re: Poll: haproxy 1.4 support ?

On 02/01/2018 04:23 μμ, Jonathan Matthews wrote:
> On 2 January 2018 at 15:12, Willy Tarreau  wrote:
>> So please simply voice in. Just a few "please keep it alive" will be
>> enough to convince me, otherwise I'll mark it unmaintained.
>
> I don't use 1.4, but I do have a small reason to say please *do* mark
> it as unmaintained.
>
> The sustainability of haproxy is linked to the amount of work you (and
> a /relatively/ small set of people) both have to do and want to do.
> I would very much like it to continue happily, so I would vote to
> reduce your mental load and to mark 1.4 as unmaintained.
>

+1

BTW: I don't use/look at the haproxy 1.5 version either...

Cheers,
Pavlos



Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-30 Thread Lucas Rolff
I’ve tested the 1.8.3 build, and I can indeed confirm it works like a charm!

@Willy, thanks for the extensive time you spent on debugging and investigating 
this as well!

Best Regards,
Lucas Rolff
 



Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-29 Thread Lucas Rolff
<<<<<<<<<<<<<<<<<<
:authority: dashboard.domain.com
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) 
Gecko/20100101 Firefox/59.0
accept: text/css,*/*;q=0.1
accept-language: da,en-US;q=0.8,en;q=0.6,es;q=0.4,tr;q=0.2
accept-encoding: gzip, deflate, br
referer: https://dashboard.domain.com/stats/6
cookie: _ga=GA1.2.2085297229.1474098197
Wx1ZSI: XSRF-TOKEN=SECURE_TOKEN%3D
cookie: laravel_session=SECURE_SESSION%3D%3D
pragma: no-cache
cache-control: no-cache
#


So, this Wx1ZSI should normally be “cookie” – however it’s somehow been turned 
into garbage.

Repeated – now it’s s3U2JV – but still supposed to be “cookie”:

<<<<<<<<<<<<<<<<<<
:authority: dashboard.domain.com
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) 
Gecko/20100101 Firefox/59.0
accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
accept-language: da,en-US;q=0.8,en;q=0.6,es;q=0.4,tr;q=0.2
accept-encoding: gzip, deflate, br
referer: https://dashboard.domain.com/stats/1
cookie: _ga=GA1.2.2085297229.1474098197
s3U2JV: XSRF-TOKEN=SECURE_TOKEN%3D
cookie: laravel_session=SECURE_SESSION%3D%3D
upgrade-insecure-requests: 1
pragma: no-cache
cache-control: no-cache
#

It’s consistently the cookie header that fails.

Some repeated requests, all related to cookie, where the header field name became:
6InNEa
InVMdk

Best Regards,
Lucas Rolff


On 29/12/2017, 21.21, "Willy Tarreau" <w...@1wt.eu> wrote:

On Fri, Dec 29, 2017 at 06:56:36PM +, Lucas Rolff wrote:
> h2_make_h1_request:153
> h2_frt_decode_headers:2621
> h2_frt_decode_headers:2643
> 
> /* this can be any type of header */
> /* RFC7540#8.1.2: upper case not allowed in header field names */
> for (i = 0; i < list[idx].n.len; i++)
> if ((uint8_t)(list[idx].n.ptr[i] - 'A') < 'Z' - 'A')
> goto fail;
> 
> That's an interesting place to fail

OK I can propose the attached patch which will dump all the requests to
stderr, as they are received or extracted from the dynamic headers table.
The patch needs to be applied without the previous ones. This will look
like this :

  <<<<<<<<<<<<<<<<<<
  :authority: 127.0.0.1:4443
  user-agent: curl/7.57.0
  accept: */*
  >>>>>>>>>>>>>>>>>
  <<<<<<<<<<<<<<<<<<
  :authority: 127.0.0.1:4443
  user-agent: curl/7.57.0
  accept: */*
  aaa: AaA
  >>>>>>>>>>>>>>>>>

The '<<<' and '>>>' enclose a request. The final one will instead use "###"
to indicate that at least one bad char was received, or '!!!' to indicate
that another error was met. Please note that it will silently let the request
pass through so you need to check the output to see if these "###" happen.

Maybe we'll find a bug in the dynamic headers table causing some crap to
be returned. Or maybe we'll find that a given browser occasionally sends
a bad header.

Cheers,
willy




Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-29 Thread Lucas Rolff
h2_make_h1_request:153
h2_frt_decode_headers:2621
h2_frt_decode_headers:2643

/* this can be any type of header */
/* RFC7540#8.1.2: upper case not allowed in header field names */
for (i = 0; i < list[idx].n.len; i++)
if ((uint8_t)(list[idx].n.ptr[i] - 'A') < 'Z' - 'A')
goto fail;

That’s an interesting place to fail
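(For readers following along: the uint8_t cast in that check makes any byte 
below 'A' wrap around to a large value, so the single comparison flags 
uppercase ASCII name bytes - like the 'W' in the garbled "Wx1ZSI" header seen 
earlier. A standalone C sketch of the same test:)

    #include <stdint.h>
    #include <stdio.h>

    /* same comparison as the haproxy check quoted above: bytes below 'A'
     * wrap to large unsigned values, so they never compare lower */
    static int rejected(uint8_t c)
    {
        return (uint8_t)(c - 'A') < 'Z' - 'A';
    }

    int main(void)
    {
        printf("%d %d %d\n", rejected('W'), rejected('c'), rejected(':'));
        /* prints: 1 0 0 */
        return 0;
    }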

- Lucas R

On 29/12/2017, 19.36, "Willy Tarreau" <w...@1wt.eu> wrote:

On Fri, Dec 29, 2017 at 06:18:00PM +, Lucas Rolff wrote:
> I think you forgot to attach the patch

Grrr common mistake, sorry.

> I did try to add the continue patch for the "blacklisted" headers, and same
> result (this time it happened on the website request itself):

Thanks, I'll take a look once I'm home.

Willy




Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-29 Thread Lucas Rolff
I think you forgot to attach the patch

I did try to add the continue patch for the “blacklisted” headers, and same 
result (this time it happened on the website request itself):

POST Request to site:
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067

GET Request to site:
h2_frt_decode_headers:2621
h2_frt_decode_headers:2643

A bit of background info on why you see POST and GET:

Basically it’s a dropdown with automatic submit which then does a 302 redirect 
– so I post a bit of data, e.g. the following:

_token  TApEQSj4V3D3TZCUYauWoPan1mKhrk
date    12/2017
zone    1

The application will then do a redirect to a specific page based on that input 
data.



On 29/12/2017, 19.11, "Willy Tarreau" <w...@1wt.eu> wrote:

On Fri, Dec 29, 2017 at 06:02:15PM +0000, Lucas Rolff wrote:
> POST Request (to website):
> h2s_frt_make_resp_data:3180
> h2s_frt_make_resp_data:3067
> -
> GET Request (to website):
> h2s_frt_make_resp_data:3180
> h2s_frt_make_resp_data:3067
> -
> Get Request (app.css)
> h2_frt_decode_headers:2621
> h2_frt_decode_headers:2643

Excellent, it's this one :

/* OK now we have our header list in <list> */
outlen = h2_make_h1_request(list, bi_end(buf), try);

if (outlen < 0) {
h2c_error(h2c, H2_ERR_COMPRESSION_ERROR);
goto fail;
}

Now I'm starting to wonder whether it's true or not that the
Connection header is *never* sent... But it might also be that
something else violates the rules.

Would you want to retest with the extra patch attached ? It will do
the same with h2.c which is responsible for h2_make_h1_request() so
that we know better. And after this you could try again by applying
the patch I sent this morning which silently skips the connection
headers (replaces a goto fail with a continue). Do not hesitate to
ask me to redo it if you've lost it!

Cheers,
Willy




Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-29 Thread Lucas Rolff
Working page load (total of 4 requests), and we see 4x 3180|3067

POST Request (to website):
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
GET Request (to website):
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
GET Request (app.css)
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
GET Request (app.js)
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067

Not working page load:

POST Request (to website):
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
GET Request (to website):
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
Get Request (app.css)
h2_frt_decode_headers:2621
h2_frt_decode_headers:2643
-
Get Request (app.js)
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067

Best Regards,
Lucas Rolff

On 29/12/2017, 18.21, "Willy Tarreau" <w...@1wt.eu> wrote:

On Fri, Dec 29, 2017 at 04:48:13PM +0000, Lucas Rolff wrote:
> > If you're willing to run another test, I can prepare a debugging patch
> > which will try to report every single error path in the H2 and HPACK code
> > so that we can try to understand where the code was upset
> 
> I'd love to run another test or 10 - in the end, we'll all benefit from
> it (hopefully)

OK great, let's start with an easy one. This patch will output a line on
stderr for every function and line number where we go through a goto (most
goto in the code are error handling and most error handling is unrolled
using goto). For me it seldom prints :

   h2_process_demux:1759

When the client doesn't immediately send because it sees a partial request.
But with FF I'm not even seeing this one.

With a bit of luck you'll find one or a few lines that only happen when you
observe the problem, and the sequence will help us figure what code path
we're following.

If it doesn't work we'll try to do better.

Thanks!
Willy




Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-29 Thread Lucas Rolff
> If you're willing to run another test, I can prepare a debugging patch which 
> will try to report every single error path in the H2 and HPACK code so that 
> we can try to understand where the code was upset

I’d love to run another test or 10 – in the end, we’ll all benefit from it 
(hopefully)

Best Regards,
Lucas Rolff 



Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-29 Thread Lucas Rolff
In both Firefox and Chrome, 1.8.2 with the supplied patch seems to do the 
trick for my POST requests (I did about 300 POST requests in each browser with 
no failures).

Best Regards,

On 29/12/2017, 15.58, "Willy Tarreau"  wrote:

On Fri, Dec 29, 2017 at 03:42:30PM +0100, Willy Tarreau wrote:
> OK I managed to reproduce it with nghttp using --expect-continue to
> force it to leave a pause before sending the data. And indeed there
> the data are immediately followed by a shutdown. Getting closer...

So here's what I found : when dealing with request forwarding, we used
to let the close migrate from the client to the server with the last
block. And this happens only once we switch to fast forwarding, which
means that the last block from the request didn't fit in the buffer.
Thus it would randomly impact large uploads (though timing would often
protect them) and almost always impact small ones if sent in two parts
as we could produce.

The attached patch fixes it for me. Could you please give it a try ?

Thanks,
Willy




Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-29 Thread Lucas Rolff
> Lucas, can you check my previous mail and see if you can enable ignoring 
> client aborts in your backend, assuming you are using nginx?

I can confirm that ignoring client aborts in my backend using 
fastcgi_ignore_client_abort “resolves” the issue regarding POST requests.
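For reference, a minimal sketch of the nginx side (the socket path is taken 
from my logs; proxy_ignore_client_abort is the equivalent for proxy_pass 
backends):

    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_pass unix:/var/run/php-fpm/php5.sock;
        fastcgi_ignore_client_abort on;
    }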

Best Regards,
Lucas R

On 29/12/2017, 11.46, "lu...@ltri.eu on behalf of Lukas Tribus"  
wrote:

Hello,


On Fri, Dec 29, 2017 at 11:22 AM, Lukas Tribus  wrote:
> It's that:
> - when sending the POST request to the backend server, haproxy sends a
> FIN before the server responds
> - nginx doesn't like that and closes the request (you will see nginx
> error code 499 in nginx server logs)
> - as there is a race on the backend server between receiving the FIN
> and completing the response, this does not always happen
> - haproxy returns "400 Bad Request" to the client, although the
> request is fine and the response was empty (I consider this a bug)
>
>
> The feature on nginx is basically what we call abortonclose, and can
> be disabled by the following nginx directives (depending which backend
> modules is used):
> 
http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ignore_client_abort
> 
http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#fastcgi_ignore_client_abort
>
>
> Howto reproduce the haproxy behavior:
> - have a http backend pointing to nc
> - make a POST request
> - this is even reproducible with H1 clients, however H2 has to be
> enabled on haproxy otherwise it doesn't send the FIN (strangely
> enough)
>
>
> Does this make sense?
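A minimal sketch of that repro, with hypothetical addresses/ports (nc flags
vary between netcat variants):

    # terminal 1: a "backend" that accepts but never answers
    $ nc -l 8080

    # haproxy: an h2-enabled frontend whose backend points at 127.0.0.1:8080

    # terminal 2: a small POST through haproxy
    $ curl -k --http2 https://127.0.0.1/ -d 'bla=bla'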

The FIN behavior comes from a48c141f4 ("BUG/MAJOR: connection: refine
the situations where we don't send shutw()"), which also hit 1.8.2, so
that explains the change in behavior between 1.8.1 and 1.8.2.

Lucas, can you check my previous mail and see if you can enable
ignoring client aborts in your backend, assuming you are using nginx?



Thanks,
Lukas




Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-29 Thread Lucas Rolff
> Actually that's not the case and that may explain the situation. The machine 
> runs OpenSSL 1.0.1 so only NPN is used, ALPN isn't. I'll try with a static 
> build of openssl 1.0.2 to see if the ratio increases.

That might very well be the case; I know for sure that Chrome dropped support 
for NPN and requires 1.0.2 for http2 to work, and I suspect the same for 
recent Firefox versions (at least on Windows and Mac): they seem to only work 
with ALPN. I’ll check haproxy.org later to see if it works then, and see if I 
can replicate the issue in Firefox with GET requests being “lost” sometimes.

> That's interesting because if the client managed to display the error, it 
> means the stream was not reset and the connection not aborted. So in fact I 
> suspect a side effect of the work done to better support the abortonclose 
> option.

At least in 1.8.2 I see a huge amount of failed POST requests when http2 is 
enabled - 30%+, sometimes even a lot higher.

And those requests do give a 400 Bad Request, nothing “odd” other than the fact 
that haproxy believes the request got aborted by the client (CH).

>> Which is the case for 1.8.2 and latest master.
>> 
>> If I do the patch for 1.8.1 I still get BADREQ sometimes in haproxy with the 
>> "empty" GET requests.
> Just to be clear, do you mean you *only* get the BADREQ with GET (ie it never 
> happens with POST) or that you *also* get the BADREQ with GET ? I mean, given 
> that we're fixing bugs, I'm not interested in seeing if patches also work as 
> alternatives to bugs we've already fixed, but if you're observing 
> regressions, that's different.

1.8.1: Only <BADREQ> (CR) on GET requests occasionally
1.8.2 + master: Bad Request (CH) on POST/PUT 30%+ of the time, and <BADREQ> 
(CR) on GET requests occasionally

Best Regards,
Lucas Rolff

On 29/12/2017, 11.13, "Willy Tarreau" <w...@1wt.eu> wrote:

On Fri, Dec 29, 2017 at 08:46:18AM +, Lucas Rolff wrote:
> > Yep. For what it's worth, it's been enabled for about one month on 
haproxy.org and till now we didn't get any bad report, which is pretty 
encouraging.
> 
> Can I ask where? The negotiated protocol I get on https://haproxy.org/ is
> http/1.1 in both Google Chrome and Firefox as an example.

That's getting funny, as I'm having H2 here on firefox, as can be seen
in the attached capture :-)

Looking at the server's logs of the last hour, I'm clearly seeing *some*
H2 traffic, about 5.5% of the HTTPS traffic : 

  -bash-4.2$ fgrep 'public~' h.log | grep www.haproxy.org | wc -l
  1804
  -bash-4.2$ fgrep 'public~' h.log | grep www.haproxy.org |  grep HTTP/2 | 
wc -l
  108

I think this ratio is rather low and there are bots. If I focus only on the 
home
page, it's slightly better but still low :

  -bash-4.2$ fgrep 'public~' h.log | fgrep 'www.haproxy.org/ ' |  grep 
HTTP/2 | wc -l
  21
  -bash-4.2$ fgrep 'public~' h.log | fgrep 'www.haproxy.org/ ' | wc -l
  298

And the fact that your browser doesn't negotiate it certainly implies that a
number of other browsers do not either.

> If I use curl, I can see it has ALPN enabled with http2 - however, the
> mentioned browsers don't seem to actually establish an http2 connection,
> but rather a 1.1 connection.

Actually that's not the case and that may explain the situation. The machine
runs OpenSSL 1.0.1 so only NPN is used, ALPN isn't. I'll try with a static
build of openssl 1.0.2 to see if the ratio increases.

> For POST requests, it's not resolved with the patch, what I did find when
> getting the 400 Bad Request on POST, it sometimes actually arrive at the
> backend:
> 
> So haproxy:
> 
> Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.800] https_frontend~ cdn-backend/mycdn 0/0/1/-1/3 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1"
> Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.806] https_frontend~ cdn-backend/mycdn 0/0/0/-1/2 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1"

We definitely need to understand what's causing this. I'm currently thinking
about how to add more information to improve observability inside the mux to
match what can currently be done outside (ie: we don't have the equivalent of
the "show errors" output nor error counters inside the mux).

> https://snaps.hcdn.dk/AONBkWojDFwuErMjxF66Mg4VxJl9jz4KUM61jPpgcL.png -
> however the request is ~ 68-69 ms, and that browser is Google Chrome, so in
> that regard, the 400 Bad Request for POST/PUT also happens in Google Chrome.

That's interesting because if the client managed to display the error,
it means the stream was not reset and the connection not aborted.

Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-29 Thread Lucas Rolff
> Yep. For what it's worth, it's been enabled for about one month on 
> haproxy.org and till now we didn't get any bad report, which is pretty 
> encouraging.

Can I ask where? The negotiated protocol I get on https://haproxy.org/ is 
http/1.1 in both Google Chrome and Firefox, as an example.

If I use curl, I can see it has ALPN enabled with http2 – however, the 
mentioned browsers don't seem to actually establish an http2 connection, but 
rather a 1.1 connection.

> If this is met in production it definitely is a problem that we have to 
> address. Could you please try the attached patch to see if it fixes the issue 
> for you ? If so, I would at least like that we can keep some statistics on 
> it, and maybe even condition it.

For POST requests, it’s not resolved with the patch; what I did find when 
getting the 400 Bad Request on POST is that the request sometimes actually 
arrives at the backend:

So haproxy:

Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.800] https_frontend~ cdn-backend/mycdn 0/0/1/-1/3 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1"
Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.806] https_frontend~ cdn-backend/mycdn 0/0/0/-1/2 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1"

Backend:

2017/12/29 08:01:36 [info] 7701#7701: *1191 epoll_wait() reported that client prematurely closed connection, so upstream connection is closed too while sending request to upstream, client: 92.70.20.xxx, server: dashboard.domain.com, request: "POST /login HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm/php5.sock:", host: "dashboard.domain.com", referrer: "https://dashboard.domain.com/login"
2017/12/29 08:01:36 [info] 7701#7701: *1193 epoll_wait() reported that client prematurely closed connection, so upstream connection is closed too while sending request to upstream, client: 92.70.20.xxx, server: dashboard.domain.com, request: "POST /login HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm/php5.sock:", host: "dashboard.domain.com", referrer: "https://dashboard.domain.com/login"

So haproxy says the client aborted while waiting for the server to start 
responding.

https://snaps.hcdn.dk/AONBkWojDFwuErMjxF66Mg4VxJl9jz4KUM61jPpgcL.png - however 
the request is ~ 68-69 ms, and that browser is Google Chrome, so in that 
regard, the 400 Bad Request for POST/PUT also happens in Google Chrome.

Which is the case for 1.8.2 and latest master.

If I do the patch for 1.8.1 I still get BADREQ sometimes in haproxy with the 
“empty” GET requests.

On the nginx backend it looks like this:

https://gist.github.com/lucasRolff/5eb0ae277c97b7457fbe546a1118e34f

I do see a few connection resets, however these happen also when things work.

However, that also confirms that the “connection” header isn’t actually sent, 
despite Firefox saying it’s sent in their dev tools.

So, just so no misunderstandings happen: it seems there are two things going 
wrong:

===
1.8.1:
- Firefox sometimes has issues with GET requests failing under http2 
(Confirmed by Maximilian B)
- Haproxy log shows: Dec 29 08:17:50 localhost haproxy[4881]: 92.70.20.xx:56726 [29/Dec/2017:08:17:17.178] https_frontend~ https_frontend/<NOSRV> -1/-1/-1/-1/33388 400 0 - - CR-- 24/1/0/0/0 0/0 "<BADREQ>" (CR == The client aborted before sending a full HTTP request)

1.8.2 and git master:
- All browsers seem to suffer occasional issues with POST/PUT requests (and 
the GET request issue still persists). (POST confirmed by Lukas T)
- Haproxy log shows (for POST requests): Dec 29 08:09:27 localhost haproxy[4432]: 92.70.20.xx:54951 [29/Dec/2017:08:09:27.167] https_frontend~ cdn-backend/mycdn 0/0/0/-1/2 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" (CH == The client aborted while waiting for the server to start responding), while the backend server will see the request as aborted by the client as well.
- - Client in scope of haproxy == browser
- - Client in scope of backend == haproxy

Semi-conclusion:
- Connection headers are not sent by Firefox, Safari or WebKit, even though 
the webdev tools say so (in current versions at least)
- Not able to replicate GET request failures in browsers other than Firefox 
despite doing thousands of requests (tested Chrome, Safari, Opera, WebKit)
- Some bug seems to have been introduced between 1.8.1 and 1.8.2, causing 
POST/PUT requests to fail in some cases

What I’d like to know:
- A URL on haproxy.org that negotiates http2 correctly for Chrome and Firefox 
(not exactly sure why it doesn’t do so already?) ( 
https://snaps.hcdn.dk/JFsEzPmxspw9hnXuFyM5G4QGYst7Q6R2zXmkZbRRjz.png )

Best Regards,
Lucas Rolff

On 29/12/2017, 08.13, "Willy Tarreau" <w...@1wt.eu> wrote:

Hi Lucas,

On Fri, Dec 29, 2017 at 06:06:49AM +, Lucas Rolff wrote:
> As much as I agree about that specs should be followed, I realize

Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-28 Thread Lucas Rolff
Hi Willy,

> In fact it's a race between the GOAWAY frame caused by the invalid request, 
> and the HEADERS frame being sent in response to the stream being closed
> I agree that it's quite confusing, but we're talking about responses to 
> conditions that are explicitly forbidden in the spec, so I'd rather not spend 
> too much energy on this for now.

As much as I agree that specs should be followed, I’ve realized that even if 
there are people who want to follow the spec 100%, there will always be 
implementations used at large scale that don’t. The reasons for this can be 
multiple – one I can imagine is that browsers or servers start implementing a 
new protocol (h2 is a good example) before the spec is actually finalized. 
When it’s then finalized, the vendor might end up with an implementation that 
slightly violates the actual spec, but won’t fix it, either because the 
violations are minor or because they are in no way “breaking” compared to the 
other implementations out there.

In this case, if I understand you correctly, the errors are related to the 
fact that certain clients didn’t implement the spec correctly in the first 
place.

I was very curious why e.g. the Connection header (even if it isn’t actually 
sent by Firefox or Safari/WebKit, even though their webdev tools say it is) 
would work in nginx, and Apache for that matter, so I asked on their mailing 
list why they were violating the spec.

Valentin gave a rather interesting answer about why they decided to sometimes 
violate specific parts in their software: it all boiled down to client 
support, because they also realized that many browsers (which might be EOL and 
never get updated) have implementations that would otherwise not work with 
http2.

http://mailman.nginx.org/pipermail/nginx/2017-December/055356.html

I know that it’s different software, and that how others decide to design 
their software is completely up to them.
Violating specs on purpose is generally bad, no doubt about that – but if it’s 
a requirement for getting good coverage of the clients actually in use (both 
new and old browsers), then I understand why one would go to such lengths as 
“hacking” a bit to make sure commonly used browsers can use the protocol.

> So at least my analysis for now is that for a reason still to be determined, 
> this version of firefox didn't correctly interoperate with haproxy in a given 
> environment

Downgrading Firefox to earlier versions (such as 55, which is “pre”-quantum) 
reveals the same issue with bad requests.

Hopefully you’ll not have to violate the http2 spec in any way – but I do see 
a valid point in what Valentin explained: you cannot guarantee that all 
clients are 100% compliant with the spec, and there might be a bunch of (used) 
EOL devices around.

I used to work at a place where haproxy was used extensively, so seeing http2 
support getting better and better is a really awesome thing, because it would 
actually mean that http2 could be implemented in that specific environment. I 
do hope that in a few releases http2 in haproxy gets to a point where we could 
rate it as “production ready”, with no real visible bugs from a customer 
perspective. At that point I think it would be good to implement it in a 
large-scale environment (for a percentage of the requests) to see how much 
traffic might actually get dropped when the spec is followed – i.e. to see 
from a real-world workload how many clients actually violate the spec.

For now, I’ll personally leave http2 support disabled – since it’s breaking my 
applications for a big percentage of my users – and I’ll have to find an 
interim solution until at least the bug regarding Firefox losing connections 
(this thing) is fixed:

Dec 28 21:22:35 localhost haproxy[1534]: 80.61.160.xxx:64921 [28/Dec/2017:21:22:12.309] https_frontend~ https_frontend/<NOSRV> -1/-1/-1/-1/22978 400 0 - - CR-- 1/1/0/0/0 0/0 "<BADREQ>"
Dec 28 21:22:40 localhost haproxy[1534]: 80.61.160.xxx:64972 [28/Dec/2017:21:22:35.329] https_frontend~ cdn-backend/mycdn 0/0/1/0/5001 200 995 - -  1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1"

I never expect software to be bug-free – but at this point, this specific 
issue causes too much visible “trouble” for end-users for me to be able to 
keep it enabled.
I’ll figure out if I can replicate the same issue in more browsers (without 
the connection: keep-alive header); maybe that would give us more insight.

Best Regards,
Lucas Rolff

On 29/12/2017, 00.08, "Willy Tarreau" <w...@1wt.eu> wrote:

Hi Lukas,

On Thu, Dec 28, 2017 at 09:19:24PM +0100, Lukas Tribus wrote:
> On Thu, Dec 28, 2017 at 12:29 PM, Lukas Tribus <lu...@ltri.eu> wrote:
> > Hello,
> >
> >
> >> But in this example, you're using 

Re: h2 bad requests

2017-12-28 Thread Lucas Rolff
Hi Sander,

Which exact browser version do you use?

There’s an ongoing thread already 
(https://www.mail-archive.com/haproxy@formilux.org/msg28333.html ) regarding 
the same issue.

Best Regards,
Lucas Rolff
 



Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-28 Thread Lucas Rolff
> the output of the http2 golang test and can you please both clarify which OS 
> you reproduce this on?

If I visit the http2 golang test, I also don’t see it – yet I saw it in the 
developer tools (dev tools shouldn’t show headers that weren’t actually 
requested/received, but based on your findings, that seems to be what 
happens).

What I find odd is that Firefox (together with Safari and WebKit) all show the 
same behaviour; however, I’m unable to reproduce it in Chrome and Opera.
I can reproduce the error in nghttp and curl when using -H “Connection: 
keep-alive” – omitting the header makes the request work in nghttp and curl as 
well (as expected).

However, are we sure that the http2 golang test doesn’t just ignore the header 
(or even remove it)?

I found that Firefox actually has a way to enable HTTP Logging to get info 
about what’s going on – it can be enabled by going to about:networking#logging 

What I did first was take a sample where it doesn’t fail (in this case for a 
GET request) and figure out what it does specifically for my “app.css” file, 
then repeat the request (retrying 20-30 times) until it failed, and then 
compare what differs.

When it doesn’t work, it does a “BeginConnect” and some “ResolveProxy” and 
“OnProxyAvailable” function calls – it seems to be establishing a connection. 
If that’s the case, it would be odd, since the TCP connection should already 
be established from the initial page request.

From the logs:

Dec 28 21:22:35 localhost haproxy[1534]: 80.61.160.xxx:64921 [28/Dec/2017:21:22:12.309] https_frontend~ https_frontend/<NOSRV> -1/-1/-1/-1/22978 400 0 - - CR-- 1/1/0/0/0 0/0 "<BADREQ>"
Dec 28 21:22:40 localhost haproxy[1534]: 80.61.160.xxx:64972 [28/Dec/2017:21:22:35.329] https_frontend~ cdn-backend/mycdn 0/0/1/0/5001 200 995 - -  1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1"

This could explain the additional calls, which look like a new connection 
being set up, given the “CR” state:

C : the TCP session was unexpectedly aborted by the client.
R : the proxy was waiting for a complete, valid REQUEST from the client
(HTTP mode only). Nothing was sent to any server.

Why haproxy sees the TCP session as aborted by the client, I’m not sure.

The working example also does actual work in the socket thread, whereas the 
non-working one doesn’t really do much other than generating headers (last two 
lines).

The output is here: 
https://gist.github.com/lucasRolff/c7f25c93281715c3911d36e9488b111a

> and can you please both clarify which OS you reproduce this on?

I’m personally sitting on OS X, and I’ve been able to reproduce it on Firefox 
on Ubuntu 16.04 as well.

On 28/12/2017, 21.19, "lu...@ltri.eu on behalf of Lukas Tribus"  
wrote:

Hello,



On Thu, Dec 28, 2017 at 12:29 PM, Lukas Tribus  wrote:
> Hello,
>
>
>> But in this example, you're using HTTP/1.1, The "Connection" header is
>> perfectly valid for 1.1. It's HTTP/2 which forbids it. There is no
>> inconsistency here.
>
> For me a request like this:
> $ curl -kv --http2 https://localhost/111 -H "Connection: keep-alive"
> -d "bla=bla"
>
> Fired multiple times from the shell, leads to a "400 Bad Request"
> response in about 20 ~ 30 % of the cases and is forwarded to the
> backend in other cases.
> I'm unable to reproduce a "400 Bad Request" when using GET request in
> my quick tests.
>
>
>
> Here 2 exact same requests with different haproxy behavior:

My previous mail proves that haproxy's behavior is inconsistent.


However I am unable to reproduce the issue with Firefox: none of the
quantum releases (57.0, 57.0.1, 57.0.2, 57.0.3) emit a connection
header in my testing:

- https://http2.golang.org/reqinfo never shows a connection header
(not even with POST)
- sniffing with wireshark (using SSLKEYLOGFILE; see the sketch after this
list) also shows that Firefox never emits a connection header in H2
- the developer tools *always* show a connection header in the
request, although there really isn't one - clearly there is a
discrepancy between what is transmitted on the wire and what is shown
in dev tools
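A minimal sketch of that decryption setup, assuming a recent Firefox and 
Wireshark (the preference lives under the TLS - formerly SSL - protocol 
settings):

    $ export SSLKEYLOGFILE=$HOME/tls-keys.log
    $ firefox &
    # then point Wireshark's TLS "(Pre)-Master-Secret log filename"
    # preference at the same file and capture as usual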

What am I missing? Can you guys provide a decrypted trace showing this
behavior, the output of the http2 golang test and can you please both
clarify which OS you reproduce this on?



Thanks,
Lukas




Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-28 Thread Lucas Rolff
>Did I get it right, according to the spec, the "Connection"-header is 
> forbidden ("MUST NOT"), still, firefox does send it? This leads to the 
> described issue.

I think it indeed might be the root cause, also for the failing GET requests 
(which only seem to happen sometimes?), but it’s really visible with PUT and 
POST requests.

I’ve opened https://bugzilla.mozilla.org/show_bug.cgi?id=1427256 - so if by 
any chance you can go comment with an “I face the same issue”, then Firefox 
might pick it up faster.

>Firefox sends "Connection: keep-alive" while Chrome does not.

Correct - however, it seems like Safari also sends it, so in fact I have to 
open a bug report with Safari as well.

Best Regards,
Lucas Rolff

On 28/12/2017, 12.08, "Maximilian Böhm" <maximilian.bo...@auctores.de> wrote:

Sorry for my long absence. Thank you, Lucas, for perfectly describing and 
digging into the issue. I'll be here if there is any further assistance 
required.

Did I get it right: according to the spec, the "Connection" header is 
forbidden ("MUST NOT"), yet Firefox sends it anyway? This leads to the 
described issue.

Just checked it on https://http2.golang.org/.
Firefox sends "Connection: keep-alive" while Chrome does not.

>> I'd rather not fall into such idiocies, you see.
Thanks! Whereby I'd rather prefer such idiocies to installing plugins without 
asking users (well, that's another topic, I guess.. 
https://www.theverge.com/2017/12/16/16784628/mozilla-mr-robot-arg-plugin-firefox-looking-glass
 )


-----Original Message-----
From: Lucas Rolff [mailto:lu...@lucasrolff.com]
Sent: Thursday, December 28, 2017 11:27
To: Willy Tarreau <w...@1wt.eu>
Cc: haproxy@formilux.org
Subject: Re: HTTP/2 Termination vs. Firefox Quantum

> It's normal then, as it's mandated by the HTTP/2 spec to reject 
> requests containing any connection-specific header fields

In that case, haproxy should be consistent in its way of handling clients 
sending connection-specific headers:

$ curl 'https://dashboard.domain.com/js/app.js?v=1' -H 'User-Agent: 
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) Gecko/20100101 
Firefox/57.0' --compressed -H 'Connection: keep-alive' -o /dev/null -vvv
  % Total% Received % Xferd  Average Speed   TimeTime Time  
Current
 Dload  Upload   Total   SpentLeft  
Speed
  0 00 00 0  0  0 --:--:-- --:--:-- --:--:--
 0*   Trying 178.63.183.40...
* TCP_NODELAY set
* Connected to dashboard.domain.com (178.63.183.xxx) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: 
ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [93 bytes data]
  0 00 00 0  0  0 --:--:-- --:--:-- --:--:--
 0* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [3000 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.domain.com
*  start date: Jan  3 11:17:55 2017 GMT
*  expire date: Jan  4 11:17:55 2018 GMT
*  subjectAltName: host "dashboard.domain.com" matched cert's "*.domain.com"
*  issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2
*  SSL certificate verify ok.
> GET /js/app.js?v=1 HTTP/1.1
> Host: dashboard.domain.com
> Accept: */*
> Accept-Encoding: deflate, gzip
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) 
> Gecko/20100101 Firefox/57.0
> Connection: keep-alive
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.5
< Date: Thu, 28 Dec 2017 10:11:34 GMT
< Content-Type: application/javascript; charset=utf-8
< Last-Modified: Sun, 25 Jun 2017 17:17:05 GMT
< Transfer-Encoding: chunked
< Vary: Accept-Encoding
< ETag: W/"594ff011-7b7"

Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-28 Thread Lucas Rolff
Sorry regarding my previous curl – I didn’t use --http2 in my curl request, 
but the result is the same (with the negotiated http2 protocol); I’ve removed 
the TLSv1.2 output since it’s useless in this case:

===

$ curl 'https://dashboard.domain.com/js/app.js?v=1' -H 'User-Agent: Mozilla/5.0 
(Macintosh; Intel Mac OS X 10.13; rv:57.0) Gecko/20100101 Firefox/57.0' 
--compressed -H 'Connection: keep-alive' -vo /dev/null --http2
  % Total% Received % Xferd  Average Speed   TimeTime Time  Current
 Dload  Upload   Total   SpentLeft  Speed
  0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
*   Trying 178.63.183.xx...
* TCP_NODELAY set
* Connected to dashboard.domain.com (178.63.183.xx) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.domain.com
*  start date: Jan  3 11:17:55 2017 GMT
*  expire date: Jan  4 11:17:55 2018 GMT
*  subjectAltName: host "dashboard.domain.com" matched cert's "*.domain.com"
*  issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7f860b005800)
> GET /js/app.js?v=1 HTTP/2
> Host: dashboard.domain.com
> Accept: */*
> Accept-Encoding: deflate, gzip
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) 
> Gecko/20100101 Firefox/57.0
> Connection: keep-alive
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 200
< server: nginx/1.13.5
< date: Thu, 28 Dec 2017 10:43:31 GMT
< content-type: application/javascript; charset=utf-8
< last-modified: Sun, 25 Jun 2017 17:17:05 GMT
< vary: Accept-Encoding
< etag: W/"594ff011-7b7"
< content-encoding: gzip
<
{ [683 bytes data]
100   6830   6830 0   3749  0 --:--:-- --:--:-- --:--:--  3752
* Connection #0 to host dashboard.domain.com left intact

===

So as you can see, I’m sending a “Connection: keep-alive” header from the 
client (like Firefox does), the protocol is http2, and the response is the 
JavaScript file I requested.

And when it’s requested in Firefox: 
https://snaps.hcdn.dk/3uaT06s2RJmAqMu5TqSJYAxBHjSzHOJGiHjfK0qcrV.png 

Best Regards,
Lucas Rolff


On 28/12/2017, 11.39, "Willy Tarreau" <w...@1wt.eu> wrote:

On Thu, Dec 28, 2017 at 10:27:28AM +, Lucas Rolff wrote:
> In that case, haproxy should be consistent in its way of handling clients
> sending connection-specific headers:
> 
> $ curl 'https://dashboard.domain.com/js/app.js?v=1' -H 'User-Agent: 
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) Gecko/20100101 
Firefox/57.0' --compressed -H 'Connection: keep-alive' -o /dev/null -vvv
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
Current
>  Dload  Upload   Total   SpentLeft  
Speed
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:--  
   0*   Trying 178.63.183.40...
> * TCP_NODELAY set
> * Connected to dashboard.domain.com (178.63.183.xxx) port 443 (#0)
> * ALPN, offering h2
> * ALPN, offering http/1.1
> * Cipher selection: 
ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
> * successfully set certificate verify locations:
> *   CAfile: /etc/ssl/cert.pem
>   CApath: none
> * TLSv1.2 (OUT), TLS handshake, Client hello (1):
> } [512 bytes data]
> * TLSv1.2 (IN), TLS handshake, Server hello (2):
> { [93 bytes data]
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:--  
   0* TLSv1.2 (IN), TLS handshake, Certificate (11):
> { [3000 bytes data]
> * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
> { [333 bytes data]
> * TLSv1.2 (IN), TLS handshake, Server finished (14):
> { [4 bytes data]
> * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
> } [70 bytes data]
> * TLSv1.2 (OUT), TLS change cipher, Client hello (1):
> } [1 bytes data]
> * TLSv1.2 (OUT), TLS handshake, Finished (20):
> } [16 bytes data]
> * TLSv1.2 (IN), TLS change cipher, Client hello (1):
> { [1 bytes data]
> * TLSv1.2 (IN), TLS handshake, Finished (20):
> { [16 bytes data]
> * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
> 

Re: HAProxy 1.8 takes too long to update new config

2017-12-28 Thread Lucas Rolff
Robin, there's also an ongoing thread with Firefox which has the same issues, 
especially with post/put requests in 1.8.2, you might wanna keep an eye on that 
one as well


From: Robin Anil 
Sent: Thursday, December 28, 2017 9:36:56 AM
To: haproxy@formilux.org
Subject: Re: HAProxy 1.8 takes too long to update new config

I can isolate that to http2, not threads, at v1.8.2.

Separately, I am seeing a lot of intermittent failures with POST/PUT requests, 
I see the following returned from haproxy.
400 Bad request
Your browser sent an invalid request.


I can verify these requests never hit the backend.






On Thu, Dec 28, 2017 at 1:47 AM Robin Anil 
> wrote:
In Http/2 mode with threads enabled, updating the config on a live serving 
haproxy server takes several minutes.

If I turn this off, the update of config is near instantaneous.

The config change itself is just adding a newline in the file. So it feels like 
haproxy is waiting for connections to close down or something...

My question is, is this a known behavior? Definitely not desired, not unless 
the stats page has some indicator that update is happening and connections are 
being migrated over or something.

Robin


Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-28 Thread Lucas Rolff
I think I might have found a cause for this to happen, or at least a way to 
fully replicate the issue.

I saw that the issue also happens in Firefox whenever doing POST requests 
(let’s say, logging into a website) – I got a bunch of “400 Bad Request” 
errors whenever actually trying to do the POST request.

I copied the curl command-line from Firefox to get the exact headers and 
cookies that are set during the request towards the site.

I then tried to run the curl command directly, and had no issues whatsoever.

However, if I try the same with nghttp – I get a 
“COMPRESSION_ERROR(0x09)”:

[  0.170] recv GOAWAY frame 

Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-27 Thread Lucas Rolff
My small site is basically two pages (home and about, for example); each page 
contains basic markup, some styling and some JavaScript. Switching pages tends 
to replicate the issue every now and then (how often it happens differs a bit, 
but it’s possibly every 20-30 requests or so).

I’m a single user able to replicate the issue; there is no other traffic than 
myself.

So the test scenario is fairly easy to replicate in that sense.

Tomorrow I’ll check if I can replicate the same issue in other browsers as 
well.

I haven’t been able to replicate it with curl yet and haven’t tried with 
nghttp; I’ll continue to troubleshoot meanwhile, but it’s a bit odd that it 
happens.

Best regards,


From: lu...@ltri.eu <lu...@ltri.eu> on behalf of Lukas Tribus <lu...@ltri.eu>
Sent: Wednesday, December 27, 2017 10:51:01 PM
To: Lucas Rolff
Cc: haproxy@formilux.org
Subject: Re: HTTP/2 Termination vs. Firefox Quantum

Hello Lucas,



On Wed, Dec 27, 2017 at 9:24 PM, Lucas Rolff <lu...@lucasrolff.com> wrote:
> Can't even compose an email correctly..
>
> So:
>
> I experience the same issue however with nginx as a backend.
>
> I tried enabling “option httplog” within my frontend, it's rather easy for
> me to replicate, it affects a few percent of the traffic.

So you have this html endpoint and you hit F5 in FF Quantum until you
can see the issue, or how is it that you actually reproduce it? Does this
occur in an idle test environment as well, or do you need production
traffic to hit this issue?



> I have a site, with a total of 3 requests being performed:
>
> -  The HTML itself
> - 1 app.css file
> - 1 app.js file

Please clarify:

- if any of those responses are cached (or if they are uncachable) and
if they use any kind of revalidation (If-modified-since --> 304)
- if any of those files are compressed, by haproxy or nginx, and which
compression is used
- the exact uncompressed content-length of each of those responses
- the exact client OS
- is Quantum a 32 or 64 bit executable on the client?
- is haproxy a 32 or 64 bit executable?
- can you run this repro in debug mode and show the output when the issue occurs?
- RTT and bandwidth between server and client (race conditions may
depend on specific network performance - not every issue is
reproducible on localhost)
- confirm that FF sandboxing is not affecting this issue by lowering
security.sandbox.content.level to 2 or 0 in about:config (then restart
FF) - don't forget to turn it back on




Thanks,
Lukas


Re: HTTP/2 Termination vs. Firefox Quantum

2017-12-27 Thread Lucas Rolff
I tried enabling “option httplog” within my frontend, I do have the same issue 
wit



Re: haproxy 1.8.2 ALPN h2 broken?

2017-12-27 Thread Lucas Rolff
- you said that using multiple certs breaks, but did you get a working state in 
any way ?

Actually regarding the multiple certs breaking – I was wrong.

So, if I use release 1.8.1 (downloaded from haproxy.org and compiled from 
source), then my bind works perfectly.

If I use release 1.8.2 with same compile options, and I use the same bind, or a 
bind even with a single certificate (  bind *:443 ssl crt 
/etc/haproxy/certs/wildcard_domain.com.pem alpn h2,http/1.1 ) I still end up 
with the same error from curl:

curl: (16) Error in the HTTP2 framing layer

So as long as I pass alpn h2,http/1.1 on my bind line, it breaks.

>  if you run haproxy with -d (debug mode), do you see something like this:

Yes, I see the ALPN=h2:

:https_frontend.accept(0006)=0010 from [80.61.160.xxx:52922] ALPN=h2
:https_frontend.clireq[0010:]: GET / HTTP/1.1
:https_frontend.clihdr[0010:]: user-agent: curl/7.54.1
:https_frontend.clihdr[0010:]: accept: */*
:https_frontend.clihdr[0010:]: host: dashboard.domain.com
:cdn-backend.srvcls[0010:adfd]
:cdn-backend.clicls[0010:adfd]
:cdn-backend.closed[0010:adfd]

> are you sure you didn't limit your buffer size to less than 16kB?

Config between my compiled 1.8.1 and 1.8.2 didn’t change at all, and I’m also 
not touching buffers within the haproxy config; the only defaults I really set 
are the connect, client and server timeouts – the rest pretty much stays the same:

https://gist.github.com/lucasRolff/12b2036baa47400d6c3437a67d9f5fd1 - I try to 
avoid touching things; the instance does next to no traffic, so defaults *should* 
be fine.
So unless buffer defaults changed between 1.8.1 and 1.8.2, no changes have been made.
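
For what it’s worth, the buffer size in question is the global tune.bufsize;
a sketch of stating the default explicitly (HTTP/2 needs at least 16kB):

    global
        tune.bufsize 16384   # compiled-in default; h2 needs at least 16384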

The specific request will have a content-length of 55 kilobytes.

> how did you manage to get curl to emit this amount of useful debugging 
> information? I never got that even after reading all options, I'm jealous!

Use the -vvv option in curl – or even better, on HTTP/2-enabled sites, you can 
use nghttp -v http://url/ – it will give you extensive information about your 
HTTP/2 traffic, since it is aware of your streams, priorities, etc.
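
For example (hostname hypothetical):

    curl -vvv -o /dev/null https://dashboard.example.com/
    nghttp -nv https://dashboard.example.com/   # frame-level view of SETTINGS, HEADERS, etc.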

Best Regards,
Lucas Rolff

On 27/12/2017, 19.25, "Willy Tarreau" <w...@1wt.eu> wrote:

Hi Lucas,

On Wed, Dec 27, 2017 at 04:49:31PM +, Lucas Rolff wrote:
> Hi guys,
> 
> I was running haproxy 1.8.1 and testing out http2, for this I require the 
alpn h2,http/1.1 in my bind - however when using multiple certificates together 
with alpn in 1.8.2 - this seems to break.
> 
> My bind looks like this:
> 
> bind *:443 ssl crt /etc/haproxy/certs/default.pem crt /etc/haproxy/certs 
alpn h2,http/1.1
> 
> So I supply a default certificate (wildcard for a specific domain), 
second I supply a folder that haproxy scans and picks up all certificates 
within that directory - this configuration works perfectly in 1.8.1
> In 1.8.2, I'll get a certificate error whenever I have alpn h2,http/1.1 
added, curl gives following error:
> 
> * Cipher selection: 
ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
> * successfully set certificate verify locations:
> *   CAfile: /usr/local/etc/openssl/cert.pem
>   CApath: /usr/local/etc/openssl/certs
> * TLSv1.2 (OUT), TLS header, Certificate Status (22):
> * TLSv1.2 (OUT), TLS handshake, Client hello (1):
> * TLSv1.2 (IN), TLS handshake, Server hello (2):
> * TLSv1.2 (IN), TLS handshake, Certificate (11):
> * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
> * TLSv1.2 (IN), TLS handshake, Server finished (14):
> * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
> * TLSv1.2 (OUT), TLS change cipher, Client hello (1):
> * TLSv1.2 (OUT), TLS handshake, Finished (20):
> * TLSv1.2 (IN), TLS change cipher, Client hello (1):
> * TLSv1.2 (IN), TLS handshake, Finished (20):
> * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
> * ALPN, server accepted to use h2
> * Server certificate:
> *  subject: OU=Domain Control Validated; CN=*.domain.com
> *  start date: Jan  3 11:17:55 2017 GMT
> *  expire date: Jan  4 11:17:55 2018 GMT
> *  subjectAltName: host "dashboard.domain.com" matched cert's 
"*.domain.com"
> *  issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2
> *  SSL certificate verify ok.
> * Using HTTP2, server supports multi-use
> * Connection state changed (HTTP/2 confirmed)
> * Copying HTTP/2 data in stream buffer to connection buffer after 
upgrade: len=0
> * Using Stream ID: 1 (easy handle 0x7fe99d815400)
> > GET / HTTP/2
> > Host: dashboard.domain.com
> > User-Agent: curl/7.54.1
> > A

Re: haproxy 1.8.2 ALPN h2 broken?

2017-12-27 Thread Lucas Rolff
1.8.2 is the latest version, or do you mean latest version as in compiling 
master from git?

Current build config (my 1.8.1 build config is exactly the same, except version 
number is different):

HA-Proxy version 1.8.2-08396fa 2017/12/23
Copyright 2000-2017 Willy Tarreau <wi...@haproxy.org>

Build options :
  TARGET  = linux2628
  CPU = generic
  CC  = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv 
-Wno-unused-label
  OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_SYSTEMD=1 USE_PCRE=1 USE_PCRE_JIT=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with OpenSSL version : OpenSSL 1.0.2k-fips  26 Jan 2017
Running on OpenSSL version : OpenSSL 1.0.2k-fips  26 Jan 2017
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT 
IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.32 2012-11-30
PCRE library supports JIT : yes
Built with zlib version : 1.2.7
Running on zlib version : 1.2.7
Compression algorithms supported : identity("identity"), deflate("deflate"), 
raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.

Available polling systems :
  epoll : pref=300,  test result OK
   poll : pref=200,  test result OK
 select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available filters :
  [SPOE] spoe
  [COMP] compression
  [TRACE] trace

Best Regards,
Lucas Rolff

From: Olivier Doucet <webmas...@ajeux.com>
Date: Wednesday, 27 December 2017 at 19.14
To: Lucas Rolff <lu...@lucasrolff.com>
Cc: "haproxy@formilux.org" <haproxy@formilux.org>
Subject: Re: haproxy 1.8.2 ALPN h2 broken?

Hi Lucas,

There have been so many bugs fixed in HAProxy 1.8.2 that you should really 
check this latest version first and see if you still have the issue.

Olivier




2017-12-27 17:49 GMT+01:00 Lucas Rolff 
<lu...@lucasrolff.com>:
Hi guys,

I was running haproxy 1.8.1 and testing out HTTP/2; for this I require alpn 
h2,http/1.1 in my bind. However, when using multiple certificates together with 
alpn in 1.8.2, this seems to break.

My bind looks like this:

bind *:443 ssl crt /etc/haproxy/certs/default.pem crt /etc/haproxy/certs alpn 
h2,http/1.1

So I supply a default certificate (a wildcard for a specific domain), and 
second I supply a folder that haproxy scans, picking up all certificates within 
that directory – this configuration works perfectly in 1.8.1.
In 1.8.2, I’ll get a certificate error whenever alpn h2,http/1.1 is added; curl 
gives the following error:

* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /usr/local/etc/openssl/cert.pem
  CApath: /usr/local/etc/openssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.domain.com
*  start date: Jan  3 11:17:55 2017 GMT
*  expire date: Jan  4 11:17:55 2018 GMT
*  subjectAltName: host "dashboard.domain.com" matched cert's "*.domain.com"
*  issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fe99d815400)
> GET / HTTP/2
> Host: dashboard.domain.com
> User-Agent: curl/7.54.1
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, Client hello (1):
curl: (16) Error in the HTTP2 framing layer

Removing alpn (and http2 support) “fixes” the issue.

Best Regards,
Lucas Rolff



Re: haproxy SSL termination performance

2017-12-27 Thread Lucas Rolff
Hi Willy,

I ended up adding an actual backend to perform the test (reusing the nginx 
instance I already had), so the connection between haproxy and nginx would be a 
matter of localhost traffic – and I was indeed able to reach about 18k req/s on 
a single core with keep-alive.

I can see that my reply didn’t reach the mailing list, however – I had missed 
that the list doesn’t set an automatic “reply-to: haproxy@formilux.org”, so I 
have to add it manually. In any case, it’s all solved and I’m happy with the 
results: on the specific server I was testing from, I reached about 65k req/s 
on relatively cheap hardware, so even if I want to scale to 100k+ req/s it 
should be no problem from what I can see (I know there will be slightly more 
overhead with many clients, since the networking then involves more than a 
single client).

So thanks a lot!

Best Regards,
Lucas Rolff

On 27/12/2017, 19.01, "Willy Tarreau" <w...@1wt.eu> wrote:

On Tue, Dec 26, 2017 at 10:28:57AM +0100, Jérôme Magnin wrote:
> 748 looks like what a single core on a VM can achieve in terms of private 
key
> computation with rsa 2048 certs. You can confirm this by running the 
following
> command in your vm:
> 
> openssl speed rsa2048
> 
> 21000 is too high to be key computation only. 

Indeed, clearly one is doing RSA only while the other one does resume.

> > My haproxy config looks like this: 
https://gist.github.com/lucasRolff/36fc84ac44aad559c1d43ab6f30237c8
> 
> This configuration has no backend, so each request will be replied to 
with a 503
> response containing a connection: close header, which means each request 
will
> lead to a key computation. 

Good catch – indeed the error (even if it's rewritten as a fake 200) will
result in the connection being aborted, and I guess the SSL context is
then not kept by ab in this case. Lucas, a better solution is to use a
redirect, such as:

 redirect location /foo

This will not cost much and will perform a complete HTTP rules evaluation
as well. Some of the numbers we've observed here on a single core / single
thread of a Core i7-4790:

 1350 TLSv1.2 key computations/s (RSA2048)
14000 TLSv1.2 connection resumes/s
18000 req/s over TLSv1.2 (keep-alive)

By using the redirect above instead of the errorfile, you should be able
to test all these.

Willy
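
For reference, a minimal frontend along the lines Willy suggests might look
like this (sketch; certificate path hypothetical):

    frontend https_test
        bind *:443 ssl crt /etc/haproxy/certs/default.pem
        mode http
        redirect location /foo   # cheap response, full HTTP rules evaluation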




haproxy 1.8.2 ALPN h2 broken?

2017-12-27 Thread Lucas Rolff
Hi guys,

I was running haproxy 1.8.1 and testing out HTTP/2; for this I require alpn 
h2,http/1.1 in my bind. However, when using multiple certificates together with 
alpn in 1.8.2, this seems to break.

My bind looks like this:

bind *:443 ssl crt /etc/haproxy/certs/default.pem crt /etc/haproxy/certs alpn 
h2,http/1.1

So I supply a default certificate (a wildcard for a specific domain), and 
second I supply a folder that haproxy scans, picking up all certificates within 
that directory – this configuration works perfectly in 1.8.1.
In 1.8.2, I’ll get a certificate error whenever alpn h2,http/1.1 is added; curl 
gives the following error:

* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /usr/local/etc/openssl/cert.pem
  CApath: /usr/local/etc/openssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.domain.com
*  start date: Jan  3 11:17:55 2017 GMT
*  expire date: Jan  4 11:17:55 2018 GMT
*  subjectAltName: host "dashboard.domain.com" matched cert's "*.domain.com"
*  issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fe99d815400)
> GET / HTTP/2
> Host: dashboard.domain.com
> User-Agent: curl/7.54.1
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, Client hello (1):
curl: (16) Error in the HTTP2 framing layer

Removing alpn (and http2 support) “fixes” the issue.
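
One quick way to verify what the server actually negotiates, independent of
curl (hostname hypothetical):

    echo | openssl s_client -connect dashboard.example.com:443 -alpn h2 2>/dev/null | grep -i alpn

A healthy endpoint should report "ALPN protocol: h2" here.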

Best Regards,
Lucas Rolff


haproxy SSL termination performance

2017-12-26 Thread Lucas Rolff
Hi guys,

I’m currently performing a few tests on haproxy and nginx to determine the best 
software to terminate SSL early in a hosting stack, and also perform some load 
balancing to varnish machines.

I’d love to use haproxy for this setup, since haproxy does load balancing 
really well, including the way it determines whether backends are down.

However, one issue I’m facing is the SSL termination performance: on a single 
core I manage to terminate 21000 connections per second with nginx, but only 
748 connections per second with haproxy.

They’re using the exact same cipher suite (TLSv1.2, ECDHE-RSA-AES128-GCM-SHA256) 
to minimize the SSL overhead. I decided to go for AES128 since security itself 
isn’t super important here – the goal is just that things are somewhat 
encrypted (mainly serving static files or non-sensitive content).
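
On the haproxy side, pinning that single suite on the bind line looks like
this (certificate path hypothetical):

    bind *:443 ssl crt /etc/haproxy/certs/default.pem ciphers ECDHE-RSA-AES128-GCM-SHA256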

I’m testing with a single apache benchmark client (actually from the hypervisor 
where my VM is running), so the network latency is minimal; this rules out 
networking as the cause and gives the highest possible numbers.

I generated a flame graph for both haproxy and nginx using the `perf` tool.

Haproxy flame graph can be found here: 
https://snaps.trollcdn.com/sadiZsJd96twAez0GUiWJdDiEbwsRPWUxJ3sRskLG4.svg

Nginx flame graph can be found here: 
https://snaps.trollcdn.com/P7PVyDkjhsxbsXCmK6bzVeqWsHHwnOxRucnCYG084f.svg

What I find odd is that in haproxy you’ll see libcrypto.so.1.0.2k with 81k 
samples, but the function right below it (unknown) only got 8.3k samples, 
whereas in nginx the gap is *a lot* smaller – I still haven’t figured out what 
actually happens in haproxy that causes this gap.

However, my main concern is the fact that, when terminating SSL, nginx performs 
28 times better.

I’ve tried running haproxy with both 10 threads and 10 processes on a 12-core 
machine – pinning each thread or process to a specific core, and putting RX and 
TX queues on individual cores as well, to ensure the load is evenly distributed 
(see the sketch below).
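
A sketch of that pinning in haproxy 1.8 syntax (abbreviated to 4 processes):

    global
        nbproc 4
        cpu-map 1 0   # bind process 1 to CPU core 0
        cpu-map 2 1
        cpu-map 3 2
        cpu-map 4 3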

Doing the same with nginx still shows 5.5k requests per second on haproxy, but 
125k requests per second on nginx (a 22x difference).
I got the absolute best performance in haproxy by using processes rather than 
threads – with processes it’s not maxing out the CPU, but with threads it is; 
I’m not sure why this happens either.

Now, since nginx can serve static files directly, I wanted to replicate the 
same in haproxy, so I wouldn’t need a backend that would add another connection 
behind haproxy, since that would surely degrade the overall requests per second.

I did this by using “errorfile 200 /etc/haproxy/errorfiles/200.http” to serve 
a file directly from the frontend (a sketch of such a file follows).
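
An errorfile contains a complete raw HTTP response (status line, headers,
blank line, body); a minimal 200.http might look like:

    HTTP/1.0 200 OK
    Content-Type: text/plain

    OK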

My haproxy config looks like this: 
https://gist.github.com/lucasRolff/36fc84ac44aad559c1d43ab6f30237c8

Does anyone have any suggestions, or maybe insight into why haproxy seems to 
terminate SSL connections at a much lower rate per second than, for example, 
nginx? Is there some functionality missing in haproxy that causes nginx to win 
in terms of performance/scalability for terminations?

There are many things I absolutely love about haproxy, but if there’s a 22-28x 
difference in how many SSL terminations it can handle per second, then we’re 
talking about a lot of extra hardware to handle, say, 500k requests per second.

The VM has AES-NI available as well.
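
Two quick sanity checks on the crypto side – the raw RSA signing rate (an
upper bound on full handshakes per core) and whether the OpenSSL build really
uses AES-NI:

    openssl speed rsa2048
    openssl speed -evp aes-128-gcm   # far faster with AES-NI than without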

Thanks in advance!

Best Regards,
Lucas Rolff