Hi Luke.

On 23.01.2019 at 10:24, Luke Seelenbinder wrote:
> Hi Willy,
> 
> Thanks for continuing to look into this. 
> 
>> I've placed an nginx instance after my local haproxy dev config, and
>> found something which might explain what you're observing : the process
>> apparently leaks FDs and fails once in a while, causing 500 to be returned :
> 
> That's fascinating. I would have thought nginx would have had a bit better 
> care given to things like that. . .

This can be fixed by increasing the ulimits ;-).

> Oddly enough, I cannot find any log entries that approximate this. However, 
> it's possible since we're primarily (99+%) using nginx as a reverse-proxy 
> that the fd issues wouldn't appear for us.

What's the ulimit for your nginx process?
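
In case it helps, here is a minimal Python sketch for checking that on Linux. It assumes /proc is available and that you know the PID of the nginx process; the function names are made up for illustration. It reads the "Max open files" row from /proc/<pid>/limits and counts the entries in /proc/<pid>/fd, so a leak should show up as a count creeping toward the soft limit.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits.

    Each value is an int, or None when the kernel reports 'unlimited'.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Remaining columns: soft limit, hard limit, units
            soft, hard = line[len(prefix):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def count_open_fds(pid):
    """Count the file descriptors currently open in the process (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_usage(pid):
    """Return (open_fds, soft_limit, hard_limit) for the given PID."""
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_max_open_files(f.read())
    return count_open_fds(pid), soft, hard
```

Calling fd_usage(pid) every few seconds for an nginx worker PID (as root or as the nginx user) should be enough to see whether the count keeps growing.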

> My next thought is to try tcpdump to try to determine what's on the wire when 
> the CD-- and SD-- pairs appear, but since our stack is SSL e2e, that might 
> prove difficult. Any suggestions?

If you have enough log space, you can try enabling debug logging in nginx and 
haproxy.

https://nginx.org/en/docs/debugging_log.html
https://cbonte.github.io/haproxy-dconv/1.9/configuration.html#log => debug

This will have some impact on performance, as every request creates a lot 
of log lines!

It would be interesting to know which error appears in the nginx log when the 
CD/SD events happen, since 'http2 flood detected' is not in the logs.
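
To line up the two logs, one could pull the timestamps of the CD-- and SD-- entries out of the haproxy log and then look at the nginx error log around those times. A rough Python sketch follows. The regex assumes the default HTTP log layout, as in the haproxy line quoted further down in this thread, and the function name is only for illustration.

```python
import re
from datetime import datetime

# Assumed layout (default haproxy HTTP log format):
# client [accept_date] frontend backend/server timers status bytes
# req_cookie resp_cookie termination_state ...
LINE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]"        # accept date
    r"\s+\S+\s+\S+"              # frontend, backend/server
    r"\s+[-+\d/]+"               # timers, e.g. 0/0/0/-1/5
    r"\s+(?P<status>-?\d+)"      # HTTP status (-1 if none was received)
    r"\s+\S+"                    # bytes read
    r"\s+\S+\s+\S+"              # captured request/response cookies
    r"\s+(?P<state>\S{4})\s"     # termination state, e.g. CD-- or SD--
)


def abnormal_terminations(lines, states=("CD--", "SD--")):
    """Yield (timestamp, state, status) for log lines whose state is in `states`."""
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("state") in states:
            ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S.%f")
            yield ts, m.group("state"), int(m.group("status"))
```

Feeding it an open log file, for example abnormal_terminations(open(path)), gives a list of timestamps to grep for in the nginx error log. If your haproxy uses a custom log-format, the regex needs adjusting.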

Which release of nginx do you use?
http://hg.nginx.org/nginx/tags

Maybe the log contains some of the error messages that are defined in the 
source files in this directory.
http://hg.nginx.org/nginx/file/release-1.15.8/src/http/v2/

> One more interesting piece of data: if we use htx without h2 on the backends, 
> we only see CD-- entries consistently (with a very, very few SD-- entries). 
> Thus, it would seem whatever is causing the issue is directly related to h2 
> backends. I further think we can safely say it is directly related to h2 
> streams breaking (due to client-side request cancellations) resulting in the 
> whole connection breaking in HAProxy or nginx (though determining which will 
> be the trick).
> 
> There's also a strong possibility we replace nginx with HAProxy entirely for 
> our SSL + H2 setup as we overhaul the backends, so this problem will probably 
> be resolved by removing the problematic interaction.

What was the main reason for using nginx between haproxy and the backends?
What are the backends?

Regards
Aleks

> I'm still working on running h2load against our nginx servers to see if that 
> turns anything up.
> 
>> And at this point the connection is closed and reopened for new requests.
>> There's never any GOAWAY sent.
> 
> If I'm understanding this correctly, that implies as long as nginx sends 
> GOAWAY properly, HAProxy will not attempt to reuse the connection?
> 
>> I managed to work around the problem by limiting the number of total
>> requests per connection. I find this extremely dirty but if it helps...
>> I just need to figure how to best do it, so that we can use it as well
>> for H2 as for H1.
> 
> We're pretty satisfied with our h2 fe <-> be h1.1 setup right now, so we will 
> probably stick with that for now, since we don't want to have any more 
> operational issues from bleeding-edge bugs. (Not a comment on HAProxy, per 
> se, just a business reality. :-) ) I'm more than happy to try out anything 
> you turn up on our staging setup!
> 
> Best,
> Luke
> 
> 
> —
> Luke Seelenbinder
> Stadia Maps | Founder
> stadiamaps.com
> 
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Wednesday, January 23, 2019 8:28 AM, Willy Tarreau <w...@1wt.eu> wrote:
> 
>> Hi Luke,
>>
>> I've placed an nginx instance after my local haproxy dev config, and
>> found something which might explain what you're observing : the process
>> apparently leaks FDs and fails once in a while, causing 500 to be returned :
>>
>> 2019/01/23 08:22:13 [crit] 25508#0: *36705 open() "/usr/local/nginx/html/index.html" failed (24: Too many open files), client: 1>
>> 2019/01/23 08:22:13 [crit] 25508#0: accept4() failed (24: Too many open files)
>>
>> 127.0.0.1 - - [23/Jan/2019:08:22:13 +0100] "GET / HTTP/2.0" 500 579 "-" "Mozilla/4.0 (compatible; MSIE 7.01; Windows)"
>>
>> These are seen by haproxy:
>>
>> 127.0.0.1:47098 [23/Jan/2019:08:22:13.589] decrypt trace/ngx 0/0/0/0/0 500 701 - - ---- 1/1/0/0/0 0/0 "GET / HTTP/1.1"
>>
>> And at this point the connection is closed and reopened for new requests.
>> There's never any GOAWAY sent.
>>
>> I managed to work around the problem by limiting the number of total
>> requests per connection. I find this extremely dirty but if it helps...
>> I just need to figure how to best do it, so that we can use it as well
>> for H2 as for H1.
>>
>> Best regards,
>> Willy
> 

