Alright, thanks Willy and Lukas!

But I wonder why the build-up of connections is between our haproxy-no-tls
and haproxy-tls and not the "real" backends. The hop between
haproxy-no-tls and haproxy-tls is "mode tcp" with "http-reuse never", and
since it's mode tcp, http-reuse shouldn't really apply anyway, should it?
Both haproxy-no-tls and haproxy-tls use our backends, so wouldn't I see
this problem between haproxy-no-tls and our backends as well?
I believe I could use "pool-max-conn 0" between haproxy-no-tls and
haproxy-tls, since it's "mode tcp" and I don't want to reuse connections
there anyway, as reuse breaks "option forwardfor". Does that sound logical?
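For reference, something like this is what I have in mind for that hop (a sketch only; the backend name, server name, and address are made up, not our real config):

```
# Hypothetical haproxy-no-tls backend pointing at haproxy-tls.
backend bk_haproxy_tls
    mode tcp
    # pool-max-conn 0 tells haproxy to keep no idle connections to this
    # server for reuse, so every request gets a fresh connection.
    server tls1 192.0.2.10:443 check pool-max-conn 0
```

With mode tcp there is no HTTP-level reuse anyway, so "pool-max-conn 0" on the server line should be the relevant knob here.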

I've never actually compiled haproxy before; I just install Vincent Bernat's
PPA packages.
I guess I could take his packages and try to build my own deb with the
latest snapshot.

So, a bit off-topic: a pipeline that does nightly builds would make it
easier to test changes and bugfixes.
If anyone on this list has the time and competence, I could provide the
hardware and probably the bandwidth (depending on the number of downloads).
(I should probably lift this part of the conversation to a separate thread)


On Thu, Sep 12, 2019 at 8:04 AM Willy Tarreau <> wrote:

> Hi guys,
> On Wed, Sep 11, 2019 at 09:28:23PM +0200, Lukas Tribus wrote:
> > Hello Elias,
> >
> > On Wed, Sep 11, 2019 at 11:52 AM Elias Abacioglu
> > <> wrote:
> > > So we do zero config changes, upgrade haproxy to 2.0.x + restart
> haproxy and like a minute or so then it runs out of resources.
> > > Each haproxy (v2.0.5, no-TLS) have an request rate of 55-90K/s.
> > > Each haproxy (v1.7.11, TLS) have an request rate of 15-20/s.
> > > Each haproxy (v2.0.5, no-TLS) have a connection rate of 7-12K/s.
> > > Each haproxy (v1.7.11, TLS) has a connection rate of 6-7K/s.
> > >
> > > I have no clue why a zero config change upgrade would break this
> setup. Anyone that can help me go forward with troubleshooting this or
> explain what might cause it to mass up established connections?
> >
> > It's a bug. There are lots of fixes in the 2.0 tree after 2.0.5. You
> > could either wait for 2.0.6, or try the current stable git tree (2.0
> > git [1] or the snapshot from last night [2]).
> >
> > Beginning to troubleshoot 2.0.5 at this point does not make a lot of
> > sense imho, considering the amount and importance of the fixes
> > committed after 2.0.5.
> Given the connection rate, I suspect it could be related to the excess
> of idle conns kept on the backend side. I faced the exact same problem
> recently during a test with ulimit -n 100k. The CPU usage was 100% on
> a single core at 700 conn/s! Strace showed nothing abnormal, until I
> figured that connect() would take 1.4 millisecond, confirmed by perf()
> showing that the system was trying hard to find a spare source port. I
> addressed it with the commit below that was backported to latest 2.0:
>   4cae3bf631 ("BUG/MEDIUM: connection: don't keep more idle connections
> than ever needed")
> Elias, one work-around for the issue is to add "pool-max-conn 0" on your
> server lines, or "http-reuse never", or "http-reuse always", depending
> on whether the next stage supports keep-alive well.
> And to second what Lukas said, please give the latest snapshot a try, it
> would save you from having to adjust your configuration.
> Cheers,
> Willy
