[ANNOUNCE] haproxy-2.2-dev12
Hi,

HAProxy 2.2-dev12 was released on 2020/07/04. It added 72 new commits after version 2.2-dev11.

Yes, a 12th development release. But the good news is that it's just here to help with testing, because we've finally managed to address the performance regression spotted by William Dauchy! And it was quite a tough one, so it was a good decision to invest so much effort on this one before the release.

To make a long story short, when you have many very fast servers, almost all of a server's idle connections could be used then released by a thread at once, and taken over by the next thread and so on, never leaving spare ones for other threads. And this takeover would go through the global run queue and cause contention there when using a moderate number of threads. So that wouldn't affect low-performance users, but it definitely was a performance killer for high-performance ones dealing with tens to hundreds of thousands of requests per second. The great thing is that by fixing all these issues we've had to implement a few improvements that were anticipated for later, and this made the internal infrastructure a bit better and further improved the overall performance gap from 2.1.

Since 2.2-dev11, the most user-visible changes are:
  - fixed the performance regression above
  - addition of the new "pool-low-conn" server setting to improve distribution of
    idle connections on very fast servers (sub-millisecond response time). We've
    found that setting it to twice the number of threads seems to provide very
    good performance.
  - added a few new fields in the stats page to report the number of idle and
    used connections per server
  - new "show servers conn" command on the CLI to visualize the state of used
    and idle connections of a server, including per-thread
  - small change in the log-format processing: historically, multiple spaces
    were merged together as a single separator. This was OK for real logs, but
    is a bit annoying when building headers, and very annoying for error pages.
    So this was changed so that only logs merge spaces. This should probably be
    addressed in a more generic way later, but it was the most reasonable
    approach for this release.
  - the RFC5424 log format was missing the sub-second and timezone fields, the
    former being highly recommended and the latter being mandatory. So this was
    addressed right before having a new LTS version. I'm not much tempted by
    backporting this to stable releases because that could result in visible
    changes that are not welcome in the middle of a stable version, which is why
    I asked to have it right now.
  - a few sample fetches and patterns were missing the trailing NUL character
    and wouldn't always match (I don't remember which ones, sorry). This will
    likely be backported as it was a bug.
  - threads are now disabled by default on OpenBSD, which lacks thread-local
    storage and fails to build. Clang seems to emulate it, so users of clang can
    enable USE_THREAD=1 if they want.
  - "show sess" would endlessly dump new streams when they arrived too fast. It
    was a real pain, so now it will only dump up to the last stream known at the
    moment the command is typed. This means that it may show fewer streams than
    the total, but will not result in multi-gigabyte dumps anymore.
  - for developers, building with DEBUG_MEM_STATS provides a new expert command
    "debug dev memstats" which shows the total counts (calls and sizes) of
    memory allocations per line of code. This is very cheap and can be enabled
    on production servers if suspecting a memory leak somewhere (and it served
    to spot a regression in a recent fix).

In addition to this, William is finishing the addition of a sample fetch to extract the equivalent of the TLS pre-master key for TLS 1.3, which is needed in order to decrypt TLS traffic with Wireshark. It would be useful to have this early so that those who upgrade can place it in their logs if that can help them.

Christopher addressed a few other low-importance bugs in the private connections management. Since they were made available just before this release and we've all been very tired of looking at these bugs over the last weeks, I preferred that we let these cool down and look at them after a small rest; we've accidentally broken enough stuff while working on the fixes above, and I didn't want to take the risk of creating new breakage. It looks like 3 of these patches could be merged before the release (they already affect previous versions) and the other ones could be merged post-2.2 then backported once considered safe enough.

Tim also had some post-2.2 fixes pending to improve free() calls and remove some valgrind complaints on exit.

With all the energy spent on the bugs above I couldn't work at all on the doc review I wanted to do. I'll try to do this shortly, but it will not be as refined as I'd expected. Anyway, I now consider 2.2 ready. I
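For readers wanting to try the new server setting right away, here is a minimal config sketch. The "pool-low-conn" keyword and the "twice the number of threads" guidance come from the announce itself; the backend name, server addresses, and the value 8 (assuming 4 threads) are illustrative assumptions:

```
backend fast_servers
    # pool-low-conn: keep a floor of idle connections visible to other
    # threads so one thread cannot drain a very fast server's whole pool;
    # twice the thread count (here 2 x 4 = 8) is the suggested start value
    server srv1 192.0.2.10:8080 pool-low-conn 8
    server srv2 192.0.2.11:8080 pool-low-conn 8
```

The matching "show servers conn" CLI command mentioned above can then be used to check how idle connections are spread across threads.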
Re: [PATCH] skip slow reg-tests on cirrus-ci
On Sat, Jul 04, 2020 at 12:44:20AM +0500, Ilya Shipitcin wrote:
> did we forget about it ?

Oops you're right, sorry. Now applied.
Willy
Re: [PATCH] skip slow reg-tests on cirrus-ci
did we forget about it ?

On Sat, Jun 27, 2020 at 11:10, Ilya Shipitcin wrote:
> Hello,
>
> slow tests fail from time to time, like
> https://cirrus-ci.com/task/6319998954110976
>
> let us exclude them
>
> Cheers,
> Ilya Shipitcin
Re: dev 2.2 High CPU Constantly
I was lucky with the google cpu profiler https://github.com/gperftools/gperftools, it can summarize cpu time per function. Can you try it?

On Fri, Jul 3, 2020 at 23:20, Willy Tarreau wrote:
> Hi Igor,
>
> On Fri, Jul 03, 2020 at 12:52:35PM +0800, Igor Pav wrote:
> > Hi William, Tried but still the same ;(
>
> That's bad. Do you know if your servers actually support 0rtt, and if
> this 0rtt currently works between haproxy and the servers ? Because by
> having the retry on 0rtt, there are two things which can have an impact
> on your CPU usage:
> - the alloc+memcpy() of the request buffer before sending it, in order
>   to be able to send it again if needed ; depending on your bandwidth
>   this may have an impact ;
> - if 0rtt constantly fails, haproxy would retry without it, so you
>   could actually be facing double the work on the request processing.
>
> For the last one you should have a look at your stats page to see if the
> retries column increases. You may also want to try without "allow-0rtt"
> on the server lines and see if that fixes it. If so, we might be getting
> closer (which doesn't mean I have anything in mind about it yet).
>
> Did 2.1 work fine for you on the same setup ?
>
> Thanks,
> Willy
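For anyone wanting to follow that suggestion, a typical gperftools invocation looks like the sketch below. The library path, config file location, and output names are assumptions; the profile file is only written when the process exits:

```
# preload the CPU profiler into haproxy and collect a profile while
# reproducing the load
LD_PRELOAD=/usr/lib/libprofiler.so CPUPROFILE=/tmp/haproxy.prof \
    haproxy -d -f /etc/haproxy/haproxy.cfg

# after stopping haproxy, summarize CPU time per function
google-pprof --text "$(command -v haproxy)" /tmp/haproxy.prof | head -20
```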
Re: dev 2.2 High CPU Constantly
Hi Igor,

On Fri, Jul 03, 2020 at 12:52:35PM +0800, Igor Pav wrote:
> Hi William, Tried but still the same ;(

That's bad. Do you know if your servers actually support 0rtt, and if this 0rtt currently works between haproxy and the servers ? Because by having the retry on 0rtt, there are two things which can have an impact on your CPU usage:
- the alloc+memcpy() of the request buffer before sending it, in order to be able to send it again if needed ; depending on your bandwidth this may have an impact ;
- if 0rtt constantly fails, haproxy would retry without it, so you could actually be facing double the work on the request processing.

For the last one you should have a look at your stats page to see if the retries column increases. You may also want to try without "allow-0rtt" on the server lines and see if that fixes it. If so, we might be getting closer (which doesn't mean I have anything in mind about it yet).

Did 2.1 work fine for you on the same setup ?

Thanks,
Willy
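The A/B test being suggested above amounts to toggling one keyword on the server line. A minimal sketch, in which the backend name, address, and the retry-on line are illustrative assumptions (only "allow-0rtt" is the keyword under suspicion):

```
backend app
    # retrying on 0rtt-rejected is what forces the request buffer
    # alloc+memcpy and the possible double processing discussed above
    retry-on 0rtt-rejected conn-failure
    # suspected configuration: 0-RTT towards the server
    server app1 203.0.113.5:443 ssl verify none allow-0rtt
    # for comparison, the same line without early data:
    # server app1 203.0.113.5:443 ssl verify none
```

If CPU usage drops with the second form while the stats page shows the retries column climbing with the first, that points at constantly rejected early data.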
Re: Haproxy decreases throughput
Hello,

On Thu, Jul 02, 2020 at 08:51:24PM +0700, Hai Dang Nguyen wrote:
> Dear Haproxy team,
> Currently, I conduct a small experiment with haproxy. I upload a 20G file
> to haproxy and realize that the measured bandwidth is only about 70%-75%
> of the direct upload to the backend. All of my devices including client,
> haproxy and backend are in the same LAN. So I don't think it is the problem
> of transmission.

We've fixed a large number of issues over the last few days. I'll try to issue dev12 this evening if I don't fall asleep before, but please give it a try again to check if it's better. In addition you'll need to give more info (bandwidth, #connections, tcp or http, etc). Then we'll possibly ask for more info, some logs or stats output, to determine what's happening.

> Is there any limitation of haproxy or do I miss something in the config
> file?

It depends on what the issue really is.

> Below is my config and haproxy version in the attached image.

Please, really please, do not send images for text outputs! They're a real pain to deal with: you can't search for specific words in them, you can't quote some parts or whatever. Even just copy-pasting a git tag from one is not possible. Just copy-paste the text as it appears on your screen and more people will be likely to read your message and respond!

Regards,
Willy
Re: [PATCH] MEDIUM: Support TCP keepalive parameters customization
Hi Takeshi,

On Fri, Jul 03, 2020 at 11:21:59AM +, mizuta.take...@fujitsu.com wrote:
> Dear maintainers,
>
> Thank you for discussing issue#670 on github.
> https://github.com/haproxy/haproxy/issues/670
>
> I have attached a patch that resolves the issue.
> (I have changed the config keyword from the commit on github.)
> Would you please comment on the patch?

Thank you for this. We've been very busy these last weeks, chasing a bunch of bugs that have postponed the 2.2 release, which is why I couldn't spend more time discussing this with you. I'd initially have preferred different names, but actually your point about the values used in /proc is at least partially valid. I'm saying "partially" because if others made a mistake by naming their variables, we're not forced to copy them :-) But I mean, that's probably OK and I won't argue over this. I'd be interested in others' opinions and/or suggestions on this, but it's not critical.

> Documentation and test code will be added in the near future.

Thanks.

> This is the first time I have posted to this community, so feel free to say
> anything.

Then welcome, and well done for your first post; it's not every day that the first one is that good!

> - Documentation should be provided at the same time.

Yes please, in the same commit, so that any backport that may happen doesn't lose it!

> - Patch should be split.

No, I don't think anything needs to be split further, it's quite self-contained. Please just add "tcp:" as a subsystem tag; this helps when grepping for various stuff in the history. I think you can tag it MINOR as the impact is extremely low, and I don't think I would have many objections against a backport to recent branches after some time cooking in -dev if someone really needs it.

Thank you!
Willy
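For context on what such a patch ultimately configures: the per-socket TCP keepalive knobs on Linux are the per-connection counterparts of the /proc/sys/net/ipv4/tcp_keepalive_* defaults whose names were discussed above. A hedged C sketch follows; the function name and values are illustrative, not the patch's actual code:

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Override the system-wide keepalive defaults on one TCP socket.
 * idle_s mirrors tcp_keepalive_time, intvl_s mirrors tcp_keepalive_intvl,
 * and cnt mirrors tcp_keepalive_probes. Returns 0 on success, -1 on error. */
int set_keepalive(int fd, int idle_s, int intvl_s, int cnt)
{
    int on = 1;

    /* keepalive must be enabled before the per-option tuning matters */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
    return setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt));
}
```

In haproxy terms, such options would apply on the frontend (client-facing) or backend (server-facing) side of a connection, which is why the keyword naming question matters.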
Re: HTTP/2 in 2.1.x behaves different than in 2.0.x
On Fri, Jul 03, 2020 at 02:25:33PM +0200, Jerome Magnin wrote:
> Hi Christian,
>
> On Fri, Jul 03, 2020 at 11:02:48AM +0200, Christian Ruppert wrote:
> > Hi List,
> >
> > we've just noticed and confirmed some strange change in behavior, depending
> > on whether the request is made with HTTP 1.x or 2.x.
> > [...]
> > That also affects ACLs like url*/path* and probably others.
> > I don't think that is intended, isn't it?
> > That looks like a regression to me. If that is a bug/regression, than it
> > might be good if it's possible to catch that one via test case (regtest).
>
> This change is intentional and not a regression, it was introduced by
> this commit:
> http://git.haproxy.org/?p=haproxy.git;a=commit;h=30ee1efe676e8264af16bab833c621d60a72a4d7

Yep, it's the only way not to break end-to-end transmission, which is even harder when H1 is used first and H2 behind. Also please note that "path" is *not* broken, because it's already taken from the right place. "url" will see changes when compared with the previous version, which would see a path in H2, and either a path or a full uri in H1. Because if you're using "url", in H1 you can already have the two forms.

Now what haproxy does is preserve each URL component intact. If you change the scheme, it only changes the scheme. If you call "set-path" it will only change the path; if you use "replace-uri" it will replace the whole uri.

I'd say that HTTP/2 with the :authority header was made very browser-centric and went back to the origins of URIs. It's certain that for all of us working more on the server side it looks unusual, but for those on the client side it's more natural. Regardless, what it does was already supported by HTTP/1 agents and even used to communicate with proxies, so it's not a fundamental breakage; it just emphasizes something that people were not often thinking about.

Hoping this helps,
Willy
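A small config sketch of the component-level rewriting described in that answer; "http-request set-path" is the action named in the message, while the frontend name, port and prefix are illustrative assumptions:

```
frontend fe
    bind :8080
    # set-path touches only the path component of the request URI;
    # with HTTP/2 absolute-form requests, the scheme and authority
    # are preserved exactly as the client sent them
    http-request set-path /api%[path]
    default_backend app
```

This is why an H2 request may log "GET https://example.com/ HTTP/2.0" while the equivalent H1 request logs "GET / HTTP/1.1": each URL component is kept intact rather than collapsed to origin form.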
Re: [PATCH] BUG/MINOR: http_act: don't check capture id in backend (2)
Hi Tim,

On Fri, Jul 03, 2020 at 02:06:34PM +0200, Tim Düsterhus, WoltLab GmbH wrote:
> Willy,
>
> find the patch attached.

Looks good, now applied, thank you!
Willy
Re: HTTP/2 in 2.1.x behaves different than in 2.0.x
Hi Christian,

On Fri, Jul 03, 2020 at 11:02:48AM +0200, Christian Ruppert wrote:
> Hi List,
>
> we've just noticed and confirmed some strange change in behavior, depending
> on whether the request is made with HTTP 1.x or 2.x.
> [...]
> That also affects ACLs like url*/path* and probably others.
> I don't think that is intended, isn't it?
> That looks like a regression to me. If that is a bug/regression, than it
> might be good if it's possible to catch that one via test case (regtest).

This change is intentional and not a regression, it was introduced by this commit:
http://git.haproxy.org/?p=haproxy.git;a=commit;h=30ee1efe676e8264af16bab833c621d60a72a4d7

--
Jérôme
[PATCH] BUG/MINOR: http_act: don't check capture id in backend (2)
Willy,

find the patch attached.

Best regards
Tim Düsterhus
Developer WoltLab GmbH

--
WoltLab GmbH
Nedlitzer Str. 27B
14469 Potsdam
Tel.: +49 331 96784338
duester...@woltlab.com
www.woltlab.com
Managing director: Marcel Werk
AG Potsdam HRB 26795 P

From ea6bdbfa54b98d0b8a39e4e25ea5271de933867a Mon Sep 17 00:00:00 2001
From: Tim Duesterhus
Date: Fri, 3 Jul 2020 13:43:42 +0200
Subject: [PATCH] BUG/MINOR: http_act: don't check capture id in backend (2)
To: haproxy@formilux.org
Cc: w...@1wt.eu

Please refer to commit 19a69b3740702ce5503a063e9dfbcea5b9187d27 for all the
details. This follow-up commit fixes the `http-response capture` case, the
previous one only fixed the `http-request capture` one.

The documentation was already updated and the change to
`check_http_res_capture` is identical to the `check_http_req_capture` change.

This patch must be backported together with
19a69b3740702ce5503a063e9dfbcea5b9187d27. Most likely this is 1.6+.
---
 src/http_act.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/http_act.c b/src/http_act.c
index 1c7a1d4e6..2eac12549 100644
--- a/src/http_act.c
+++ b/src/http_act.c
@@ -723,7 +723,10 @@ static int check_http_res_capture(struct act_rule *rule, struct proxy *px, char
 	if (rule->action_ptr != http_action_res_capture_by_id)
 		return 1;
 
-	if (rule->arg.capid.idx >= px->nb_rsp_cap) {
+	/* capture slots can only be declared in frontends, so we can't check their
+	 * existence in backends at configuration parsing step
+	 */
+	if (px->cap & PR_CAP_FE && rule->arg.capid.idx >= px->nb_rsp_cap) {
 		memprintf(err, "unable to find capture id '%d' referenced by http-response capture rule",
 			  rule->arg.capid.idx);
 		return 0;
-- 
2.27.0
[PATCH] MEDIUM: Support TCP keepalive parameters customization
Dear maintainers,

Thank you for discussing issue#670 on github.
https://github.com/haproxy/haproxy/issues/670

I have attached a patch that resolves the issue. (I have changed the config keyword from the commit on github.) Would you please comment on the patch?

Documentation and test code will be added in the near future.

This is the first time I have posted to this community, so feel free to say anything.
- Documentation should be provided at the same time.
- Patch should be split.
- etc

Best regards,
MIZUTA Takeshi

Attachment: 0001-MEDIUM-Support-TCP-keepalive-parameters-customizatio.patch
HTTP/2 in 2.1.x behaves different than in 2.0.x
Hi List,

we've just noticed and confirmed some strange change in behavior, depending on whether the request is made with HTTP 1.x or 2.x.

Steps to reproduce:
HAProxy 2.1.x
A simple http frontend, including h2 + logging
tail -f /var/log/haproxy.log | grep curl
curl -s https://example.com -o /dev/null --http1.1
curl -s https://example.com -o /dev/null --http2

Notice the difference:
test_https~ backend_test/testsrv1 1/0/0/2/3 200 4075 - - 1/1/0/0/0 0/0 {example.com|curl/7.69.1|} "GET / HTTP/1.1"
test_https~ backend_test/testsrv1 0/0/0/3/3 200 4075 - - 1/1/0/0/0 0/0 {example.com|curl/7.69.1|} "GET https://example.com/ HTTP/2.0"

Now the same with HAProxy 2.0.14:
test_https~ backend_test/testsrv1 1/0/0/2/3 200 4075 - - 1/1/0/0/0 0/0 {example.com|curl/7.69.1|} "GET / HTTP/1.1"
test_https~ backend_test/testsrv1 0/0/0/3/3 200 4075 - - 1/1/0/0/0 0/0 {example.com|curl/7.69.1|} "GET / HTTP/2.0"

That also affects ACLs like url*/path* and probably others. I don't think that is intended, is it? That looks like a regression to me. If it is a bug/regression, then it might be good to catch it via a test case (regtest).

--
Regards,
Christian Ruppert
Re: Rate Limit per IP with queueing (delay)
Returning to the topic, I'm trying a "smarter" solution: implementing a leaky bucket with a window, as nginx does. What I have to do is to store, per user, the requests per minute in the current minute and the previous minute. I've done it in a lua script with a matrix, but I'm quite sure it's not the best solution.

I have a couple of questions that I can't make my head around:
- is it possible in a LUA script to access/modify the sticky table? if yes, how can I do it?
- can I pass the activity value by reference? what's the way? right now the only way to access information from HA in lua is to use http-request set-var and then txn:get_var('txn..')
- is a lua script that has a global matrix (matrix = {} {}) shared with all the other instances/processes of haproxy?
- how does lua/haproxy cope with threads sleeping?

thanks

On Thu, Jun 11, 2020 at 8:21 AM Igor Cicimov wrote:
> Glad you found a solution that works for you. I personally don't see any
> issues with this since lua is lightweight and haproxy is famous for
> efficient resource management. So all should be good under "normal" usage,
> and by normal I mean a traffic and usage pattern you expect from your app
> users that non-maliciously overstep your given limits. I cannot say what
> will happen in case of a real DDOS attack and how much this buffering can
> hurt you :-/, you might want to wait for a reply from one of the more
> knowledgeable users or the devs.
>
> On Tue, Jun 9, 2020 at 10:38 PM Stefano Tranquillini wrote:
>> I may have found a solution that's a bit more elegant (to me).
>>
>> The idea is to use a lua script to do some weighted sleep depending on
>> data. The question is: "is this idea good or bad"? Especially, will the
>> "core.msleep" have implications on performance for everybody? If someone
>> uses all the connections available it will stall all the users, right?
>>
>> That said, I should cap/limit the number of connections for each user at
>> the same time, but that's another story. (I guess I can create an acl
>> with an OR condition: if it's 30 requests in 10 sec or 30 open
>> connections.) Going back to the beginning.
>>
>> my lua file:
>>
>> function delay_request(txn)
>>     local number1 = tonumber(txn:get_var('txn.sc_http_req_rate'))
>>     core.msleep(50 * number1)
>> end
>>
>> core.register_action("delay_request", { "http-req" }, delay_request, 0);
>>
>> my frontend:
>>
>> frontend proxy
>>     bind *:80
>>     stick-table type ip size 100k expire 10s store http_req_rate(10s)
>>     http-request track-sc0 src
>>     http-request set-var(txn.sc_http_req_rate) sc_http_req_rate(0)
>>     http-request lua.delay_request if { sc_http_req_rate(0) gt 30 }
>>     use_backend api
>>
>> Basically, if there are more than 30 requests per 10 seconds, I will make
>> them wait 50*count (so starting from 1500ms up to whatever they keep
>> insisting). Does it make sense? Do you see performance problems?
>>
>> On Tue, Jun 9, 2020 at 11:12 AM Igor Cicimov wrote:
>>> On Tue, Jun 9, 2020 at 6:48 PM Stefano Tranquillini wrote:
>>>> Hello, I didn't really get what has been changed in this example, and
>>>> why.
>>>>
>>>> On Tue, Jun 9, 2020 at 9:46 AM Igor Cicimov wrote:
>>>>> Modify your frontend from the example like this and let us know what
>>>>> happens:
>>>>>
>>>>> frontend proxy
>>>>>     bind *:80
>>>>>     stick-table type ip size 100k expire 15s store http_req_rate(10s)
>>>>
>>>> sticky table is now here
>>>>
>>>>> http-request track-sc0 src table Abuse
>>>>
>>>> but this refers to the other one, do I have to keep this? is it better
>>>> to have it here or shared?
>>>>
>>>>> use_backend api_delay if { sc_http_req_rate(0) gt 30 }
>>>>
>>>> this is measuring that in the last 10s there are more than 30 requests;
>>>> it uses the table in this proxy here, not the abuse one
>>>>
>>>>> use_backend api
>>>>>
>>>>> backend api
>>>>>     server api01 api01:80
>>>>>     server api02 api02:80
>>>>>     server api03 api03:80
>>>>>
>>>>> backend api_delay
>>>>>     tcp-request inspect-delay 500ms
>>>>>     tcp-request content accept if WAIT_END
>>>>>     server api01 api01:80
>>>>>     server api02 api02:80
>>>>>     server api03 api03:80
>>>>>
>>>>> Note that as per the sliding window rate limiting from the examples
>>>>> you said you read, this limits each source IP to 30 requests for the
>>>>> last time period of 30 seconds. That gives you 180 requests per 60
>>>>> seconds.
>>>
>>> Yes sorry, that's a typo, it should have been:
>>>
>>> frontend proxy
>>>     bind *:80
>>>     stick-table type ip size 100k expire 15s store http_req_rate(10s)
>>>     http-request track-sc0 src
>>>     use_backend api_delay if { sc_http_req_rate(0) gt 30 }
>>>     use_backend api
>>
>> In this example, and what I did before, it seems the same behaviour (or
>> at least per my understanding), so that, if a user does more than 30
>> requests in 10 seconds then the re
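The nginx-style two-bucket scheme described at the start of this thread (count requests in the current and previous minute, then weight the previous bucket by the unexpired fraction of the window) can be sketched in a few lines. This is a standalone C illustration of the algorithm, not haproxy or lua code:

```c
#include <time.h>

/* per-client state: which minute we are counting, plus the request
 * counts for that minute and the minute before it */
struct win {
    long minute;
    int cur, prev;
};

/* record one request at time t (seconds) and return the sliding-window
 * estimate of requests per minute */
static double hit(struct win *w, time_t t)
{
    long m = t / 60;

    if (m != w->minute) {
        /* roll the buckets; if more than one minute passed, both expire */
        w->prev = (m == w->minute + 1) ? w->cur : 0;
        w->cur = 0;
        w->minute = m;
    }
    w->cur++;
    /* weight the previous minute by how much of it still overlaps the
     * one-minute window ending now */
    return w->cur + (1.0 - (double)(t % 60) / 60.0) * w->prev;
}
```

A request arriving halfway through a minute counts the previous minute's total at half weight, which smooths out the hard reset that a plain per-minute counter would have at each minute boundary.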