[ANNOUNCE] haproxy-2.2-dev12

2020-07-03 Thread Willy Tarreau
Hi,

HAProxy 2.2-dev12 was released on 2020/07/04. It added 72 new commits
after version 2.2-dev11.

Yes, a 12th development release. But the good news is that it's only here
to help with testing, because we've finally managed to address the
performance regression spotted by William Dauchy! And it was quite a tough
one, so it was a good decision to invest so much effort in it before the
release.

To make a long story short, when you have many very fast servers, almost
all of a server's idle connections could be used then released by a thread
at once, and taken over by the next thread and so on, never leaving spare
ones for other threads. And this takeover would go through the global run
queue and cause contention there when using a moderate number of threads.
So that wouldn't affect low-performance users, but it definitely was a
performance killer for high-performance ones dealing with tens to hundreds
of thousands of requests per second.

The great thing is that fixing all these issues required implementing a few
improvements that had been anticipated for later, and this made the internal
infrastructure a bit better and further widened the performance gain over
2.1.

Since 2.2-dev11, the most user-visible changes are:
  - fixed the performance regression above

  - addition of the new "pool-low-conn" server setting to improve distribution
of idle connections on very fast servers (sub-millisecond response time).
We've found that setting it to twice the number of threads seems to provide
very good performance (see the sketch after this list).

  - added a few new fields in the stats page to report the number of idle
and used connections per server

  - new "show servers conn" command on the CLI to visualize the state of used
and idle connections of a server, including per-thread

  - small change in the log-format processing: historically, multiple spaces
were merged into a single separator. This was OK for real logs, but is a bit
annoying when building headers, and very annoying for error pages. So this
was changed so that only logs merge spaces. This should probably be
addressed in a more generic way later, but it was the most reasonable
approach for this release.

  - the RFC5424 log format was missing the sub-second and timezone fields,
the former being highly recommended and the latter being mandatory. So
this was addressed right before having a new LTS version. I'm not much
tempted to backport this to stable releases because it could result in
visible changes that are not welcome in the middle of a stable branch;
that's why I asked to have it right now.

  - a few sample fetches and patterns were missing the trailing NUL character
and wouldn't always match (I don't remember which ones, sorry). This will
likely be backported as it was a bug.

  - threads are now disabled by default on OpenBSD, which lacks thread-local
storage and fails to build with them. Clang seems to emulate it, so clang
users can enable USE_THREAD=1 if they want.

  - "show sess" would endlessly dump new streams when they arrive too fast.
It was a real pain so now it will only dump past the last stream known
at the moment the command is typed. This means that it may show less
streams than the total, but will not result in multi-gigabyte dumps
anymore.

  - for developers, building with DEBUG_MEM_STATS provides a new expert
command "debug dev memstats" which shows the total counts (calls and
sizes) of memory allocations per line of code. This is very cheap and
can be enabled on production servers if suspecting a memory leak
somewhere (and it served to spot a regression in a recent fix).
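
As an illustration of the items above, here is a minimal sketch of the new
setting and commands (server names, addresses and the socket path are
hypothetical; "pool-low-conn 8" assumes 4 threads, following the "twice the
number of threads" suggestion; the exact invocations may differ slightly):

  backend fast_servers
      # let idle connections be taken over across threads more aggressively
      # once fewer than 8 remain available on a server
      server srv1 192.0.2.10:8080 pool-low-conn 8
      server srv2 192.0.2.11:8080 pool-low-conn 8

  # inspect used/idle connections per server, including per-thread:
  $ echo "show servers conn" | socat stdio /var/run/haproxy.sock

  # developers: build with memory statistics, then dump the counters:
  $ make TARGET=linux-glibc DEBUG=-DDEBUG_MEM_STATS
  $ echo "expert-mode on; debug dev memstats" | socat stdio /var/run/haproxy.sock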

In addition to this, William is finishing the addition of a sample fetch
to extract the equivalent of the TLS pre-master key for TLS 1.3, which is
needed in order to decrypt TLS traffic with Wireshark. It would be useful
to have this early so that those who upgrade can place it in their logs
if that can help them.

Christopher addressed a few other low-importance bugs in the private
connections management. Since those fixes were made available just before
this release and we've all been very tired from staring at these bugs over
the last weeks, I preferred to let them cool down and look at them again
after a short rest; we've accidentally broken enough stuff while working
on the fixes above, and I didn't want to take the risk of creating new
breakage. It looks like 3 of these patches could be merged before the
release (they already affect previous versions) and the other ones could
be merged post-2.2 and then backported once considered safe enough.

Tim also had some post-2.2 fixes pending to improve free() calls and
remove some valgrind complaints on exit.

With all the energy spent on the bugs above I couldn't work at all on the
doc review I wanted to do. I'll try to do it shortly, but it will not be
as refined as I had expected.

Anyway, I now consider 2.2 ready.

Re: [PATCH] skip slow reg-tests on cirrus-ci

2020-07-03 Thread Willy Tarreau
On Sat, Jul 04, 2020 at 12:44:20AM +0500, Илья Шипицин wrote:
> did we forget about it ?

Oops you're right, sorry. Now applied.

Willy




Re: [PATCH] skip slow reg-tests on cirrus-ci

2020-07-03 Thread Илья Шипицин
did we forget about it ?

Sat, 27 Jun 2020 at 11:10, Илья Шипицин:

> Hello,
>
> slow tests fail from time to time like
> https://cirrus-ci.com/task/6319998954110976
>
> let us exclude them
>
>
> Cheers,
> Ilya Shipitcin
>


Re: dev 2.2 High CPU Constantly

2020-07-03 Thread Илья Шипицин
I had luck with the Google CPU profiler:

https://github.com/gperftools/gperftools

It can summarize CPU time per function.
Can you try it?

Fri, 3 Jul 2020 at 23:20, Willy Tarreau:

> Hi Igor,
>
> On Fri, Jul 03, 2020 at 12:52:35PM +0800, Igor Pav wrote:
> > Hi William, Tried but still the same ;(
>
> [...]


Re: dev 2.2 High CPU Constantly

2020-07-03 Thread Willy Tarreau
Hi Igor,

On Fri, Jul 03, 2020 at 12:52:35PM +0800, Igor Pav wrote:
> Hi William, Tried but still the same ;(

That's bad. Do you know if your servers actually support 0rtt, and if
this 0rtt currently works between haproxy and the servers? Because by
having the retry on 0rtt, there are two things which can have an impact
on your CPU usage:
  - the alloc+memcpy() of the request buffer before sending it, in order
to be able to send it again if needed; depending on your bandwidth
this may have an impact;

  - if 0rtt constantly fails, haproxy will retry without it, so you
could actually be facing double the work on the request
processing.

For the last one you should have a look at your stats page to see if the
retries column increases. You may also want to try without "allow-0rtt"
on the server lines and see if that fixes it. If so, we might be getting
closer (which doesn't mean I have anything in mind about it yet).
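
For reference, a minimal sketch of the kind of setup in question (names and
addresses are hypothetical, and I'm assuming the retry is enabled via
"retry-on 0rtt-rejected"); dropping "allow-0rtt" from the server line is
the test suggested above:

  backend be_servers
      # retry without early data when the server rejects it
      retry-on 0rtt-rejected
      server srv1 203.0.113.5:443 ssl verify none allow-0rtt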

Did 2.1 work fine for you on the same setup ?

Thanks,
Willy



Re: Haproxy decreases throughput

2020-07-03 Thread Willy Tarreau
Hello,

On Thu, Jul 02, 2020 at 08:51:24PM +0700, Hai Dang Nguyen wrote:
> Dear Haproxy team,
> Currently, I am conducting a small experiment with haproxy. I upload a 20G
> file through haproxy and see that the measured bandwidth is only about
> 70%-75% of a direct upload to the backend. All of my devices, including
> the client, haproxy and the backend, are on the same LAN, so I don't think
> it is a transmission problem.

We've fixed a large number of issues over the last few days. I'll try to
issue dev12 this evening if I don't fall asleep before then, but please give
it a try again to check if it's better. In addition, you'll need to give
more info (bandwidth, #connections, TCP or HTTP, etc). Then we'll possibly
ask for more info, some logs or stats output, to determine what's happening.

> Is there any limitation of haproxy or do I miss something in the config
> file?

It depends on what the issue really is.

> Below is my config and haproxy version in the attached image.

Please, really please, do not send images of text output! They're a real
pain to deal with: you can't search for specific words in them, you can't
quote parts of them or anything. Even just copy-pasting a git tag from one
is not possible. Just copy-paste the text as it appears on your screen and
more people will be likely to read your message and respond!

Regards,
Willy



Re: [PATCH] MEDIUM: Support TCP keepalive parameters customization

2020-07-03 Thread Willy Tarreau
Hi Takeshi,

On Fri, Jul 03, 2020 at 11:21:59AM +, mizuta.take...@fujitsu.com wrote:
> Dear maintainers,
> 
> Thank you for discussing issue#670 on github.
> https://github.com/haproxy/haproxy/issues/670
> 
> I have attached a patch that resolves the issue.
> (I have changed the config keyword from the commit on github.)
> Would you please comment on the patch?

Thank you for this. We've been very busy these last weeks, chasing a
bunch of bugs that have postponed the 2.2 release, which is why I
couldn't spend more time discussing with you on this.

I'd initially have preferred different names, but your point about the
values used in /proc is at least partially valid. I'm saying "partially"
because if others made a mistake when naming their variables, we're not
forced to copy them :-)

But I mean, that's probably OK and I won't argue over this. I'd be
interested in others' opinions and/or suggestions on this, but it's
not critical.

> Documentation and test code will be added in the near future.

Thanks.

> This is the first time I have posted to this community, so feel free to say
> anything.

You're welcome, and well done for your first post; it's not every day that
the first one is that good!

>  - Documentation should be provided at the same time.

Yes please, in the same commit, so that the documentation isn't lost in
any backport that may happen!

>  - Patch should be split.

No I don't think anything needs to be split further, it's quite self-contained.

Please just add "tcp:" as a subsystem tag. This helps when grepping for
various stuff in the history.

I think you can tag it MINOR as the impact is extremely low, and I don't
think I'd have many objections against a backport to recent branches after
some time cooking in -dev, if someone really needs it.
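
I.e. a subject line along these lines (exact wording hypothetical):

  MINOR: tcp: support TCP keepalive parameters customization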

Thank you!
Willy



Re: HTTP/2 in 2.1.x behaves different than in 2.0.x

2020-07-03 Thread Willy Tarreau
On Fri, Jul 03, 2020 at 02:25:33PM +0200, Jerome Magnin wrote:
> Hi Christian,
> 
> On Fri, Jul 03, 2020 at 11:02:48AM +0200, Christian Ruppert wrote:
> > Hi List,
> > 
> > we've just noticed and confirmed some strange change in behavior, depending
> > on whether the request is made with HTTP 1.x or 2.x.
> > [...] 
> > That also affects ACLs like url*/path* and probably others.
> > I don't think that is intended, isn't it?
> > That looks like a regression to me. If that is a bug/regression, than it
> > might be good if it's possible to catch that one via test case (regtest).
> >
> 
> This change is intentional and not a regression, it was introduced by
> this commit:
> http://git.haproxy.org/?p=haproxy.git;a=commit;h=30ee1efe676e8264af16bab833c621d60a72a4d7

Yep, it's the only way not to break end-to-end transmission, which is
even harder when H1 is used first and H2 behind.

Also please note that "path" is *not* broken, because it's already taken
from the right place. "url" will see changes compared with the previous
version, which would see a path in H2, and either a path or a full URI in
H1; if you're using "url", in H1 you can already get the two forms.

Now what haproxy does is preserve each URI component intact. If you change
the scheme, it only changes the scheme. If you call "set-path" it will only
change the path; if you use "replace-uri" it will replace the whole URI.
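
For instance, a minimal sketch of such rules (paths and patterns are
hypothetical):

  http-request set-path /static%[path]              # only the path changes
  http-request replace-uri ^http://(.*) https://\1  # the whole URI changes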

I'd say that HTTP/2, with its :authority header, was made very
browser-centric and went back to the origins of URIs. It's certain that for
all of us working more on the server side it looks unusual, but for those
on the client side it's more natural. Regardless, what it does was already
supported by HTTP/1 agents and was even used to communicate with proxies,
so it's not a fundamental breakage; it just emphasizes something that
people were not often thinking about.

Hoping this helps,
Willy



Re: [PATCH] BUG/MINOR: http_act: don't check capture id in backend (2)

2020-07-03 Thread Willy Tarreau
Hi Tim,

On Fri, Jul 03, 2020 at 02:06:34PM +0200, Tim Düsterhus, WoltLab GmbH wrote:
> Willy,
> 
> find the patch attached.

Looks good, now applied, thank you!
Willy



Re: HTTP/2 in 2.1.x behaves different than in 2.0.x

2020-07-03 Thread Jerome Magnin
Hi Christian,

On Fri, Jul 03, 2020 at 11:02:48AM +0200, Christian Ruppert wrote:
> Hi List,
> 
> we've just noticed and confirmed some strange change in behavior, depending
> on whether the request is made with HTTP 1.x or 2.x.
> [...] 
> That also affects ACLs like url*/path* and probably others.
> I don't think that is intended, is it?
> That looks like a regression to me. If that is a bug/regression, then it
> might be good if it's possible to catch that one via a test case (regtest).
>

This change is intentional and not a regression, it was introduced by
this commit:
http://git.haproxy.org/?p=haproxy.git;a=commit;h=30ee1efe676e8264af16bab833c621d60a72a4d7

-- 
Jérôme



[PATCH] BUG/MINOR: http_act: don't check capture id in backend (2)

2020-07-03 Thread Tim Düsterhus , WoltLab GmbH
Willy,

find the patch attached.

Best regards
Tim Düsterhus
Developer WoltLab GmbH

-- 

WoltLab GmbH
Nedlitzer Str. 27B
14469 Potsdam

Tel.: +49 331 96784338

duester...@woltlab.com
www.woltlab.com

Managing director:
Marcel Werk

AG Potsdam HRB 26795 P
From ea6bdbfa54b98d0b8a39e4e25ea5271de933867a Mon Sep 17 00:00:00 2001
From: Tim Duesterhus 
Date: Fri, 3 Jul 2020 13:43:42 +0200
Subject: [PATCH] BUG/MINOR: http_act: don't check capture id in backend (2)
To: haproxy@formilux.org
Cc: w...@1wt.eu

Please refer to commit 19a69b3740702ce5503a063e9dfbcea5b9187d27 for all the
details. This follow up commit fixes the `http-response capture` case, the
previous one only fixed the `http-request capture` one. The documentation was
already updated and the change to `check_http_res_capture` is identical to
the `check_http_req_capture` change.

This patch must be backported together with 19a69b3740702ce5503a063e9dfbcea5b9187d27.
Most likely this is 1.6+.
---
 src/http_act.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/http_act.c b/src/http_act.c
index 1c7a1d4e6..2eac12549 100644
--- a/src/http_act.c
+++ b/src/http_act.c
@@ -723,7 +723,10 @@ static int check_http_res_capture(struct act_rule *rule, struct proxy *px, char
 	if (rule->action_ptr != http_action_res_capture_by_id)
 		return 1;
 
-	if (rule->arg.capid.idx >= px->nb_rsp_cap) {
+	/* capture slots can only be declared in frontends, so we can't check their
+	 * existence in backends at configuration parsing step
+	 */
+	if (px->cap & PR_CAP_FE && rule->arg.capid.idx >= px->nb_rsp_cap) {
 		memprintf(err, "unable to find capture id '%d' referenced by http-response capture rule",
 			  rule->arg.capid.idx);
 		return 0;
-- 
2.27.0



[PATCH] MEDIUM: Support TCP keepalive parameters customization

2020-07-03 Thread mizuta.take...@fujitsu.com
Dear maintainers,

Thank you for discussing issue#670 on github.
https://github.com/haproxy/haproxy/issues/670

I have attached a patch that resolves the issue.
(I have changed the config keyword from the commit on github.)
Would you please comment on the patch?

Documentation and test code will be added in the near future.

This is the first time I have posted to this community, so feel free to say 
anything.
 - Documentation should be provided at the same time.
 - Patch should be split.
 - etc

Best regards,
MIZUTA Takeshi


0001-MEDIUM-Support-TCP-keepalive-parameters-customizatio.patch
Description:  0001-MEDIUM-Support-TCP-keepalive-parameters-customizatio.patch


HTTP/2 in 2.1.x behaves different than in 2.0.x

2020-07-03 Thread Christian Ruppert

Hi List,

we've just noticed and confirmed some strange change in behavior, 
depending on whether the request is made with HTTP 1.x or 2.x.

Steps to reproduce:
HAProxy 2.1.x
A simple http frontend, including h2 + logging

tail -f /var/log/haproxy.log|grep curl

curl -s https://example.com -o /dev/null --http1.1
curl -s https://example.com -o /dev/null --http2

Notice the difference:
test_https~ backend_test/testsrv1 1/0/0/2/3 200 4075 - -  1/1/0/0/0 
0/0 {example.com|curl/7.69.1|} "GET / HTTP/1.1"
test_https~ backend_test/testsrv1 0/0/0/3/3 200 4075 - -  1/1/0/0/0 
0/0 {example.com|curl/7.69.1|} "GET https://example.com/ HTTP/2.0"


Now the same with HAProxy 2.0.14:
test_https~ backend_test/testsrv1 1/0/0/2/3 200 4075 - -  1/1/0/0/0 
0/0 {example.com|curl/7.69.1|} "GET / HTTP/1.1"
test_https~ backend_test/testsrv1 0/0/0/3/3 200 4075 - -  1/1/0/0/0 
0/0 {example.com|curl/7.69.1|} "GET / HTTP/2.0"


That also affects ACLs like url*/path* and probably others.
I don't think that is intended, is it?
That looks like a regression to me. If that is a bug/regression, then it
might be good if it's possible to catch that one via a test case
(regtest).


--
Regards,
Christian Ruppert



Re: Rate Limit per IP with queueing (delay)

2020-07-03 Thread Stefano Tranquillini
Returning to the topic, I'm trying a "smarter" solution: implementing a
leaky bucket with a window, as nginx does. What I have to do is store, per
user, the request count for the current minute and for the previous minute.
I've done it in a Lua script with a matrix, but I'm quite sure it's not the
best solution.
I have a couple of questions that I can't get my head around.

- is it possible in a Lua script to access/modify the stick table? If yes,
how can I do it? (see the sketch after this list)
- can I pass the activity value by reference? What's the way? Right now the
only way to access information from HAProxy in Lua is to use http-request
set-var and then txn:get_var('txn..')
- is a global matrix in a Lua script (matrix = {}) shared with all the
other instances/processes of haproxy?
- how does Lua/haproxy cope with threads sleeping?
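
For the first question, here is a minimal read-only sketch of what I mean
(assumptions: the Lua StickTable class available since HAProxy 1.9, a
frontend named "proxy", and the key name in the returned entry):

function estimated_rate(prev, curr, elapsed)
    -- nginx-style sliding window: weight the previous minute's count by
    -- the share of the current minute that has not yet elapsed
    return prev * (1 - elapsed / 60) + curr
end

function lookup_rate(txn)
    -- read-only lookup in the stick table declared on frontend "proxy"
    local tbl = core.proxies["proxy"].stktable
    local entry = tbl:lookup(txn.f:src())
    -- "http_req_rate" as the key name is an assumption
    return entry and entry["http_req_rate"] or 0
end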

thanks, bye



On Thu, Jun 11, 2020 at 8:21 AM Igor Cicimov 
wrote:

> Glad you found a solution that works for you. I personally don't see any
> issues with this, since Lua is lightweight and haproxy is famous for
> efficient resource management. So all should be good under "normal" usage,
> and by normal I mean the traffic and usage pattern you expect from app
> users who don't maliciously overstep your given limits. I cannot say what
> will happen in case of a real DDoS attack and how much this buffering can
> hurt you :-/; you might want to wait for a reply from one of the more
> knowledgeable users or the devs.
>
> On Tue, Jun 9, 2020 at 10:38 PM Stefano Tranquillini 
> wrote:
>
>> I may have found a solution that's a bit more elegant (to me).
>>
>> The idea is to use a Lua script to do a weighted sleep depending on the
>> data.
>> The question is: "is this idea good or bad"? Especially, will the
>> "core.msleep" have implications on performance for everybody?
>> If someone uses up all the available connections, it will stall all the
>> users, right?
>>
>> That said, I should cap/limit the number of connections for each user at
>> the same time, but that's another story (I guess I can create an ACL with
>> an OR condition: 30 requests in 10 sec or 30 open connections).
>> Going back to the beginning.
>>
>> my lua file
>>
>> function delay_request(txn)
>> local number1 = tonumber(txn:get_var('txn.sc_http_req_rate'))
>> core.msleep(50 * number1)
>> end
>>
>> core.register_action("delay_request", { "http-req" }, delay_request, 0);
>>
>> my frontend
>>
>> frontend proxy
>> bind *:80
>>
>> stick-table type ip size 100k expire 10s store http_req_rate(10s)
>> http-request track-sc0 src
>> http-request set-var(txn.sc_http_req_rate) sc_http_req_rate(0)
>> http-request lua.delay_request if { sc_http_req_rate(0) gt 30 }
>> use_backend api
>>
>> Basically, if there are more than 30 requests per 10 seconds, I will make
>> them wait 50ms*count (so starting from about 1500ms, and growing for as
>> long as they keep insisting).
>> Does it make sense?
>> Do you see performance problems?
>>
>> On Tue, Jun 9, 2020 at 11:12 AM Igor Cicimov <
>> ig...@encompasscorporation.com> wrote:
>>
>>> On Tue, Jun 9, 2020 at 6:48 PM Stefano Tranquillini 
>>> wrote:
>>>
 Hello,
 I didn't really get what has been changed in this example, and why.

 On Tue, Jun 9, 2020 at 9:46 AM Igor Cicimov <
 ig...@encompasscorporation.com> wrote:

> Modify your frontend from the example like this and let us know what
> happens:
>
> frontend proxy
> bind *:80
> stick-table type ip size 100k expire 15s store http_req_rate(10s)
>

 stick table is now here


> http-request track-sc0 src table Abuse
>
 but this refers to the other one: do I have to keep this? Is it better
 to have it here or shared?

 use_backend api_delay if { sc_http_req_rate(0) gt 30 }
>

 this measures whether there were more than 30 requests in the last 10s;
 it uses the table in this proxy here, not the Abuse one


> use_backend api
>
> backend api
> server api01 api01:80
> server api02 api02:80
> server api03 api03:80
>
> backend api_delay
> tcp-request inspect-delay 500ms
> tcp-request content accept if WAIT_END
> server api01 api01:80
> server api02 api02:80
> server api03 api03:80
>
> Note that as per the sliding window rate limiting from the examples
> you said you read, this limits each source IP to 30 requests for the last
> time period of 10 seconds. That gives you 180 requests per 60 seconds.
>

 Yes, sorry, that's a typo; it should have been:
>>>
>>> frontend proxy
>>> bind *:80
>>> stick-table type ip size 100k expire 15s store http_req_rate(10s)
>>> http-request track-sc0 src
>>> use_backend api_delay if { sc_http_req_rate(0) gt 30 }
>>> use_backend api
>>>
 In this example, and in what I did before, it seems to be the same
 behaviour (or at least per my understanding),
 so that, if a user does more than 30 requests in 10 seconds then the
 re