Re: compression offload ... in default section

2021-10-20 Thread Christopher Faulet

Le 10/20/21 à 01:24, Björn Jacke a écrit :

Hi,

On 19.10.21 11:06, Christopher Faulet wrote:

Sorry Björn, I missed your reply. It is strange, there is no known bug
in this area for now. There is probably something in the request or
response headers preventing the compression to be enabled.


I found the error: the "compression offload" is not honored in the
default section. This behavior is documented but it slipped from my memory.

What is actually the reason why this is silently ignored when it is set
in the the default section? If someone has a reason to set that in the
default section, why does haproxy ignore it intentionally?

If the support of compression offload shall stay intentionally
unsupported in the default section, then it would be good if this would
trigger a configure check warning, wouldn't it?

Thanks
Björn



Good catch. Indeed, this parameter is ignored in defaults sections. However 
there is no warning and the documentation is not really helpful on this point 
because it is only mentioned as a side note. I guess it may be more visible.


This parameter is not supported in defaults sections because there is no way to 
cancel it in backend sections. I guess we can add a directive to disable the 
compression offloading. This way, we will be able to support "compression 
offload" in defaults section.


--
Christopher Faulet



Re: PH disconnects, but "show errors" has 0 entries ?

2021-10-19 Thread Christopher Faulet

Le 10/19/21 à 16:49, Jim Freeman a écrit :

OK - this is weird (so don't shoot the messenger?).
With more tcpdump-ing and examination, the back-end service logs that it sent a 
response, but

  1) tcpdump running on the haproxy instance never sees the response !
      a) 2 proxies - an AWS ELB and on-instance nginx - lie between HAProxy 
instance and the service
  2) sans any response (and within 0.2 to 13 seconds of the request send), 
HAProxy initiates the PH/500 to the client!


It would make sense to me if any timeouts or disconnects were involved - HAProxy 
would report an [sS][DH] or somesuch.


And reverting the sending of the "content-security-policy: frame-ancestors ..." 
and "x-frame-options: ..." response(!) headers makes the problem disappear 
again.  You'll rightly point out that HTTP/1.1 is stateless, and that the prior 
history of the request/response stream (and response headers sent to the client) 
shouldn't affect the (non-)response to a given request.


Any clues as to how/why the PH/500 might be generated without a response to 
trigger it would be most eagerly received.  While it is entirely likely this 
will wind up being a "nut loose on the keyboard" issue, I just thought I'd share 
my observations and befuddlement ...




First of all, I missed a point. The 2.2.8 is quite old. You must upgrade first. 
Then, have you check the rewrite error counters on your backend ? Because, 
AFAIK, it is the only place where a 500 is possible with the PH termination 
state. If you are using http-after-response rules, it may explain this error.


However, share your redacted configuration too. It can help me to explain what 
you observe.


--
Christopher Faulet



Re: PH disconnects, but "show errors" has 0 entries ?

2021-10-19 Thread Christopher Faulet

Le 10/13/21 à 8:30 PM, Jim Freeman a écrit :
In adding a couple of new security response headers via haproxy.cfg (one is 112 
bytes, the other 32), a few requests are now getting 500 status (PH session 
state) responses, but "show errors" has 0 entries?  Most responses succeed (all 
have the additional headers), so it's not a problem with the new headers themselves.


If haproxy generates a PH/500, shouldn't "show errors" show details of the 
offending response ?


Thanks,
...jfree
==
# echo "show info" | socat stdio /run/haproxy/stats.sock | grep ^Version:
Version: 2.2.8-1~bpo10+1

#  echo "show errors -1" | socat - /run/haproxy/stats.sock
Total events captured on [13/Oct/2021:18:24:15.819] : 0

# cat /etc/debian_version
10.11


Hi,

Only parsing errors are reported by "show errors" command. Here PH/500 error is 
most probably due to a header rewrite error. I have not deeply checked however. 
You can verify my assumption by checking the "wrew" counter in "show stats" 
command output on the stats socket.


Header rewrite errors are triggered when there is not enough space in the buffer 
to perform the rewrites. By default, 1024 Bytes are reserved in the buffer, to 
be sure to have enough space to perform some rewrites. If you add many headers 
in the response, it may be the problem. You can increase the reserve by setting 
"tune.maxrewrite" global parameter.


When such error is encountered, HAProxy returns a 500-Internal-Error response. 
You can change that to make it fails silently. To do so, take a look at the 
"strict-mode" http-response action.


--
Christopher Faulet



Re: compression offload and http2

2021-10-19 Thread Christopher Faulet

Le 10/15/21 à 12:04 PM, Björn Jacke a écrit :

On 15.10.21 10:10, Christopher Faulet wrote:


It should work. What is your HAProxy version ?


2.4.7

Björn



Sorry Björn, I missed your reply. It is strange, there is no known bug in this 
area for now. There is probably something in the request or response headers 
preventing the compression to be enabled. If it is easy to reproduce on a test 
platform, you may try to start HAproxy in foreground in debug mode (-db -d on 
the command line). This will print the request and the response on stderr.


--
Christopher Faulet



Re: compression offload and http2

2021-10-15 Thread Christopher Faulet

Le 10/15/21 à 12:47 AM, Björn Jacke a écrit :

Hi,

I noticed that the compression offload feature is not working with
backends using h2. I couldn't find any note in the documentation that
the compression offload feature is limited to http 1 only. Is it a bug
that it doesn't work with http2 or is it by design and just the
documentation might need some clarification here.


Hi,

It should work. What is your HAProxy version ?


--
Christopher Faulet



Re: [PATCH] BUG/MINOR: lua: Fix lua error handling in `hlua_config_prepend_path()`

2021-10-12 Thread Christopher Faulet

Le 10/11/21 à 6:51 PM, Tim Duesterhus a écrit :

Set an `lua_atpanic()` handler before calling `hlua_prepend_path()` in
`hlua_config_prepend_path()`.

This prevents the process from abort()ing when `hlua_prepend_path()` fails
for some reason.

see GitHub Issue #1409

This is a very minor issue that can't happen in practice. No backport needed.
---
  src/hlua.c | 27 +--
  1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/src/hlua.c b/src/hlua.c
index b03d8bb46..1dbaaa3e0 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -11167,8 +11167,31 @@ static int hlua_config_prepend_path(char **args, int 
section_type, struct proxy
}
LIST_APPEND(_path_list, >l);
  
-	hlua_prepend_path(hlua_states[0], type, path);

-   hlua_prepend_path(hlua_states[1], type, path);
+   /* Handle the global state and the per-thread state for the first
+* thread. The remaining threads will be initialized based on
+* prepend_path_list.
+*/
+   for (size_t i = 0; i < 2; i++) {
+   lua_State *L = hlua_states[i];
+   const char *error;
+
+   if (setjmp(safe_ljmp_env) != 0) {
+   lua_atpanic(L, hlua_panic_safe);
+   if (lua_type(L, -1) == LUA_TSTRING)
+   error = lua_tostring(L, -1);
+   else
+   error = "critical error";
+   fprintf(stderr, "lua-prepend-path: %s.\n", error);
+   exit(1);
+   } else {
+   lua_atpanic(L, hlua_panic_ljmp);
+   }
+
+   hlua_prepend_path(L, type, path);
+
+   lua_atpanic(L, hlua_panic_safe);
+   }
+
return 0;
  
  err2:




Thanks Tim, now merged !

--
Christopher Faulet



Re: Use variables in healthchecks

2021-10-08 Thread Christopher Faulet

Le 10/7/21 à 11:59 AM, Basti Schubert a écrit :

Hi there,

i'd like to use variables in healtchecks to set the host header to the
ip of the server that is beeing checked, something like this:


http-check connect port 443 ssl
http-check send meth GET uri /ecp/healthcheck.htm ver HTTP/1.1 hdr host 
$SERVERIP
http-check expect status 200


Problem is: i don't know if this is possible and if there is already a
variable for such configs (can %si from the log format section be used
for this?)

I'm on haproxy 2.2.x



Hi,

It is possible to use sample fetches and log-format variables. Thus, you can 
indeed rely on "%si". But you must use 2.2.14 or newer.


--
Christopher Faulet



[ANNOUNCE] haproxy-2.4.7

2021-10-04 Thread Christopher Faulet

Hi,

HAProxy 2.4.7 was released on 2021/10/04. It added 2 new commits
after version 2.4.5.

No, it is no an error. The 2.4.6 was released few hours ago to fix a
regression of the 2.4.5. But I was a little too enthusiastic because
another bug was reported by Yves Lafon. Thus we decided to not announce the
2.4.6 to immediately release the 2.4.7.

The first regression was introduced by the commit feca2a453 ("BUG/MINOR:
filters: Always set FLT_END analyser when CF_FLT_ANALYZE flag is
set"). Because of a typo in the request analyzers when a filter was attached
to a stream, it was possible to hang on the filter release stage. HTTP
filters (HTTP compression and cache) didn't seem to be affected by the bug
in HTTP mode. However, in TCP mode, data forwarding was blocked until the
client or the server timeout expiration.

The second regression was introduced by the commit db8ba1069 ("BUG/MEDIUM:
http-ana: Reset channels analysers when returning an error"). The request
analyzers were not properly cleared when a redirect rule from a redirect
ruleset (via the "redirect ..." directive) was applied. This means that some
HTTP request processing were possible on an already flushed request channel,
leading to crash.

If you planned to upgrade to the 2.4.5 or the 2.4.6, please use the 2.4.7
instead.

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.4/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.4.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.4.git
   Changelog: http://www.haproxy.org/download/2.4/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Christopher Faulet (2):
  BUG/MEDIUM: http-ana: Clear request analyzers when applying redirect rule
  BUG/MEDIUM: filters: Fix a typo when a filter is attached blocking the 
release

--
Christopher Faulet



Re: [ANNOUNCE] haproxy-2.4.5

2021-10-04 Thread Christopher Faulet

Le 10/4/21 à 11:05 AM, Herzfeld, Patrick a écrit :

Ciao Christopher

I see the new release is already on the haproxy.org website, will there be an 
official release mail to the mailing list or no?



I'm adding the mailing list to notify everyone.

We are hunting another bug. Unfortunately, it was reported after the 2.4.6 was 
released. We try to figure out if it is a major problem or not for now. But it 
crashes HAProxy. So, a 2.4.7 will probably be released very soon.



--
Christopher Faulet



Re: [ANNOUNCE] haproxy-2.4.5

2021-10-03 Thread Christopher Faulet

Le 10/3/21 à 7:46 PM, Matthias Fechner a écrit :

Am 03.10.2021 um 08:53 schrieb Christopher Faulet:



Damned ! You're right...

It is a typo in the commit feca2a453 ("BUG/MINOR: filters: Always set FLT_END 
analyzer when CF_FLT_ANALYZE flag is set"). It also affects the 2.5-DEV. 


thanks a lot Christopher for the quick fix.
Just prepared a new package for FreeBSD and it is working fine.

I will ask the maintainer to include this patch till a new release is out.



Thanks Matthias. FYI, I will emit a new release this morning.


--
Christopher Faulet



Re: [ANNOUNCE] haproxy-2.4.5

2021-10-03 Thread Christopher Faulet

Le 10/3/21 à 9:06 AM, Vincent Bernat a écrit :

  ❦  3 October 2021 08:53 +02, Christopher Faulet:


I will push a fix. As a workaround, you can temporarily disable the HTTP
compression filter.


Will you release 2.4.6 or should we push packages for 2.4.5 with the
patch? For Debian/Ubuntu, I didn't push packages for 2.4.5 yet.



It is probably better to make a new release. At least for everyone relying on 
the sources instead of a distribution package.


--
Christopher Faulet



Re: [ANNOUNCE] haproxy-2.4.5

2021-10-03 Thread Christopher Faulet

Le 10/2/21 à 10:54 AM, Matthias Fechner a écrit :

Am 01.10.2021 um 18:09 schrieb Christopher Faulet:

HAProxy 2.4.5 was released on 2021/10/01. It added 69 new commits
after version 2.4.4.



Damned ! You're right...

It is a typo in the commit feca2a453 ("BUG/MINOR: filters: Always set FLT_END 
analyzer when CF_FLT_ANALYZE flag is set"). It also affects the 2.5-DEV.


The patch is pretty simple:

diff --git a/src/filters.c b/src/filters.c
index 136a3e80b..f64c192bd 100644
--- a/src/filters.c
+++ b/src/filters.c
@@ -475,7 +475,7 @@ flt_stream_start(struct stream *s)
}
if (strm_li(s) && (strm_li(s)->analysers & AN_REQ_FLT_START_FE)) {
s->req.flags |= CF_FLT_ANALYZE;
-   s->req.analysers |= AN_RES_FLT_END;
+   s->req.analysers |= AN_REQ_FLT_END;
}
return 0;
 }


I will push a fix. As a workaround, you can temporarily disable the HTTP 
compression filter.


Thanks for the report !

--
Christopher Faulet



[ANNOUNCE] haproxy-2.4.5

2021-10-01 Thread Christopher Faulet
 http://www.haproxy.org/download/2.4/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Amaury Denoyelle (8):
  BUG/MINOR: connection: prevent null deref on mux cleanup task allocation
  BUILD: ist: prevent gcc11 maybe-uninitialized warning on istalloc
  BUG/MINOR: server: allow 'enable health' only if check configured
  MINOR: server: implement a refcount for dynamic servers
  MINOR: global: define MODE_STOPPING
  BUG/MINOR: server: do not use refcount in free_server in stopping mode
  MINOR: server: return the next srv instance on free_server
  BUG/MINOR: stats: use refcount to protect dynamic server on dump

Christopher Faulet (26):
  MINOR: lua: Add a flag on lua context to know the yield capability at run 
time
  BUG/MINOR: lua: Yield in channel functions only if lua context can yield
  BUG/MINOR: lua: Don't yield in channel.append() and channel.set()
  BUG/MINOR: stream: Don't release a stream if FLT_END is still registered
  BUG/MEDIUM: http-ana: Reset channels analysers when returning an error
  BUG/MINOR: filters: Always set FLT_END analyser when CF_FLT_ANALYZE flag 
is set
  BUG/MINOR: filters: Set right FLT_END analyser depending on channel
  BUG/MEDIUM: mux-h1: Remove "Upgrade:" header for requests with payload
  MINOR: htx: Skip headers with no value when adding a header list to a 
message
  CLEANUP: mux-h1: Remove condition rejecting upgrade requests with payload
  BUG/MEDIUM: stream-int: Don't block SI on a channel policy if EOI is 
reached
  BUG/MAJOR: mux-h1: Don't eval input data if an error was reported
  BUG/MINOR: tcpcheck: Improve LDAP response parsing to fix LDAP check
  BUG/MINOR: h1-htx: Fix a typo when request parser is reset
  BUG/MEDIUM: mux-h1: Adjust conditions to ask more space in the channel 
buffer
  BUG/MEDIUM: stream-int: Notify stream that the mux wants more room to 
xfer data
  BUG/MEDIUM: stream: Stop waiting for more data if SI is blocked on 
RXBLK_ROOM
  MINOR: stream-int: Set CO_RFL transient/persistent flags apart in 
si_cs_rcv()
  MINOR: htx: Add an HTX flag to know when a message is fragmented
  MINOR: htx: Add a function to know if the free space wraps
  BUG/MEDIUM: stream-int: Defrag HTX message in si_cs_recv() if necessary
  MINOR: stream-int: Notify mux when the buffer is not stuck when calling 
rcv_buf
  BUG/MINOR: mux-h1/mux-fcgi: Sanitize TE header to only send "trailers"
  MINOR: arg: Be able to forbid unresolved args when building an argument 
list
  BUG/MINOR: tcpcheck: Don't use arg list for default proxies during parsing
  BUG/MINOR: tcp-rules: Stop content rules eval on read error and 
end-of-input

David Carlier (1):
  BUILD: tools: get the absolute path of the current binary on NetBSD.

Dragan Dosen (2):
  BUG/MINOR: flt-trace: fix an infinite loop when random-parsing is set
  BUG/MINOR: http-ana: increment internal_errors counter on response error

Emeric Brun (1):
  DOC: peers: fix doc "enable" statement on "peers" sections

William Lallemand (3):
  BUG/MINOR: systemd: ExecStartPre must use -Ws
  DOC: management: certificate files must be sanitized before injection
  MINOR: Makefile: add MEMORY_POOLS to the list of DEBUG_xxx options

Willy Tarreau (25):
  BUG/MINOR: compat: make sure __WORDSIZE is always defined
  CLEANUP: pools: factor all malloc_trim() calls into trim_all_pools()
  MINOR: pools: automatically disable malloc_trim() with external allocators
  MINOR: pools: use mallinfo2() when available instead of mallinfo()
  BUG/MINOR: cli/payload: do not search for args inside payload
  BUILD: activity: use #ifdef not #if on USE_MEMORY_PROFILING
  BUILD/MINOR: defaults: eliminate warning on MAXHOSTNAMELEN with -Wundef
  BUILD/MINOR: ssl: avoid a build warning on LIBRESSL_VERSION with -Wundef
  IMPORT: slz: silence a build warning with -Wundef
  BUILD/MINOR: regex: avoid a build warning on USE_PCRE2 with -Wundef
  BUILD: ssl: next round of build warnings on LIBRESSL_VERSION_NUMBER
  BUILD: ssl: fix two remaining occurrences of #if USE_OPENSSL
  BUILD: tools: properly guard __GLIBC__ with defined()
  BUG/MINOR: vars: improve accuracy of the rules used to check expression 
validity
  MINOR: sample: add missing ARGC_ entries
  BUG/MINOR: vars: properly set the argument parsing context in the 
expression
  BUG/MINOR: vars: truncate the variable name in error reports about scope.
  BUG/MINOR: vars: do not talk about global section in CLI errors for 
set-var
  BUILD: compiler: fixed a missing test on  defined(__GNUC__)
  BUILD: halog: fix a -Wundef warning on non-glibc systems
  BUILD: threads: fix -Wundef for _POSIX_PRIORITY_SCHEDULING on libmusl
  BUG/MEDIUM: leastconn: fix rare possibility of divide by zero
  

Re: AW: Disabling HTTP/1.1 pipelining

2021-09-24 Thread Christopher Faulet

Le 9/21/21 à 6:00 PM, Stefan Behte a écrit :

Hi Christopher,

thank you for the hint, I'm aware of the different ways to mitigate DDoS with 
rate limits etc., I was just curious about the pipelining vector. :)

http://www.haproxy.org/download/2.4/doc/configuration.txt says:

" By default HAProxy operates in keep-alive mode with regards to persistent
   connections: for each connection it processes each request and response, and
   leaves the connection idle on both sides between the end of a response and
   the start of a new request. This mode may be changed by several options such
   as "option http-server-close" or "option httpclose". Setting "option
   http-server-close" enables HTTP connection-close mode on the server side
   while keeping the ability to support HTTP keep-alive and pipelining on the
   client side."

"1.1. The HTTP transaction model" and " timeout http-keep-alive" also mention 
pipelining.



Section 1.1 mainly describes generalities about the HTTP protocol. Only the end 
of the section is focused on HAProxy and it is specified it only supports 
keep-alive mode, not the pipelining.  However, I agree it is pretty confusing 
because pipelining is mentioned in "option http-server-close" and "timeout 
http-keep-alive" descriptions.


In fact, the ambiguities comes from the fact that HAProxy does not performed any 
HTTP pipelining. But the client is free to send several requests in same time. 
No error will be triggered. However, the requests will be processed the one 
after the other. Thus, HAProxy does not perform any HTTP pipelining but it does 
not forbid it.



So I guess I did just misunderstand the documentation and it would be nice to 
just clarify it in the docs that haproxy does not support HTTP/1.1 pipelining.


I agree. Pipelining should at least be removed from "option http-server-close" 
description. And section 1.1 should be reword to be clear on this point.



--
Christopher Faulet



Re: Disabling HTTP/1.1 pipelining

2021-09-20 Thread Christopher Faulet

Le 9/17/21 à 1:20 PM, Stefan Behte a écrit :

Hi everyone,

surely many on this list have heard about the meris botnet 
(https://krebsonsecurity.com/2021/09/krebsonsecurity-hit-by-huge-new-iot-botnet-meris/) 
which uses HTTP/1.1 pipelining for layer 7 attacks.


As far as I can see, it's not possible to disallow HTTP pipelining in haproxy, 
so the best possibility could be "option httpclose"?


Of course, this does not solve everything when a ~100k botnet is attacking, but 
it could ease the initial load / mitigate the pipelining vector a bit, as the 
attack clients have longer RTT.


Or maybe I am missing something?


Hi,

HAproxy does not support HTTP pipelining. But it may be configured to mitigate 
ddos attack. There are several mechanisms that you can use, depending on your 
applications. A quick search on the net about "haproxy ddos prevention" will 
give you several hints.


Regards,
--
Christopher Faulet



Re: [PATCH 1/2] CLEANUP: Include check.h in flt_spoe.c

2021-09-20 Thread Christopher Faulet

Le 9/16/21 à 5:38 PM, Tim Duesterhus a écrit :

This is required for the prototype of spoe_prepare_healthcheck_request().
---
  src/flt_spoe.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/flt_spoe.c b/src/flt_spoe.c
index 28a5a24f8..aeb68c889 100644
--- a/src/flt_spoe.c
+++ b/src/flt_spoe.c
@@ -17,6 +17,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 



Thanks Tim! Both patches merged now.

--
Christopher Faulet



Re: Is it possible to capture the body of http responses?

2021-09-15 Thread Christopher Faulet

Le 9/14/21 à 3:14 AM, Ryan Burn a écrit :



On Thu, Sep 9, 2021 at 12:22 AM Christopher Faulet <mailto:cfau...@haproxy.com>> wrote:


Le 8/11/21 à 2:53 AM, Ryan Burn a écrit :
 > I'm working on integrating HAProxy with traceable.ai
<http://traceable.ai> <http://traceable.ai <http://traceable.ai>>'s
 > security product.
 >
 > As part of the integration, we'd like to capture the contents of any http
 > responses processed by HAProxy and send them to a service either via SPOA
or an
 > RPC call from Lua. The response contents are used by the product to help
 > identify possible security threats.
 >
 > I've tried a few things, but haven't found a reliable way to capture the
 > contents of response bodies. Is this possible with HAProxy?
 >
 > Here are the approaches I've explored so far:
 >
 > 1. I used the "res.body" fetch but that only provides the contents
sometimes (I
 > presume if it's available in a buffer):
 >

https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/extcap.conf#L19

<https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/extcap.conf#L19>

 >

<https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/extcap.conf#L19

<https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/extcap.conf#L19>>
 >
 > 2. I also tried accessing the contents of the response channel from a Lua
 > action, but that fails with "Cannot manipulate HAProxy channels in HTTP 
mode"
 >

https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/response.lua#L5

<https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/response.lua#L5>

 >

<https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/response.lua#L5

<https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/response.lua#L5>>


About the sample fetches, on HAProxy 2.3 and lower, there is no way to get 
the
response payload because it is not possible to wait for it. There is no
equivalent to the "http-buffer-request" option on the response side. On
HAProxy-2.4, it is possible by using "wait-for-body" HTTP rule, available on
the
request and the response side. However, it is still limited by the buffer 
size.


Thanks Christopher! Do you know how to access the response body from a SPOA if 
you add the "wait-for-body"?


I added the wait-for-proxy rules to my example project, but the "res.body" 
argument still doesn't consistently provide the full body.
https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/haproxy.cfg#L15-L16 
<https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/haproxy.cfg#L15-L16>


I've checked your configuration and your SPOE message is sent on the 
'on-http-response' event. This event is triggered before 'http-response' ruleset 
evaluation. Thus the 'wait-for-body' action is not performed yet at this stage. 
Here, you should use a SPOE group and send it using 'send-spoe-group' action. 
The same should be done on the request side.


--
Christopher Faulet



Re: Is it possible to capture the body of http responses?

2021-09-09 Thread Christopher Faulet

Le 8/11/21 à 2:53 AM, Ryan Burn a écrit :
I'm working on integrating HAProxy with traceable.ai <http://traceable.ai>'s 
security product.


As part of the integration, we'd like to capture the contents of any http 
responses processed by HAProxy and send them to a service either via SPOA or an 
RPC call from Lua. The response contents are used by the product to help 
identify possible security threats.


I've tried a few things, but haven't found a reliable way to capture the 
contents of response bodies. Is this possible with HAProxy?


Here are the approaches I've explored so far:

1. I used the "res.body" fetch but that only provides the contents sometimes (I 
presume if it's available in a buffer):
https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/extcap.conf#L19 
<https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/extcap.conf#L19>


2. I also tried accessing the contents of the response channel from a Lua 
action, but that fails with "Cannot manipulate HAProxy channels in HTTP mode"
https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/response.lua#L5 
<https://github.com/rnburn/haproxy-extcap/blob/master/test/docker/response.lua#L5>


Hi Ryan,

About the SPOE, it is on the todo-list to add streaming feature to be able to 
send request/response payload to an agent and be able to rewrite it. 
Unfortunately, to make it clean and usable, it requires a total refactoring and 
for now, I haven't found the time to work on it.


About the sample fetches, on HAProxy 2.3 and lower, there is no way to get the 
response payload because it is not possible to wait for it. There is no 
equivalent to the "http-buffer-request" option on the response side. On 
HAProxy-2.4, it is possible by using "wait-for-body" HTTP rule, available on the 
request and the response side. However, it is still limited by the buffer size.


With the Lua, it is only possible by writing a filter using the experimental Lua 
filter API, available in HAProxy 2.5. This API is really experimental for now, 
but it may be a solution to analyze the whole message payload, regardless its 
size. However, It may be painful because the API may be incomplete and because 
dealing with multiple buffers is not simple, especially if you don't want to 
forward the payload before the end of analysis.


--
Christopher Faulet



Re: http-request return bytes_read from v2.3 to v2.4

2021-07-26 Thread Christopher Faulet

Le 7/25/21 à 1:26 PM, William Dauchy a écrit :

Hello,

While upgrading from v2.3.x to v2.4.x I noticed a difference of bytes
read in requests http logs with this simple frontend config:

frontend foo
 bind 127.0.0.1:8000
 http-request return

in v2.3, the logs (%B) tells me 54 bytes, in v2.4, I am getting 49. So
a difference of 5 bytes per request.

I was simply curious to understand that behavior change. Is it a
change in the way we count bytes in stats, or is there really a
difference in the payload we write?
When I look at a dump, the TCP payloads are identical with 57 bytes in
both versions. And the overall length of the packet is 123 bytes.
Maybe that's expected, but I wanted some clarification.



Hi William,

The response size reported in the log messages is related to the internal 
representation of the HTTP message. It is a limitation of the current design. It 
means this size is not equal to the raw size of the message and may change when 
the HTX representation changes. In this case, in 2.4, the start-line 
representation was changed, a 32-bits integer was removed from hx_sl structure. 
In addition, the end-of-message block (EOM) was removed from the HTX 
representation. It counted for 1 byte. This explains the difference of 5 bytes 
between 2.3 and 2.4.


--
Christopher Faulet



[ANNOUNCE] haproxy-2.0.23

2021-07-16 Thread Christopher Faulet

Hi,

HAProxy 2.0.23 was released on 2021/07/16. It added 97 new commits
after version 2.0.22.

All commits were already mentioned in announcements of the last 2.3 and 2.2
releases. So I will be brief. Here is the list of main bugs fixed by this
release:

  * An issue in the CONTINUATION frames parsing in h2 leading to spurious
wakeups.

  * A bug in the shutdowns detection for h2 connections.

  * A possible deadlock if "set maxconn server" command was used when there
was a pending connection ready to be dequeued.

  * A bug in the HTX defragmentation leading to crash. The bug might be
encountered in the HTTP compression filter or in HTTP header
replacement.

  * An old bug preventing the dequeuing for servers with a very low maxconn
because the load balancing was not skipped when a new connection was
picked from the proxy's or server's queue.

  * A bug in the sock part leading to high CPU usage because some early
connection failures might be missed.

  * A thread-safety issue with the SHCTX code when compiled with
USE_PRIVATE_CACHE mode. It was not using any locks.

  * An issue with the abortonclose option. It was not working since a while.

  * A bug in the HTTP compression leading to truncated or corrupted
responses.

  * "url_ip"/"url_port" sample fetches not properly handling url parsing
errors.

  * A bug in the cpu-map notation when both processes and threads were
specified, most specifically P-Q/1 or 1/P-Q notation.

  * Some issues affecting the peers synchronization.

  * H1 idle connections on server side receiving data not closed. Because of
this bug, it was possible to send pending 408-Request-time-out responses
to clients.

  * Wrong number of retries reported in logs if no connection was
attempted. Since the beginning, when the session was aborted before any
connection attempt to any server, the backend retries value was
reported, instead of 0.

  * A bug with the method sample fetch when an exotic method is found.

Note that resolvers performance issues fixed in upper versions were not
fixed in this one because the code is too different. This part is a
perpetual source of bugs. It was too risky to backport the fixes. If you
experience any issue with the resolvers, please consider to upgrade to a
newer version.

Thanks everyone for your help and your contributions!

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.0/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.0.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.0.git
   Changelog: http://www.haproxy.org/download/2.0/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Amaury Denoyelle (6):
  BUG/MINOR: server: free srv.lb_nodes in free_server
  BUG/MEDIUM: config: fix cpu-map notation with both process and threads
  BUG/MINOR: http_fetch: fix possible uninit sockaddr in fetch_url_ip/port
  BUG/MAJOR: server: prevent deadlock when using 'set maxconn server'
  BUG/MINOR: stick-table: insert srv in used_name tree even with fixed id
  BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check

Christopher Faulet (34):
  DOC: Explicitly state only IPv4 are supported by forwardfor/originalto 
options
  BUG/MEDIUM: threads: Ignore current thread to end its harmless period
  BUG/MINOR: http-fetch: Make method smp safe if headers were already 
forwarded
  BUG/MINOR: http_htx: Remove BUG_ON() from http_get_stline() function
  BUG/MINOR: logs: Report the true number of retries if there was no 
connection
  BUG/MINOR: mux-h1: Release idle server H1 connection if data are received
  BUG/MAJOR: mux-h2: Properly detect too large frames when decoding headers
  BUG/MEDIUM: mux-h2: Fix dfl calculation when merging CONTINUATION frames
  BUG/MEDIUM: mux-h2: Properly handle shutdowns when received with data
  BUG/MINOR: htx: Preserve HTX flags when draining data from an HTX message
  BUG/MINOR: applet: Notify the other side if data were consumed by an 
applet
  BUG/MINOR: hlua: Don't rely on top of the stack when using Lua buffers
  BUG/MINOR: stream: Decrement server current session counter on L7 retry
  BUG/MINOR: stream: Reset stream final state and si error type on L7 retry
  MINOR: channel: Rely on HTX version if appropriate in channel_may_recv()
  BUG/MINOR: stream-int: Don't block reads in si_update_rx() if chn may 
receive
  MEDIUM: mux-h1: Don't block reads when waiting for the other side
  REGTESTS: Add script to test abortonclose option
  BUG/MEDIUM: fi

[ANNOUNCE] haproxy-2.2.15

2021-07-16 Thread Christopher Faulet

Hi,

HAProxy 2.2.15 was released on 2021/07/16. It added 90 new commits
after version 2.2.14.

This release is very similar to the 2.3.11/2.3.12. To sum up, most
noticeable bugs fixed in this release are:

  * A possible deadlock if "set maxconn server" command was used when there
was a pending connection ready to be dequeued.

  * A possible infinite loop in process_stream() when a connection error was
reported while the stream was waiting for a retry.

  * A possible race between free() and pool_alloc() in the pools lockless
variant.

  * A bug in the HTX defragmentation leading to crash. The bug might be
encountered in the HTTP compression filter or in HTTP header
replacement.

  * An old bug preventing the dequeuing for servers with a very low maxconn
because the load balancing was not skipped when a new connection was
picked from the proxy's or server's queue.

  * A bug in the sock part leading to high CPU usage because some early
connection failures might be missed.

  * A thread-safety issue with the SHCTX code when compiled with
USE_PRIVATE_CACHE mode. It was not using any locks.

  * Most of resolvers performance issues and several other bugs in this area.

  * An issue with the abortonclose option. It was not working since a while.

  * A bug in the HTTP compression leading to truncated or corrupted
responses.

  * A bug with synchronous connect in tcpcheck when several connections come
one after the other.

  * "url_ip"/"url_port" sample fetches not properly handling url parsing
errors.

In addition, the http-ignore-probes is now respected for H2
connections. When this option is set, no errors are reported anymore when
connections are aborted during preface. And the FCGI multiplexer was
slightly improved to send a relative path instead of a normalized URI to an
application and to expose SERVER_SOFTWARE parameter by default. Finally, as
a consequence of the bug fixed in the pools, the code was simplified. The
lockless implementation is used everywhere, resulting in the removal of the
very old locked implementation that was kept for non-capable
architectures. As a result, threads will now be faster on less common
architectures (e.g. i686, MIPS, PPC64, ...). The rest is less visible but
contains, as usual, cleanups, small fixes here and there, improvements...

It is strongly advised to update to this version. Thanks everyone for your
help and your contributions!

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.2/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.2.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.2.git
   Changelog: http://www.haproxy.org/download/2.2/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Alex (1):
  DOC: use the req.ssl_sni in examples

Alexandar Lazic (1):
  DOC/MINOR: move uuid in the configuration to the right alphabetical order

Amaury Denoyelle (5):
  BUG/MINOR: http_fetch: fix possible uninit sockaddr in fetch_url_ip/port
  BUG/MAJOR: server: prevent deadlock when using 'set maxconn server'
  BUG/MINOR: stick-table: insert srv in used_name tree even with fixed id
  BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check
  REGTESTS: fix maxconn update with agent-check

Christopher Faulet (35):
  BUG/MINOR: hlua: Don't rely on top of the stack when using Lua buffers
  BUG/MINOR: stream: Decrement server current session counter on L7 retry
  BUG/MINOR: stream: Reset stream final state and si error type on L7 retry
  BUG/MINOR: checks: Handle synchronous connect when a tcpcheck is started
  BUG/MINOR: checks: Reschedule check on observe mode only if fastinter is 
set
  MINOR: channel: Rely on HTX version if appropriate in channel_may_recv()
  BUG/MINOR: stream-int: Don't block reads in si_update_rx() if chn may 
receive
  MINOR: conn-stream: Force mux to wait for read events if abortonclose is 
set
  MEDIUM: mux-h1: Don't block reads when waiting for the other side
  BUG/MEDIUM: mux-h1: Properly report client close if abortonclose option 
is set
  REGTESTS: Add script to test abortonclose option
  BUG/MEDIUM: filters: Exec pre/post analysers only one time per filter
  BUG/MINOR: http-comp: Preserve HTTP_MSGF_COMPRESSIONG flag on the response
  BUG/MINOR: http-ana: Handle L7 retries on refused early data before K/A 
aborts
  BUG/MAJOR: stream-int: Release SI endpoint on server side ASAP on retry
  BUG/MEDIUM: compression: Add a flag to know the filter is still 
processing data
  BUG/MAJ

[ANNOUNCE] haproxy-2.3.11

2021-07-07 Thread Christopher Faulet
AJOR: server: fix deadlock when changing maxconn via agent-check
  REGTESTS: fix maxconn update with agent-check

Christopher Faulet (39):
  BUG/MINOR: mux-fcgi: Don't send normalized uri to FCGI application
  BUG/MINOR: htx: Preserve HTX flags when draining data from an HTX message
  BUG/MINOR: applet: Notify the other side if data were consumed by an 
applet
  BUG/MINOR: hlua: Don't rely on top of the stack when using Lua buffers
  BUG/MINOR: stream: Decrement server current session counter on L7 retry
  BUG/MINOR: stream: Reset stream final state and si error type on L7 retry
  BUG/MINOR: checks: Handle synchronous connect when a tcpcheck is started
  BUG/MINOR: checks: Reschedule check on observe mode only if fastinter is 
set
  MINOR: channel: Rely on HTX version if appropriate in channel_may_recv()
  BUG/MINOR: stream-int: Don't block reads in si_update_rx() if chn may 
receive
  MINOR: conn-stream: Force mux to wait for read events if abortonclose is 
set
  MEDIUM: mux-h1: Don't block reads when waiting for the other side
  BUG/MEDIUM: mux-h1: Properly report client close if abortonclose option 
is set
  REGTESTS: Add script to test abortonclose option
  BUG/MEDIUM: filters: Exec pre/post analysers only one time per filter
  BUG/MINOR: http-comp: Preserve HTTP_MSGF_COMPRESSIONG flag on the response
  BUG/MINOR: http-ana: Handle L7 retries on refused early data before K/A 
aborts
  BUG/MAJOR: stream-int: Release SI endpoint on server side ASAP on retry
  BUG/MEDIUM: compression: Add a flag to know the filter is still 
processing data
  BUG/MAJOR: htx: Fix htx_defrag() when an HTX block is expanded
  BUG/MINOR: mux-fcgi: Expose SERVER_SOFTWARE parameter by default
  DOC: lua: Add a warning about buffers modification in HTTP
  BUILD: cfgparse-ssl: Remove const from defpx param in keylog parsing 
function
  BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI
  BUG/MINOR: server/cli: Fix locking in function processing "set server" 
command
  MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" rules
  DOC: config: Add missing actions in "tcp-request session" documentation
  BUG/MINOR: tcpcheck: Fix numbering of implicit HTTP send/expect rules
  BUG/MINOR: server-state: load SRV resolution only if params match the 
config
  BUG/MINOR: server: Forbid to set fqdn on the CLI if SRV resolution is 
enabled
  MINOR: resolvers: Clean server in a dedicated function when removing a 
SRV item
  MINOR: resolvers: Remove server from named_servers tree when removing a 
SRV item
  BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution 
status
  BUG/MINOR: resolvers: Use resolver's lock in resolv_srvrq_expire_task()
  BUG/MINOR: resolvers: Always attach server on matching record on 
resolution
  BUG/MINOR: resolvers: Reset server IP when no ip is found in the response
  MINOR: resolvers: Reset server IP on error in 
resolv_get_ip_from_response()
  BUG/MEDIUM: resolvers: Make 1st server of a template take part to SRV 
resolution
  Revert "MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" 
rules"

Daniel Black (1):
  DOC: config: use CREATE USER for mysql-check

Dirkjan Bussink (1):
  BUG/MINOR: checks: return correct error code for srv_parse_agent_check

Emeric Brun (17):
  BUG/MEDIUM: peers: initialize resync timer to get an initial full resync
  BUG/MEDIUM: peers: register last acked value as origin receiving a resync 
req
  BUG/MEDIUM: peers: stop considering ack messages teaching a full resync
  BUG/MEDIUM: peers: reset starting point if peers appears longly 
disconnected
  BUG/MEDIUM: peers: reset commitupdate value in new conns
  BUG/MEDIUM: peers: re-work updates lookup during the sync on the fly
  BUG/MEDIUM: peers: reset tables stage flags stages on new conns
  MINOR: peers: add informative flags about resync process for debugging
  BUG/MEDIUM: dns: reset file descriptor if send returns an error
  BUG/MEDIUM: dns: send messages on closed/reused fd if fd was detected 
broken
  BUG/MINOR: resolvers: answser item list was randomly purged or errors
  MEDIUM: resolvers: add a ref on server to the used A/ answer item
  MEDIUM: resolvers: add a ref between servers and srv request or used SRV 
record
  BUG/MAJOR: resolvers: segfault using server template without SRV RECORDs
  BUG/MINOR: stick-table: fix several printf sign errors dumping tables
  DOC: stick-table: add missing documentation about gpt0 stored type
  BUG/MINOR: peers: fix data_type bit computation more than 32 data_types

Remi Tricot-Le Breton (15):
  BUG/MEDIUM: ebtree: Invalid read when looking for dup entry
  BUG/MINOR: server: Missing calloc return value check in srv_parse_source
  BUG/MIN

[ANNOUNCE] haproxy-2.4.2

2021-07-07 Thread Christopher Faulet
 http: implement scheme-based normalization
  MEDIUM: h1-htx: apply scheme-based normalization on h1 requests
  MEDIUM: h2: apply scheme-based normalization on h2 requests
  REGTESTS: add http scheme-based normalization test

Christopher Faulet (19):
  BUG/MINOR: server-state: load SRV resolution only if params match the 
config
  BUG/MINOR: server: Forbid to set fqdn on the CLI if SRV resolution is 
enabled
  BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI
  MINOR: resolvers: Clean server in a dedicated function when removing a 
SRV item
  MINOR: resolvers: Remove server from named_servers tree when removing a 
SRV item
  BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution 
status
  BUG/MINOR: resolvers: Use resolver's lock in resolv_srvrq_expire_task()
  BUG/MINOR: server/cli: Fix locking in function processing "set server" 
command
  MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" rules
  DOC: config: Add missing actions in "tcp-request session" documentation
  CLEANUP: dns: Remove a forgotten debug message
  BUG/MINOR: resolvers: Always attach server on matching record on 
resolution
  BUG/MINOR: resolvers: Reset server IP when no ip is found in the response
  MINOR: resolvers: Reset server IP on error in 
resolv_get_ip_from_response()
  BUG/MINOR: tcpcheck: Fix numbering of implicit HTTP send/expect rules
  BUG/MINOR: mqtt: Fix parser for string with more than 127 characters
  BUG/MINOR: mqtt: Support empty client ID in CONNECT message
  BUG/MEDIUM: resolvers: Make 1st server of a template take part to SRV 
resolution
  Revert "MINOR: tcp-act: Add set-src/set-src-port for "tcp-request content" 
rules"

Daniel Black (1):
  DOC: config: use CREATE USER for mysql-check

David Carlier (1):
  BUILD: Makefile: fix linkage for Haiku.

Dirkjan Bussink (1):
  BUG/MINOR: checks: return correct error code for srv_parse_agent_check

Emeric Brun (3):
  BUG/MINOR: stick-table: fix several printf sign errors dumping tables
  BUG/MINOR: peers: fix data_type bit computation more than 32 data_types
  DOC: stick-table: add missing documentation about gpt0 stored type

Tim Duesterhus (1):
  BUG/MINOR: cache: Correctly handle existing-but-empty 'accept-encoding' 
header

Willy Tarreau (2):
  BUG/MEDIUM: sock: make sure to never miss early connection failures
  BUG/MINOR: cli: fix server name output in "show fd"

--
Christopher Faulet



Re: proposed enhancement to mysql-check - accept account locked/password expired errors

2021-07-05 Thread Christopher Faulet

Le 7/1/21 à 7:14 AM, Daniel Black a écrit :

It seems users are still disturbed at creating passwordless users in
mysql for mysql-check.
https://discourse.haproxy.org/t/haproxy-mysql-check-user-removal/6685

I certainly understand not wanting to implement the truly ugly
implementation that is the protocol to support multiple password
mechanisms ( mysql_native_password, mysql_caching_sha256_password
etc).

As a middle ground, would you be willing to take a haproxy patch that:
* accepts an account locked error on the client as a valid state?; and/or
* offer the CLIENT_CAN_HANDLE_EXPIRED_PASSWORDS client capability and
accept the password expired error ( 1820) as a valid state?

This would enable users to deploy a user account for MySQL and MariaDB
that is truly unusable and make the assessment to use this
functionality easier from a risk management point of view.

On the mysql/mariadb server side both of these are Notes like below:

2021-07-01T04:41:03.144498Z 4554 [Note] Your password has expired. To
log in you must change it using a client that supports expired
passwords.
2021-07-01T04:57:30.706593Z 5056 [Note] Access denied for user
'haproxy'@'10.0.2.100'. Account is locked.

This has the consequences that:
* these aren't counted as connection errors therefore reaching the
default 100 max_connect_errors isn't hit because of this.
* dropping MySQL's log_error_verbosity=2 from 3(default) will hide
these informational messages from the logs that would otherwise be
quite noisy while not hiding anything else too significant.

In the same area of code, dropping the MySQL-4.0 protocol support is
probably called for given the MySQL-5.0 was end of support in January
9, 2012 https://endoflife.software/applications/databases/mysql . 4.1
protocol is still in use.



Hi,

First of all, it is a good idea to remove support of "pre-41" option. It is not 
the default option anymore and I guess it is old enough to be removed. It is now 
on my todo list.


About the MySQL check itself and the way it works, I'm open to any suggestion. 
From the check point of view, your proposal seems reasonable. Of course, the 
actual behavior must still be supported (passwordless account). However, from 
the MySQL administration point of view, I don't know if it is acceptable or not. 
I'm not a MySQL admin and I don't know if it is acceptable or not to change the 
log level.


I guess that if no one has a better idea or any objections about this change, 
you can provide a patch.


--
Christopher Faulet



Re: [PATCH] DOC: use CREATE USER for mysql-check

2021-07-05 Thread Christopher Faulet

Le 7/1/21 à 4:09 AM, Daniel Black a écrit :

CREATE USER has been the standard way of creating users since
MySQL-5.0 (2005).

The current syntax of INSERT INTO mysql.user won't actually work
on MariaDB-10.4+.

Because haproxy doesn't use any resources the MySQL executable comment
syntax provides resource contraints to make it more palatable
to risk adverse users.

/*!50701 is a syntax recognised by MySQL and MariaDB 5.7.1+ when
resource contraints where added.

/*M!100201 is a MariaDB executable comment syntax recognised for MariaDB
for the 10.2.1 where the MAX_STATEMENT_TIME was added.

This patch may be backported as far as 2.0.
---
  doc/configuration.txt | 11 ++-
  1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 5d1c97bd3..a9bbf6fa4 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -8982,12 +8982,13 @@ option mysql-check [ user  [ { post-41 | 
pre-41 } ] ]
one Client Authentication packet, and one QUIT packet, to correctly close
MySQL session. We then parse the MySQL Handshake Initialization packet 
and/or
Error packet. It is a basic but useful test which does not produce error nor
-  aborted connect on the server. However, it requires adding an authorization
-  in the MySQL table, like this :
+  aborted connect on the server. However, it requires an unlocked authorised
+  user without a password. To create a basic limited user in MySQL with
+  optional resource limits:
  
-  USE mysql;

-  INSERT INTO user (Host,User) values ('','');
-  FLUSH PRIVILEGES;
+  CREATE USER ''@''
+  /*!50701 WITH MAX_QUERIES_PER_HOUR 1 MAX_UPDATES_PER_HOUR 0 */
+  /*M!100201 MAX_STATEMENT_TIME 0.0001 */;
  
If you don't specify a username (it is deprecated and not recommended), the

check only consists in parsing the Mysql Handshake Initialization packet or



Merged now, thanks !

--
Christopher Faulet



Re: [PATCH]: BUILD/MEDIUM: set-mark openbsd support

2021-07-05 Thread Christopher Faulet

Le 7/3/21 à 10:22 AM, David CARLIER a écrit :

Hi here a follow-up of the previous patch but this time for OpenBSD.



Thanks, applied now !

--
Christopher Faulet



Re: Weird behavior of spoe between http and https requests

2021-06-14 Thread Christopher Faulet
 16:01:01 Msg Count :5:
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Name  
:my_path:
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Value 
:/test:
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Name  
:my_src:
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Value 
::
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Name  
:my_referer:
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Value 
:%!s():
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Name  
:my_sid:
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Value :11:
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Name  
:my_req_host:
Jun 11 16:01:01 reggata-001 spoe-url[112969]: 2021/06/11 16:01:01 Arg Value 
::
```

But when I make a https request I get only the path.

```
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Msg Name  
:agent-on-http-req:
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Msg Count :5:
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Name  
:my_path:
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Value 
:/test:
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Name  
:my_src:
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Value 
:0.0.0.0:
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Name  ::
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Value 
:%!s():
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Name  ::
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Value 
:%!s():
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Name  ::
Jun 11 15:55:32 reggata-001 spoe-url[112869]: 2021/06/11 15:55:32 Arg Value 
:%!s():
```

Please can somebody tell me what's my mistake, thank you?


The problem can be easily reproduces when the bind lines is replaces with '::'

  Working: *:80
Not Working: :::80

Then works also the HTTPS part.

It looks like that '*:80' goes different Way then ':::80'



Hi Alex,

I'm unable to reproduce the issue. Everything works as expected, with all 
combinations of HTTP/HTTPS and IPv4/IPv6. It may be an issue with your agent. 
"my_src" value is displayed as an IPv4 while it should be an IPv6. Could you 
check your agent is properly decoding IPv6 values ?


You may also try to do a network capture between HAProxy and your agent.

--
Christopher Faulet



Re: [PATCH] DOC/MINOR: move uuid in the configuration to the right, alphabetical order

2021-06-02 Thread Christopher Faulet

Le 6/1/21 à 12:35 AM, Aleksandar Lazic a écrit :

Fix alphabetical order of uuid



merged now, thanks !

--
Christopher Faulet



Re: Maybe stupid question but, I don't see a fetch method for %rt => StreamID

2021-06-02 Thread Christopher Faulet

Le 6/1/21 à 8:26 PM, Aleksandar Lazic a écrit :

On 01.06.21 14:23, Tim Düsterhus wrote:

Aleks,

On 6/1/21 10:30 AM, Aleksandar Lazic wrote:

This phrasing is understandable to me, but now I'm wondering if this is the 
best solution. Maybe the already existing user-configurable unique request ID 
should instead be sent to the SPOE and then logged?

https://cbonte.github.io/haproxy-dconv/2.2/configuration.html#7.3.6-unique-id

The request_counter (%rt) you mentioned could be embedded into this unique-id.


Well this uniqe-id is not send as Stream ID to SPOA receiver, due to this fact 
can't you debug which stream is the troubled one.


Yes, that's why I suggested that the SPOE is extended to also include this 
specific ID somewhere (just) for logging purposes.


Yep.
Any opinion from the other community Members?



The SID provided in the SPOE log message is the one used in the SPOP frame 
header. This way it is possible to match a corresponding log message emitted by 
the agent.


Regarding the format for this log message, its original purpose was to diagnose 
problems. Instead of adding custom information, I guess the best would be to 
have a "log-format" directive. At least to not break existing tools parsing 
those log messages. But to do so, all part of the current message must be 
available via log variables and/or sample fetches. And, at first glance, it will 
be hard to achieve (sample fetches are probably easier though).


Regarding the stream_uniq_id sample fetch, it is a good idea to add it. In fact, 
when it makes sense, a log variable must also be accessible via a sample fetch. 
Tim's remarks about the patch are valid. For the scope, INTRN or L4CLI, I don't 
know. I'm inclined to choose INTRN.


--
Christopher Faulet



Re: [2.2.11] 100% CPU again

2021-04-29 Thread Christopher Faulet

Le 22/04/2021 à 09:00, Maciej Zdeb a écrit :
śr., 21 kwi 2021 o 13:53 Christopher Faulet <mailto:cfau...@haproxy.com>> napisał(a):


The fix was merge in upstream :

* BUG/MAJOR: mux-h2: Properly detect too large frames when decoding headers
    (http://git.haproxy.org/?p=haproxy.git;a=commit;h=07f88d75
<http://git.haproxy.org/?p=haproxy.git;a=commit;h=07f88d75>)


Thanks! Building 2.2.13 and 2.3.9 with patches. I'll return with feedback if 
anything pops out again.


FYI, the 2.3.10 and 2.2.14 were released, with the fix.

--
Christopher Faulet



[ANNOUNCE] haproxy-2.2.14

2021-04-29 Thread Christopher Faulet

Hi,

HAProxy 2.2.14 was released on 2021/04/29. It added 39 new commits after
version 2.2.13.

On week after the 2.3.10, this release contains more or less the same
fixes. Especially the one about the CONTINUATION frames parsing in the H2
multiplexer leading to spurious wakeups and the other one about H2 shutdowns
detection bug in some cases. Here is list of other fixes, cut/pasted for the
2.3.10 announce:

  * Amaury fixed a bug in the cpu-map notation when both processes and
threads were specified, most specifically P-Q/1 or 1/P-Q notation.

  * The possible deadlock under thread isolation of processes built with
DEBUG_UAF, supposed to be fixed in a previous release was not
totally. It is fixed now.

  * Thayne McCombs fixed the field converter. The size of the sample buffer
was not properly updated. He also fixed the parsing of "us" unit for
timers.

  * fc_http_major and bc_http_major sample fetches were fixed to fail when
called for connections attached to a non-HTX multiplexer. In addition it
is now possible to call them from the health-checks.

  * The support of log-format strings in a health-check context was
improved. It is now possible to use backend and server related formats
(%s, %sc, %b, %bi, %bp, %si and %sp). To make it possible, an identifier
was set on the dummy frontend used for health-checks.

  * Several SSL sample fetches (ssl_bc_*) were fixed to work when called in
a health-check context.

  * H1 idle connections on server side are now closed if some data are
received. This avoids to send pending 408-Request-time-out responses to
clients. Note it is still possible to receive a 408 response from a
server if it is received in same time the request is sent.

  * The true number of retries is now reported in logs if no connection was
attempted. Since the beginning, when the session was aborted before any
connection attempt to any server, the backend retries value was
reported, instead of 0.

  * BUG_ON() statements were removed from http_get_stline() function. This
function already returns NULL when it fails to get the start-line from
an HTX message. There is no reason to restrict it, especially because
there are some valid cases to call it on an empty buffer or after data
forwarding.

  * William fixed two bugs in the master-worker affecting the peers
synchronization. Still about peers, Emeric fixed bugs during reloads
preventing the peers synchronization to work properly.

  * The method sample fetch was fixed to be sure to start-line is still
there when an exotic method is found. For default methods (GET, POST..),
transaction flags are used. But for unknown methods, we get it from the
HTX message. In this case, the start-line must be there.

In addition, because this release was delayed by one week, it contains some
fixes not included in the 2.3.10. Emeric fixed several bugs on the peers to
improve the synchronization process. A bug in lua HTTP applet was also fixed
to be sure to notify the producer side when some data are consumed by the
applet, to not block data receipt. And finally the FCGI multiplexer was
slightly improved to send a relative path instead of a normalized URI to an
application. These fixes were also merged in the 2.3-maint.

As usual, it is strongly encouraged to update to this version. Thanks
everyone for your help and your contributions!

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.2/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.2.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.2.git
   Changelog: http://www.haproxy.org/download/2.2/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Amaury Denoyelle (2):
  BUG/MINOR: server: free srv.lb_nodes in free_server
  BUG/MEDIUM: config: fix cpu-map notation with both process and threads

Christopher Faulet (18):
  DOC: Explicitly state only IPv4 are supported by forwardfor/originalto 
options
  MINOR: No longer rely on deprecated sample fetches for predefined ACLs
  BUG/MEDIUM: threads: Ignore current thread to end its harmless period
  BUG/MINOR: checks: Set missing id to the dummy checks frontend
  MINOR: logs: Add support of checks as session origin to format lf strings
  BUG/MINOR: connection: Fix fc_http_major and bc_http_major for TCP 
connections
  MINOR: connection: Make bc_http_major compatible with tcp-checks
  BUG/MINOR: ssl-samples: Fix ssl_bc_* samples when called from a 
health-check
  BUG/MINOR: http-fetch: Make method smp safe if headers we

Re: [PATCH] DOC: general: fix example in set-timeout

2021-04-28 Thread Christopher Faulet

Le 27/04/2021 à 13:03, Aleksandar Lazic a écrit :

Hi.

attach the fix for set-timeout.



Thanks, now merged !

--
Christopher Faulet



Re: [PATCH] DOC: general: fix white spaces for HTML converter

2021-04-26 Thread Christopher Faulet

Le 24/04/2021 à 13:09, Aleksandar Lazic a écrit :

Hi.

The HTML converter expects some formats to recognize if a keyword is a
keyword.

Regards
alex



Thanks, now merged !

--
Christopher Faulet



Re: [PATCH] typo fixes

2021-04-26 Thread Christopher Faulet

Le 24/04/2021 à 10:28, Илья Шипицин a écrit :

hello,

one more typo fixing.

Ilya


Thanks, now merged !

--
Christopher Faulet



Re: [PATCH 1/4] DOC: Fix indentation for `path-strip-dot` normalizer

2021-04-23 Thread Christopher Faulet

Thanks Tim! The series was merged (but too late for the release :)

--
Christopher Faulet



[ANNOUNCE] haproxy-2.3.10

2021-04-23 Thread Christopher Faulet

Hi,

HAProxy 2.3.10 was released on 2021/04/23. It added 33 new commits after
version 2.3.9.

The most notable fixes of this release are about the H2 multiplexer. Two
bugs were fixed in the CONTINUATION frames parsing. The first one was about
wakeups in loop of the h2 connections if only part of a CONTINUATION frame
header was received into a full demux buffer. In this case, we were unable
to detect the frames were too large to be processed. The H2 connection was
infinitely waiting for more data with a full demux buffer and was woken up
in loops to process already received data. While it is a pretty annoying bug
because it consumes all idle CPU for nothing, it is not a DOS bug because
incoming traffic is still processed. The second bug was about the frames
size calculation when CONTINUATION and HEADERS frames were merged. Only
available data were counted instead of the total frames size. The last bug
on the H2 multiplexer was about the shutdowns detection. When a shutdown was
received along with data, it was ignored. Thus, in this case, the H2
connection remained blocked until its timeout expiration.

Here is the list of other fixes for this release:

  - Amaury fixed a bug in the cpu-map notation when both processes and
threads were specified, most specifically P-Q/1 or 1/P-Q notation.

  - The possible deadlock under thread isolation of processes built with
DEBUG_UAF, supposed to be fixed in a previous release was not
totally. It is fixed now.

  * Thayne McCombs fixed the field converter. The size of the sample buffer
was not properly updated. He also fixed the parsing of "us" unit for
timers.

  * fc_http_major and bc_http_major sample fetches were fixed to fail when
called for connections attached to a non-HTX multiplexer. In addition it
is now possible to call them from the health-checks.

  * The support of log-format strings in a health-check context was
improved. It is now possible to use backend and server related formats
(%s, %sc, %b, %bi, %bp, %si and %sp). To make it possible, an identifier
was set on the dummy frontend used for health-checks.

  * Several SSL sample fetches (ssl_bc_*) were fixed to work when called in
a health-check context.

  * The silent-drop action was fixed to work for IPv6 when permissions are
insufficient.

  * H1 idle connections on server side are now closed if some data are
received. This avoids to send pending 408-Request-time-out responses to
clients. Note it is still possible to receive a 408 response from a
server if it is received in same time the request is sent.

  * The true number of retries is now reported in logs if no connection was
attempted. Since the beginning, when the session was aborted before any
connection attempt to any server, the backend retries value was
reported, instead of 0.

  * BUG_ON() statements were removed from http_get_stline() function. This
function already returns NULL when it fails to get the start-line from
an HTX message. There is no reason to restrict it, especially because
there are some valid cases to call it on an empty buffer or after data
forwarding.

  * IP parsing in hdr_ip() sample fetch is now stricter. It was fixed to
never return invalid IP addresses.

  * William fixed two bugs in the master-worker affecting the peers
synchronization. Still about peers, Emeric fixed bugs during reloads
preventing the peers synchronization to work properly.

  * The update on the CLI of the default SSL certificate used not to work
correctly as the previous one was not removed, resulting in a random
behavior namely on the SNI.

  * The method sample fetch was fixed to be sure to start-line is still
there when an exotic method is found. For default methods (GET, POST..),
transaction flags are used. But for unknown methods, we get it from the
HTX message. In this case, the start-line must be there.

Thanks everyone for your help and your contributions !

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.3/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.3.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.3.git
   Changelog: http://www.haproxy.org/download/2.3/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Amaury Denoyelle (2):
  BUG/MINOR: server: free srv.lb_nodes in free_server
  BUG/MEDIUM: config: fix cpu-map notation with both process and threads

Christopher Faulet (15):
  DOC: Explicitly state only IPv4 are supported by forwardfor/originalto 
options
  MINOR: No longer rely on depreca

Re: [2.2.11] 100% CPU again

2021-04-21 Thread Christopher Faulet

Le 21/04/2021 à 08:48, Maciej Zdeb a écrit :

I'm very happy you managed to reproduce a similar issue! :)


The fix was merge in upstream :

* BUG/MAJOR: mux-h2: Properly detect too large frames when decoding headers
  (http://git.haproxy.org/?p=haproxy.git;a=commit;h=07f88d75)

It has not been backported yet. But it may be manually applied on the 2.3 and 
2.2 if required. If so, optionally, this other one may be applied too :


* BUG/MEDIUM: mux-h2: Fix dfl calculation when merging CONTINUATION frames
  (http://git.haproxy.org/?p=haproxy.git;a=commit;h=cb1847c7)

--
Christopher Faulet



Re: [PATCH 0/3] Add a `strip-dot` normalizer

2021-04-21 Thread Christopher Faulet

Le 21/04/2021 à 00:22, Maximilian Mader a écrit :

Hi Christopher,

Tim and I were talking while he was working on the URI normalizers
and as a fun evening exercise I decided to write a simple normalizer
to remove single dots from the path component.
While looking over his code to get an understanding of how things work
I found a minor bug in his implementation of the query sort normalizer
and included a patch for that as well.

Best regards,



Thanks, The series was merged !

--
Christopher Faulet



Re: [2.2.11] 100% CPU again

2021-04-21 Thread Christopher Faulet

Le 21/04/2021 à 08:48, Maciej Zdeb a écrit :

I'm very happy you managed to reproduce a similar issue! :)

Does it affect 2.3.9? I've experienced 100% cpu on it also. But on 2.3 ALL 
threads loops on _do_poll/epoll_wait and threads did not hang on a particular h2 
session. :( I'll check twice and return in the new thread for 2.3.9, because as 
for now it looks like a different bug.


The bug affects all HAProxy versions supporting CONTINUATION frames. When it 
happens, the buggy H2 sessions are waiting for more data with a full buffer, 
resulting to wakeups in loop. But the threads don't hang. Incoming connections 
are still accepted and processed. But all the CPU is consumed.


However, if you observe loops on _do_poll/epoll_wait, it is probably another 
bug. Because here with a "perf top" on haproxy, you should have H2 functions in 
the top 10.


I'll push my fix this morning. So you may just wait a bit to confirm it is the 
same bug or not.


--
Christopher Faulet



Re: [2.2.11] 100% CPU again

2021-04-20 Thread Christopher Faulet

Le 19/04/2021 à 19:54, Maciej Zdeb a écrit :

Hi,

After a couple weeks running HAProxy 2.2.11, one server started to loop on 
thread 9. If I'm reading this correctly something went wrong on h2c at 
0x7fd7b08d0530.



[...]


### (gdb) p *((struct h2c *)0x7fd7b08d0530)
$1 = {conn = 0x7fd785b87990, st0 = H2_CS_FRAME_P, errcode = H2_ERR_NO_ERROR, 
flags = 8, streams_limit = 100, max_id = 0, rcvd_c = 0, rcvd_s = 0,

   ddht = 0x7fd7b0820d60, dbuf = {size = 16384,
     area = 0x7fd7b1dec0b0 
"¤\226\205\064\f\212jܧâ\201\004A\fN\177jC]t\027\221cÌd°à\033\\+\205µ?¬Ø÷èÏô¥\006êU1\024\235O", 
data = 16384, head = 48},
   dsi = 1, dfl = 16383, dft = 1 '\001', dff = 1 '\001', dpl = 0 '\000', 
last_sid = -1, mbuf = {{size = 32,
       area = 0x2 , data = 1, head = 
1}, {size = 0, area = 0x0, data = 0, head = 0} , {
       size = 0, area = 0x0, data = 0, head = 12}, {size = 1249, area = 
0x7fd7b080 "ðHX²×\177", data = 140564347486336, head = 0}}, msi = -1, mfl = 0,
   mft = 32 ' ', mff = -96 ' ', miw = 65535, mws = 65535, mfs = 16384, timeout = 
2, shut_timeout = 2, nb_streams = 0, nb_cs = 0, nb_reserved = 0,
   stream_cnt = 0, proxy = 0x5557136cbe60, task = 0x7fd792079be0, streams_by_id 
= {b = {0x0, 0x0}}, send_list = {n = 0x7fd7b08d09d8, p = 0x7fd7b08d09d8},
   fctl_list = {n = 0x7fd7b08d09e8, p = 0x7fd7b08d09e8}, blocked_list = {n = 
0x7fd7b08d09f8, p = 0x7fd7b08d09f8}, buf_wait = {target = 0x0, wakeup_cb = 0x0,
     list = {next = 0x7fd7b08d0a18, prev = 0x7fd7b08d0a18}}, wait_event = 
{tasklet = 0x7fd79e235bf0, events = 0}}




Hi Maciej,

I'm able to reproduce a similar bug hacking the nghttp2 client to send at most 
16383 bytes per frame (instead of 16384). By sending too large headers, we are 
falling into a wakeup loop, waiting for more data while the buffer is full.


I have a fix but I must check with Willy how to proceed because I'm not 100% 
sure for now.


--
Christopher Faulet



Re: [PATCH v2 0/8] URI normalization / Issue #714

2021-04-19 Thread Christopher Faulet

Le 17/04/2021 à 13:23, Tim Düsterhus a écrit :

I added the experimental marking in the 3rd patch this morning.

Generally I think that it looks solid enough, though. During development
I carefully researched the relevant documentation (e.g. the URI RFC) and
tested the behavior of different clients and servers. It also comes with
quite a few tests ensuring that the normalizers behave like I expect
them to.

Nonetheless I might have missed something and correct handling of URIs
is a sensitive part of the request handling, so an experimental note
still is appropriate.

All in all: I think that the 8 v2 patches + the 3 patches from this
morning together result in something that is appropriate for HAProxy 2.4.


Tim,

I pushed all the series. Thanks !

--
Christopher Faulet



Re: [PATCH v2 0/8] URI normalization / Issue #714

2021-04-17 Thread Christopher Faulet

Le 16/04/2021 à 20:14, Tim Düsterhus a écrit :

Willy,
Christopher,

On 4/16/21 6:56 PM, Willy Tarreau wrote:

Thanks Tim,

The series looks good to me. Except if Willy has any comments, I'll merge it 
soon.


No need for me to double-check, I'll trust you (and you know I like
to complain afterwards :-))


I'll probably complain myself afterwards as well :-)

As a heads up I'll definitely add a few more converters (most notably
the percent decoder is missing) and maybe I'll rename the existing ones
in the configuration for consistency once I finish this up.

So: It should definitely considered *EXPERIMENTAL* for 2.4.



Hi Tim,

I'm a bit annoyed by the renaming of existing normalizers. Except if you plan to 
do your modification before the release, it is a bad idea to change the name of 
 a configuration parameter once introduced in a stable release. The feature 
itself may be experimental because some bugs are expected but from the 
configuration point of view, it should be stable.


However, if you want to wait a bit to finish your work, I can push your patches 
in the next branch, pending for the next 2.5. This way, you'll have all the time 
to modify it. And because it is a standalone feature, we may plan to backport it 
to 2.4 if necessary.


--
Christopher Faulet



Re: [PATCH v2 0/8] URI normalization / Issue #714

2021-04-16 Thread Christopher Faulet

Le 15/04/2021 à 21:45, Tim Duesterhus a écrit :

Christopher,

again: Thank you for the very helpful review. In this series you can find v2 of
my URI normalization patches. I hope I did not forget to incorporate any of
your feedback.

Some general notes:

- I completely cleaned up the commit history to group similar patches (e.g. the
   two patches for dotdot) and to avoid later commits completely refactoring
   earlier commits (e.g. the error handling).
- As suggested I am now returning the error code and taking a `struct ist *dst`.
- The values of `enum uri_normalizer_err` are cleaned up.
- I cleaned up the error handling in `http_action_normalize_uri`.
- I am now using `istdiff()`.
- Dynamic allocation is no more.
- I fixed some parts of the code style (`struct ist* foo` -> `struct ist *foo`).
- I const'ified as much as possible.



Thanks Tim,

The series looks good to me. Except if Willy has any comments, I'll merge it 
soon.

--
Christopher Faulet



Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-14 Thread Christopher Faulet

Le 10/04/2021 à 00:34, Robin H. Johnson a écrit :

On Fri, Apr 09, 2021 at 10:14:26PM +0200, Christopher Faulet wrote:

It seems you have a blocking call in one of your lua script. The threads dump
shows many threads blocked in hlua_ctx_init. Many others are executing lua.
Unfortunately, for a unknown reason, there is no stack traceback.

All of our Lua is string handling. Parsing headers, and then building
TXN variables as well as a set of applets that return responses in cases
where we don't want to go to a backend server as the response is simple
enough to generate inside the LB.


For the 2.3 and prior, the lua scripts are executed under a global lock. Thus
blocking calls in a lua script are awful, because it does not block only one
thread but all others too. I guess the same issue exists on the 1.8, but there
is no watchdog on this version. Thus, time to time HAProxy hangs and may report
huge latencies but, at the end it recovers and continues to process data. It is
exactly the purpose of the watchdog, reporting hidden bugs related to spinning
loops and deadlocks.

Nothing in this Lua code should have blocking calls at all.
The Lua code has zero calls to external services, no sockets, no sleeps,
no print, no Map.new (single call in the Lua startup, not inside any
applet or fetch), no usage of other packages, no file handling, no other
IO.

I'm hoping I can get $work to agree to fully open-source the Lua, so you
can see this fact and review the code to confirm that it SHOULD be
non-blocking.


I trust you on this point. If you are not using external component, it should 
indeed be ok. So, it is probably a contention issue on the global Lua lock. If 
you are able to generate and inspect a core file, it should help you to figure 
out what really happens.




However, I may be wrong. It may be just a contention problem because your are
executing lua with 64 threads and a huge workload. In this case, you may give a
try to the 2.4 (under development). There is a way to have a separate lua
context for each thread loading the scripts with "lua-load-per-thread"
directive. Out of curiosity, on the 1.8, are you running HAProxy with several
threads or are you spawning several processes?

nbthread=64, nbproc=1 on both 1.8/2.x


It is thus surprising, if it is really a contention issue, that you never 
observed slow down on the 1.8. There is no watchdog, but the thread 
implementation is a bit awkward on the 1.8. 2.X are better on this point, the 
best being the 2.4.



Yes, we're hoping to try 2.4.x, just working on some parts to get there.



--
Christopher Faulet



Re: About the 'Hot Restarts' of haproxy

2021-04-13 Thread Christopher Faulet

Le 13/04/2021 à 18:15, John Lauro a écrit :
Sounds like the biggest part of hot restarts is the cost of leaving the old 
process running as they have a lot of long running TCP connections, and if you 
do a lot of restarts the memory requirements build up.  Not much of an issue for 
short lived http requests (although it would be nice if keep alive wasn't 
followed on the old haproxy processes so they could die quicker).




Well, AFAIK, Envoy handles hot restarts exactly the same way HAProxy does. The 
old process tries to finish to process in-flight requests. The connections are 
not kept-alive. The old process closes all idle connections. But it must still 
wait the responses for already started requests. Both Envoy and HAProxy do it 
this way. Note there is an option to kill the old process after a given time.


The article is a bit biased and inaccurate because it suggests HAproxy does not 
support hot restarts while Envoy do it. In fact, The real difference here is the 
ability to dynamically add or remove servers with Envoy. Thanks to this feature, 
most of time, there is no reason to restart it. Thus, hot restarts are not an 
issue anymore. On HAProxy side, as Willy said, this feature is under development.


--
Christopher Faulet



Re: [RFC PATCH 4/8] MINOR: uri_normalizer: Add a `sort-query` normalizer

2021-04-13 Thread Christopher Faulet

Le 13/04/2021 à 18:05, Tim Düsterhus a écrit :

Christopher,

On 4/13/21 4:59 PM, Christopher Faulet wrote:

+/* Sorts the parameters within the given query string. Returns an ist
containing
+ * the new path and backed by `trash` or IST_NULL if the `len` not
sufficiently
+ * large to store the resulting path.
+ */
+struct ist uri_normalizer_query_sort(const struct ist query, const
char delim, char *trash, size_t len)
+{


I hadn't noticed it till now, but please don't use "trash" variable
name, it is confusing with the trash buffer used almost everywhere.
Especially because of my next comment :)


In fact the http-request normalize-uri action will pass a trash buffer
pointer into the function, that's why I found the name somewhat fitting.
But the exact method signature is up for discussion anyway (see my other
email).



It is the caller point of view. For the normalizers, it is just a destination 
buffer, not necessarily a trash buffer. Here, it is just to avoid confusion with 
the global variable.


--
Christopher Faulet



Re: [RFC PATCH 3/8] MINOR: uri_normalizer: Add a `dotdot` normalizer to http-request normalize-uri

2021-04-13 Thread Christopher Faulet

Le 13/04/2021 à 18:03, Tim Düsterhus a écrit :

Christopher,

On 4/13/21 4:38 PM, Christopher Faulet wrote:

At the end it remains your choice. The function is quite good. I just
wonder if it could be valuable to also handle single dot-segment here in
addition to double dot-segment. Thus, the normalizer should be renamed
"dot-segments" or something similar.


I planned to add a separate normalizer for that. This keeps the
functions simple and easy to check for correctness. It also allows the
administrator to cherry-pick exactly the normalizers they desire and
that do not break their backend. In the old discussion Willy said that
not everything that might be possible to normalize can actually be
normalized when combined with legacy software.


Ok, that make sense.




Another point is about the dot encoding. It may be good to handle
encoded dot (%2E), may be via an option. And IMHO, the way empty
segments are handle is a bit counter intuitive. Calling "merge-slashes"
normalizer first is a solution of course, but this means rewriting twice
the uri. We must figure out what is the main expectation for this
normalizer. Especially because ignoring empty segment when dot-segments
are merged is not exactly the same than merge all slashes.


The percent encoding of the dot will be handled by a 'percent-decode'
normalizer that decodes percent encoded unreserved characters (RFC 3986,
section 2.3). The administrator then first would use the percent-decode
normalizer, then the merge-slashes normalizer, then the dotdot normalizer.


Well, it is a bit different here. Because someone could choose to not decode 
unreserved characters but want to handle encoded dot in dotdot normalizer.




Yes, it means rewriting the URI several times. But it is nice, explicit
and composes well.


On this point, you're right. It is far cleaner this way.



We can later figure out whether we want to provide "combined
normalizers", such as a 'filesystem' normalizer that would combine the
'.', '..' and '//' normalizers in an efficient way. Adding something
like that later is easy. Changing the behavior of a normalizer later is
hard.


That's true. Depending on feebacks, it will be possible to add more normalizers. 
I'm fine with that.




That's why I'd like to keep them simple "Unix style" for now. Make them
do one thing, make them do it well.


Note I was first surprised that leading dot-segments were preserved,
before reading the 6th patch because for me it is the most important
part. But I'm fine with an option in a way or another.



It's a result of how I approached the development. I wanted to not
rebase my branch more than necessary. I will probably merge the two
patches and change the default once the general approach is approved :-)


Well, it is not a problem. You can keep it in two patches if it is easier for 
you.

--
Christopher Faulet



Re: [RFC PATCH 0/8] URI normalization / Issue #714

2021-04-13 Thread Christopher Faulet

Le 13/04/2021 à 17:45, Tim Düsterhus a écrit :

Christopher,

On 4/13/21 2:41 PM, Christopher Faulet wrote:

Sorry for the delay. I'll comment your patches by replying inline when


No delay experienced. You said that you'd try this week and it's still
this week. So this is fine :-)


appropriate. The quality of the whole series is pretty good. Thanks for
the effort. Just a suggestion to simplify the error handling introduced
in the 7th patch. You may use following prototype for your normalizers:

    int uri_normalizer_NAME(const struct ist path, struct ist *dst);

returning for instance 0 on success and anything else on error. This
way, it is easy to return an enum instead of an integer to be able to
handle errors.



Yes, I struggled a bit with the method signature, due to the following
constraints:

1. I did not want to allocate the result buffer within the normalizer
itself, because this makes the method less flexible with regard to both
allocation and freeing.


It is indeed cleaner to allocate it in the caller. We should avoid 
responsibility sharing. It is always confusing and leads to bugs.



2. I need to somehow pass a size of the target buffer to prevent buffer
overflows.


Passing a pointer on an ist is a good way to announce the max size when the 
normalizer is called and to update it to set the final size of the new 
path/query. By contract, the caller must provide an ist owning an allocated buffer.



Thus I can't simply take a `struct ist*` for the destination, as an ist
cannot communicate the size of the underlying buffer. I could
technically take a `struct buffer`, but I'd still like the result to
reside in an ist, because this is what the HTX API expects.


Hum, I don't understand. If you create an ist using the trash buffer this way:

   struct ist dst = ist2(replace->area, replace->size);

You can pass a pointer on dst. In the normalizer, you can update its size. It is 
thus possible to use dst when calling http_replace_req_path() or 
http_replace_req_query().




Your suggested signature would work if I would allocate a trash buffer
within the normalizer. Should I do this? Is it safe to return a pointer
to a trash buffer from a function? I remember something about these
buffers being reused, accidentally destroying the contained information
if one is not careful.


No, it is indeed a very bad idea. the trash buffer must be allocated in the 
caller. You already choose the right way on this point. But you can still a 
pointer on an ist, locally build from the trash buffer.


--
Christopher Faulet



Re: [RFC PATCH 8/8] MINOR: uri_normalizer: Add a `percent-upper` normalizer

2021-04-13 Thread Christopher Faulet

Le 08/04/2021 à 20:59, Tim Duesterhus a écrit :

Willy,
Christopher,

and this final one adds a normalizer to turn the hex digits of percent
encoding into uppercase. Uppercase is the variant preferred by the URI RFC, so
this is what we do.



This one looks good.

--
Christopher Faulet



Re: [RFC PATCH 7/8] MINOR: uri_normalizer: Support returning detailed errors from uri normalization

2021-04-13 Thread Christopher Faulet
fail_alloc;
case URI_NORMALIZER_ERR_INVALID_INPUT:
ret = ACT_RET_INV;
goto leave;
}

  leave:
free_trash_chunk(replace);
return ret;

  fail_alloc:
if (!(s->flags & SF_ERR_MASK))
s->flags |= SF_ERR_RESOURCE;
ret = ACT_RET_ERR;
goto leave;

  fail_rewrite:
_HA_ATOMIC_ADD(>fe->fe_counters.failed_rewrites, 1);
if (s->flags & SF_BE_ASSIGNED)
_HA_ATOMIC_ADD(>be->be_counters.failed_rewrites, 1);
if (sess->listener && sess->listener->counters)
_HA_ATOMIC_ADD(>listener->counters->failed_rewrites, 1);
if (objt_server(s->target))

_HA_ATOMIC_ADD(&__objt_server(s->target)->counters.failed_rewrites, 1);

if (!(s->txn->req.flags & HTTP_MSGF_SOFT_RW)) {
    ret = ACT_RET_ERR;
if (!(s->flags & SF_ERR_MASK))
s->flags |= SF_ERR_PRXCOND;
}
goto leave;
}

--
Christopher Faulet



Re: [RFC PATCH 6/8] MINOR: uri_normalizer: Add support for supressing leading `../` for dotdot normalizer

2021-04-13 Thread Christopher Faulet

Le 08/04/2021 à 20:59, Tim Duesterhus a écrit :

Willy,
Christopher,

most of the patch is moving around the config parser to support ingesting the
new argument.



This one looks good.

--
Christopher Faulet



Re: [RFC PATCH 5/8] OPTIMIZE: uri_normalizer: Optimize allocations in uri_normalizer_query_sort

2021-04-13 Thread Christopher Faulet

Le 08/04/2021 à 20:59, Tim Duesterhus a écrit :

Willy,
Christopher,

I did not perform any measurements at all. But not reallocating for every
parameter should be better :-)



This one may be useless if you use the trash buffer to store the query 
parameters.

--
Christopher Faulet



Re: [RFC PATCH 4/8] MINOR: uri_normalizer: Add a `sort-query` normalizer

2021-04-13 Thread Christopher Faulet

Le 08/04/2021 à 20:59, Tim Duesterhus a écrit :

Willy,
Christopher,

This one comes with dynamic allocation. The next patch will add an optimization
for a small number of arguments. However dynamic allocation within the main
processing logic is pretty ugly, so this should be looked at further.



Dynamic allocation should be avoided here. Definitely. I may propose you an 
alternative by using the trash buffer area to store the ist array, via a cast. 
It may be considered a bit ugly at first glance, but if you think about it as a 
trash memory space, it is not so shocking. We've already used this trick for the 
HTX and regular buffers. If you do it that way, by default, it gives you 1024 
slots. And you may return an alloc failure beyond this value. It seems reasonable :)


  
+/* Compares two query parameters by name. Query parameters are ordered

+ * as with memcmp. Shorter parameter names are ordered lower. Identical
+ * parameter names are compared by their pointer to maintain a stable
+ * sort.
+ */
+static int query_param_cmp(const void *a, const void *b)
+{
+   const struct ist param_a = *(struct ist*)a;
+   const struct ist param_b = *(struct ist*)b;
+   const struct ist param_a_name = iststop(param_a, '=');
+   const struct ist param_b_name = iststop(param_b, '=');
+
+   int cmp = memcmp(istptr(param_a_name), istptr(param_b_name), 
MIN(istlen(param_a_name), istlen(param_b_name)));
+
+   if (cmp != 0)
+   return cmp;
+
+   if (istlen(param_a_name) < istlen(param_b_name))
+   return -1;
+
+   if (istlen(param_a_name) > istlen(param_b_name))
+   return 1;
+
+   if (istptr(param_a) < istptr(param_b))
+   return -1;
+
+   if (istptr(param_a) > istptr(param_b))
+   return 1;
+
+   return 0;
+}


To make it more simple, you may use istdiff instead.


+
+/* Sorts the parameters within the given query string. Returns an ist 
containing
+ * the new path and backed by `trash` or IST_NULL if the `len` not sufficiently
+ * large to store the resulting path.
+ */
+struct ist uri_normalizer_query_sort(const struct ist query, const char delim, 
char *trash, size_t len)
+{


I hadn't noticed it till now, but please don't use "trash" variable name, it is 
confusing with the trash buffer used almost everywhere. Especially because of my 
next comment :)



+   struct ist scanner = istadv(query, 1);
+   struct ist *params = NULL;
+   struct ist newquery = ist2(trash, 0);
+   size_t param_count = 0;
+   size_t i;
+
+   if (len < istlen(query))
+   return IST_NULL;
+
+   while (istlen(scanner) > 0) {
+   const struct ist param = istsplit(, delim);
+   struct ist *realloc = reallocarray(params, param_count + 1, 
sizeof(*realloc));


Here, instead of a dynamic array, you may use the trash buffer area (declared 
from outside the loop). For instance:


struct ist *params = (struct ist *)b_orig();
size_t max_param = b_size() / sizeof(*params);


+
+   if (!realloc)
+   goto fail;
+
+   params = realloc;
+
+   params[param_count] = param;
+   param_count++;
+   }
+
+   qsort(params, param_count, sizeof(*params), query_param_cmp);
+
+   newquery = __istappend(newquery, '?');
+   for (i = 0; i < param_count; i++) {
+   if (i > 0)
+   newquery = __istappend(newquery, '&');
+
+   if (istcat(, params[i], len) < 0)
+   goto fail;
+   }
+
+   free(params);
+
+   return newquery;
+
+  fail:
+   free(params);
+
+   return IST_NULL;
+}
  
  /*

   * Local variables:




--
Christopher Faulet



Re: [RFC PATCH 3/8] MINOR: uri_normalizer: Add a `dotdot` normalizer to http-request normalize-uri

2021-04-13 Thread Christopher Faulet

Le 08/04/2021 à 20:59, Tim Duesterhus a écrit :

Willy,
Christopher,

I'm not very happy with the normalization logic, because am processing the URI
in reverse. This requires me to directly access offsets instead of using the
`ist` API. However this way I don't need to backtrack once I encounter a `../`
which I consider to be a win.


It is not shocking. The function is readable, it is not a real problem. Maybe we 
can introduce the istoff() function to get the pointer at a given offset. You 
may also choose to fully rely on pointers with a negative index. I know you want 
to use the ist api as far as possible, but it is not always the easiest way :)


At the end it remains your choice. The function is quite good. I just wonder if 
it could be valuable to also handle single dot-segment here in addition to 
double dot-segment. Thus, the normalizer should be renamed "dot-segments" or 
something similar.


Another point is about the dot encoding. It may be good to handle encoded dot 
(%2E), may be via an option. And IMHO, the way empty segments are handle is a 
bit counter intuitive. Calling "merge-slashes" normalizer first is a solution of 
course, but this means rewriting twice the uri. We must figure out what is the 
main expectation for this normalizer. Especially because ignoring empty segment 
when dot-segments are merged is not exactly the same than merge all slashes.


Note I was first surprised that leading dot-segments were preserved, before 
reading the 6th patch because for me it is the most important part. But I'm fine 
with an option in a way or another.


--
Christopher Faulet



Re: [RFC PATCH 2/8] MINOR: uri_normalizer: Add `http-request normalize-uri`

2021-04-13 Thread Christopher Faulet

Le 08/04/2021 à 20:59, Tim Duesterhus a écrit :

Willy,
Christopher,

something simple for a start. This one adds the http-request action and a
very simple normalizer to test whether it works. Turns out it does :-)

You can see the new `ist` helpers in action already. I'm pretty happy that
I was able to implement this completely with the new `ist` API.



Just small comments but otherwise, this one looks good.

[...]


+/* Parses the http-request normalize-uri action. It expects a single 

+ * argument, corresponding too a value in `enum act_normalize_uri`.
+ *
+ * It returns ACT_RET_PRS_OK on success, ACT_RET_PRS_ERR on error.
+ */
+static enum act_parse_ret parse_http_normalize_uri(const char **args, int 
*orig_arg, struct proxy *px,
+   struct act_rule *rule, char 
**err)
+{
+   int cur_arg = *orig_arg;
+
+   rule->action_ptr = http_action_normalize_uri;
+   rule->release_ptr = NULL;
+
+   if (!*args[cur_arg] ||
+   (*args[cur_arg + 1] && strcmp(args[cur_arg + 1], "if") != 0 && 
strcmp(args[cur_arg + 1], "unless") != 0)) {



Testing "if" or "unless" is useless here. If no normalizer name is provided, 
this will be catch in the else statement. Especially because this test will be 
removed by the 6th patch.



+   memprintf(err, "expects exactly 1 argument ");
+   return ACT_RET_PRS_ERR;
+   }
+
+   if (strcmp(args[cur_arg], "merge-slashes") == 0) {
+   rule->action = ACT_NORMALIZE_URI_MERGE_SLASHES;
+   }
+   else {
+   memprintf(err, "unknown normalizer '%s'", args[cur_arg]);
+   return ACT_RET_PRS_ERR;


The list of supported normalizers may help the user here.


+   }
+   cur_arg++;
+
+   *orig_arg = cur_arg;
+   return ACT_RET_PRS_OK;
+}
+



--
Christopher Faulet



Re: [RFC PATCH 1/8] MINOR: uri_normalizer: Add uri_normalizer module

2021-04-13 Thread Christopher Faulet

Le 08/04/2021 à 20:59, Tim Duesterhus a écrit :

Willy,
Christopher,

I used uri_auth.[ch] as the basis for the source file structure (comments and
stuff).



Thanks, nothing to say about this one :)

--
Christopher Faulet



Re: [RFC PATCH 0/8] URI normalization / Issue #714

2021-04-13 Thread Christopher Faulet

Le 08/04/2021 à 20:59, Tim Duesterhus a écrit :

Willy,
Christopher,

Not sure who of you is better suited to review this series, so I'm adding both
of you :-)

I'm tagging this as RFC, because it's large and quite a bit outside of my
comfort zone. However the patches are as clean as possible. They include full
documentation and each normalizer comes with a bunch of reg-tests ensuring they
behave like I expect them to behave. So if you have nothing to complain, then
this series is in a mergeable state. I plan to add a few more normalizers after
this passes an initial review.

I'll add additional remarks to each patch, explaining my decisions in more
detail.

Best regards

Tim Düsterhus (8):
   MINOR: uri_normalizer: Add uri_normalizer module
   MINOR: uri_normalizer: Add `http-request normalize-uri`
   MINOR: uri_normalizer: Add a `dotdot` normalizer to http-request
 normalize-uri
   MINOR: uri_normalizer: Add a `sort-query` normalizer
   OPTIMIZE: uri_normalizer: Optimize allocations in
 uri_normalizer_query_sort
   MINOR: uri_normalizer: Add support for supressing leading `../` for
 dotdot normalizer
   MINOR: uri_normalizer: Support returning detailed errors from uri
 normalization
   MINOR: uri_normalizer: Add a `percent-upper` normalizer

  Makefile   |   3 +-
  doc/configuration.txt  |  58 +
  include/haproxy/action-t.h |   9 +
  include/haproxy/uri_normalizer-t.h |  32 +++
  include/haproxy/uri_normalizer.h   |  33 +++
  reg-tests/http-rules/normalize_uri.vtc | 314 +
  src/http_act.c | 213 +
  src/uri_normalizer.c   | 298 +++
  8 files changed, 959 insertions(+), 1 deletion(-)
  create mode 100644 include/haproxy/uri_normalizer-t.h
  create mode 100644 include/haproxy/uri_normalizer.h
  create mode 100644 reg-tests/http-rules/normalize_uri.vtc
  create mode 100644 src/uri_normalizer.c



Tim,

Sorry for the delay. I'll comment your patches by replying inline when 
appropriate. The quality of the whole series is pretty good. Thanks for the 
effort. Just a suggestion to simplify the error handling introduced in the 7th 
patch. You may use following prototype for your normalizers:


  int uri_normalizer_NAME(const struct ist path, struct ist *dst);

returning for instance 0 on success and anything else on error. This way, it is 
easy to return an enum instead of an integer to be able to handle errors.



--
Christopher Faulet



Re: [ANNOUNCE] haproxy-2.4-dev16

2021-04-12 Thread Christopher Faulet

Le 12/04/2021 à 09:40, Илья Шипицин a écrit :


Dear Team,

can we address at least #1112, #1119 before 2.4 is released ?

Of course, thanks for the reminder !

--
Christopher Faulet



Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-09 Thread Christopher Faulet

Le 09/04/2021 à 19:26, Robin H. Johnson a écrit :

Hi,

Maciej had said they were going to create a new thread, but I didn't see
one yet.

I want to start by noting problem was much worse on 2.2.8 & 2.2.9, and
that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a big
thanks for that initial work in fixing the issue.

As I mentioned in my other mail asking for a 1.8.30 release, we're
experiencing this problem in DigitalOcean's HAProxy instances used to
run the Spaces product.

I've been trying to dig out deeper detail as well with a debug threads
version, but I have the baseline error output from 2.3.9 here to share,
after passing redaction review. This dump was generated with vbernat's
PPA version of 2.3.9, not any internal builds.

We have struggled to reproduce the problem in testing environments, it
only turns up at the biggest regions, and plotting occurances of the
issue over the time dimension suggest that it might have some partial
correlation w/ a weird workload input.

The dumps do suggest Lua is implicated as well, and we've got some
extensive Lua code, so it's impossible to rule it out as contributing to
the problem (We have been discussing plans to move it to SPOA instead).

The Lua code in question hasn't changed significantly in nearly 6
months, and it was problem-free on the 1.8 series (having a test suite
for the Lua code has been invaluable).



Hi,

It seems you have a blocking call in one of your lua script. The threads dump 
shows many threads blocked in hlua_ctx_init. Many others are executing lua. 
Unfortunately, for a unknown reason, there is no stack traceback.


For the 2.3 and prior, the lua scripts are executed under a global lock. Thus 
blocking calls in a lua script are awful, because it does not block only one 
thread but all others too. I guess the same issue exists on the 1.8, but there 
is no watchdog on this version. Thus, time to time HAProxy hangs and may report 
huge latencies but, at the end it recovers and continues to process data. It is 
exactly the purpose of the watchdog, reporting hidden bugs related to spinning 
loops and deadlocks.


However, I may be wrong. It may be just a contention problem because your are 
executing lua with 64 threads and a huge workload. In this case, you may give a 
try to the 2.4 (under development). There is a way to have a separate lua 
context for each thread loading the scripts with "lua-load-per-thread" 
directive. Out of curiosity, on the 1.8, are you running HAProxy with several 
threads or are you spawning several processes?


--
Christopher Faulet



Re: [RFC PATCH 0/8] URI normalization / Issue #714

2021-04-09 Thread Christopher Faulet

Le 08/04/2021 à 20:59, Tim Duesterhus a écrit :

Willy,
Christopher,

Not sure who of you is better suited to review this series, so I'm adding both
of you :-)

I'm tagging this as RFC, because it's large and quite a bit outside of my
comfort zone. However the patches are as clean as possible. They include full
documentation and each normalizer comes with a bunch of reg-tests ensuring they
behave like I expect them to behave. So if you have nothing to complain, then
this series is in a mergeable state. I plan to add a few more normalizers after
this passes an initial review.

I'll add additional remarks to each patch, explaining my decisions in more
detail.



Thanks Tim, I'll try to review your patches next week.


--
Christopher Faulet



Re: [PATCH] BUG/MINOR: tools: fix parsing "us" unit for timers

2021-04-05 Thread Christopher Faulet

Le 02/04/2021 à 22:12, Thayne McCombs a écrit :

Commit c20ad0d8dbd1bb5707bbfe23632415c3062e046c (BUG/MINOR: tools:  make
parse_time_err() more strict on the timer validity) broke parsing the "us"
unit in timers. It caused `parse_time_err()` to return the string "s",
which indicates an error.

Now if the "u" is followed by an "s" we properly continue processing the
time instead of immediately failing.



Thanks, now merged !

--
Christopher Faulet



Re: [2.2.9] 100% CPU usage

2021-04-02 Thread Christopher Faulet

Le 31/03/2021 à 13:28, Maciej Zdeb a écrit :

Hi,

Well it's a bit better situation than earlier because only one thread is looping 
forever and the rest is working properly. I've tried to verify where exactly the 
thread looped but doing "n" in gdb fixed the problem :( After quitting gdb 
session all threads were idle. Before I started gdb it looped about 3h not 
serving any traffic, because I've put it into maintenance as soon as I observed 
abnormal cpu usage.




Hi Maciej,

I'm puzzled. It seems to be a different issue than the first one. You reported 
an issue during reloads, on the old processes. There was a loop because of a 
deadlock, but the traces showed the watchdog was fired, probably because of the 
lua (this must be confirm though).


Here, it is a loop on a living process, right ? Reading the trace, it is for now 
a bit hard to figure out what happens. If it is reproducible, you may try to use 
"perf top" command, selecting the right thread ID with "-t" argument.


In addition, if the loop always falls in the H2 multiplexer, it could be good to 
print the h2C structure to have more info on its internal state.


And to be complete, the output of "show activity", "show fd" and "show sess all" 
may help. Because it still loops with no traffic, it should be small. "show 
threads" may be good, but HAProxy should be compiled with USE_THREAD_DUMP=1 to 
have an advanced dump.


Sorry to ask you so much work, it is pretty hard to track this kind of bug.

Thanks !
--
Christopher Faulet



Re: is it possible to disable option httpchk per backend?

2021-03-26 Thread Christopher Faulet

Le 25/03/2021 à 17:53, Mariusz Gronczewski a écrit :

Hi,

is it possible to disable "option httpchk" in specific backend when it
is enabled in defaults block? I have config where basically every
backend sans one is http so I'd like to keep that in defaults and just
disable it in tcp backend (which is backend for SPOE/A) but it seems to
be one of very few options that do not have "no option httpchk".



Hi,

Indeed, you can't. But you may override it with the right health-check type. For 
instance "option tcp-check". Or better "option spop-check" if it is a SPOE backend.


--
Christopher Faulet



Re: [2.2.9] 100% CPU usage

2021-03-25 Thread Christopher Faulet

Le 25/03/2021 à 13:38, Maciej Zdeb a écrit :

Hi,

I deployed a patched (with volatile hlua_not_dumpable) HAProxy and so far so 
good, no looping. Christopher I saw new patches with hlua_traceback used 
instead, looks much cleaner to me, should I verify them instead? :)


Christopher & Willy I've forgotten to thank you for help!

Yes please, try the last 2.2 snapshot. It is a really a better way to fix this 
issue because the Lua traceback is never ignored. And it is really safer to not 
allocate memory in the debugger.


So now, we should be able to figure out why the Lua fires the watchdog. Because, 
under the hood, it is the true issue :)


--
Christopher Faulet



Re: [2.2.9] 100% CPU usage

2021-03-24 Thread Christopher Faulet

Le 24/03/2021 à 10:16, Willy Tarreau a écrit :

On Wed, Mar 24, 2021 at 10:11:19AM +0100, Maciej Zdeb wrote:

Wow, that's it! :)

0x55d94949e965 <+53>: addl   $0x1,%fs:0xfffdd688
0x55d94949e96e <+62>: callq  0x55d9494782c0 
0x55d94949e973 <+67>: subl   $0x1,%fs:0xfffdd688
[...]
0x55d94949e99f <+111>: ja 0x55d94949e9b0 
0x55d94949e9a1 <+113>: mov%rbx,%rdi
0x55d94949e9a4 <+116>: callq  0x55d949477d50 
0x55d94949e9a9 <+121>: test   %rax,%rax
[...]
0x55d94949e9c1 <+145>: addl   $0x1,%fs:0xfffdd688
0x55d94949e9ca <+154>: callq  0x55d949477b50 
0x55d94949e9cf <+159>: subl   $0x1,%fs:0xfffdd688



Cool!


Ok I'll make hlua_not_dumpable volatile instead of applying compiler
barriers.


Yes, it's safer for the long term as nobody will have to think about
it anymore. We'll have to integrate this one as well.


Thanks Guys, I will push a fix to make the variable volatile.

However, reading the other trace Maciej sent (bussy_thread_peers.txt), it seems 
possible to stop a memory allocation from other places. Thus, I guess we must 
find a more global way to prevent the lua stack dump.


--
Christopher Faulet



Re: [2.2.9] 100% CPU usage

2021-03-23 Thread Christopher Faulet

Le 23/03/2021 à 11:14, Maciej Zdeb a écrit :

Hi Christopher,

Bad news, patches didn't help. Attaching stacktraces, now it looks that spoe 
that executes lua scripts (free_thread_spue_lua.txt) tried to malloc twice. :(




It is most probably because of compiler optimizations. Some compiler barriers 
are necessary to avoid instructions reordering. It is the purpose of attached 
patches. Sorry to ask you it again, but could you make some tests ?


Thanks !
--
Christopher Faulet
>From b4f55500b514e5bfcdaba938cbd2b0ba3cfb2f62 Mon Sep 17 00:00:00 2001
From: Christopher Faulet 
Date: Tue, 23 Mar 2021 15:17:22 +0100
Subject: [PATCH] BUG/MEDIUM: hlua: Use compiler barrier to protect
 hlua_not_dumpable increments

In hlua_alloc() function, the hlua_not_dumpable variable is incremented
before any call to realloc() and decremented just after. To be sure it
really works, we must prevent any instruction reordering. Thus compiler
barriers have been added to do so.

The patch fixes a bug in the commit a61789a1d ("MEDIUM: lua: Use a
per-thread counter to track some non-reentrant parts of lua"). It must be
backported as far as 2.0.
---
 src/hlua.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/hlua.c b/src/hlua.c
index 5bef67a48b..9fc30336f1 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -8196,7 +8196,9 @@ static void *hlua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
 		if (ptr)
 			zone->allocated -= osize;
 		hlua_not_dumpable++;
+		__ha_compiler_barrier();
 		free(ptr);
+		__ha_compiler_barrier();
 		hlua_not_dumpable--;
 		return NULL;
 	}
@@ -8207,7 +8209,9 @@ static void *hlua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
 			return NULL;
 
 		hlua_not_dumpable++;
+		__ha_compiler_barrier();
 		ptr = malloc(nsize);
+		__ha_compiler_barrier();
 		hlua_not_dumpable--;
 		if (ptr)
 			zone->allocated += nsize;
@@ -8219,7 +8223,9 @@ static void *hlua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
 		return NULL;
 
 	hlua_not_dumpable++;
+	__ha_compiler_barrier();
 	ptr = realloc(ptr, nsize);
+	__ha_compiler_barrier();
 	hlua_not_dumpable--;
 	if (ptr)
 		zone->allocated += nsize - osize;
-- 
2.30.2

>From 8eee5c4ba6eb92ceece0414b716c075c81d689ab Mon Sep 17 00:00:00 2001
From: Christopher Faulet 
Date: Tue, 23 Mar 2021 15:16:52 +0100
Subject: [PATCH] BUG/MEDIUM: hlua: Use compiler barrier to protect
 hlua_not_dumpable increments

In hlua_alloc() function, the hlua_not_dumpable variable is incremented
before any call to realloc() and decremented just after. To be sure it
really works, we must prevent any instruction reordering. Thus compiler
barriers have been added to do so.

The patch fixes a bug in the commit a61789a1d ("MEDIUM: lua: Use a
per-thread counter to track some non-reentrant parts of lua"). It must be
backported as far as 2.0.
---
 src/hlua.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/hlua.c b/src/hlua.c
index 990080b8c..ba7f93a99 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -8245,7 +8245,9 @@ static void *hlua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
 		if (ptr)
 			zone->allocated -= osize;
 		hlua_not_dumpable++;
+		__ha_compiler_barrier();
 		free(ptr);
+		__ha_compiler_barrier();
 		hlua_not_dumpable--;
 		return NULL;
 	}
@@ -8256,7 +8258,9 @@ static void *hlua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
 			return NULL;
 
 		hlua_not_dumpable++;
+		__ha_compiler_barrier();
 		ptr = malloc(nsize);
+		__ha_compiler_barrier();
 		hlua_not_dumpable--;
 		if (ptr)
 			zone->allocated += nsize;
@@ -8268,7 +8272,9 @@ static void *hlua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
 		return NULL;
 
 	hlua_not_dumpable++;
+	__ha_compiler_barrier();
 	ptr = realloc(ptr, nsize);
+	__ha_compiler_barrier();
 	hlua_not_dumpable--;
 	if (ptr)
 		zone->allocated += nsize - osize;
-- 
2.30.2

>From 35c1b30ef2d725a37a82f0a293b2bd0bd7c473e2 Mon Sep 17 00:00:00 2001
From: Christopher Faulet 
Date: Tue, 23 Mar 2021 15:10:46 +0100
Subject: [PATCH] BUG/MEDIUM: hlua: Use compiler barrier to protect
 hlua_not_dumpable increments

In hlua_alloc() function, the hlua_not_dumpable variable is incremented
before any call to realloc() and decremented just after. To be sure it
really works, we must prevent any instruction reordering. Thus compiler
barriers have been added to do so.

The patch fixes a bug in the commit a61789a1d ("MEDIUM: lua: Use a
per-thread counter to track some non-reentrant parts of lua"). It must be
backported as far as 2.0.
---
 src/hlua.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/hlua.c b/src/hlua.c
index 962195a60..d47973dcf 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -8639,7 +8639,9 @@ static void *hlua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
 	 */
 	if (likely(~zone->limit == 0)) {
 		hlua_not_dumpable++;
+		__ha_compiler_barrier();
 		ptr = realloc(ptr, nsize);
+		__ha_compiler_barrier();
 		hlua_not_dumpable--;
 		r

[ANNOUNCE] haproxy-1.6.16

2021-03-19 Thread Christopher Faulet

Hi,

HAProxy 1.6.16 was released on 2021/03/19. It added 71 new commits
after version 1.6.15.

The 1.6 branch is EOL now. Thus, it is the last 1.6 release. No further
release should be expected. It contains all pending patches marked to be
backported to the 1.6 to leave it in a proper state. Have a look at the
changelog below for the complete list of fixes.

You should have no reason to deploy it in a production environment. Use the
1.8 or above instead. However, if possible directly move on the 2.0 or the
2.2. No specific support should no longer be expected on the 1.6. This
branch will not receive fixes anymore.

Thanks everyone for you help and your contributions !

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/1.6/src/
   Git repository   : http://git.haproxy.org/git/haproxy-1.6.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-1.6.git
   Changelog: http://www.haproxy.org/download/1.6/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Amaury Denoyelle (1):
  BUG/MINOR: config: Fix memory leak on config parse listen

Baptiste Assmann (1):
  BUG/MINOR: http_act: don't check capture id in backend

Christopher Faulet (24):
  BUG/MEDIUM: stream: Be sure to release allocated captures for TCP streams
  BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules
  BUG/MINOR: stick-table: Use MAX_SESS_STKCTR as the max track ID during 
parsing
  MINOR: proxy/http-ana: Add support of extra attributes for the cookie 
directive
  BUG/MINOR: rules: Increment be_counters if backend is assigned for a 
silent-drop
  BUG/MEDIUM: lua: Reset analyse expiration timeout before executing a lua 
action
  BUG/MEDIUM: pattern: Add a trailing \0 to match strings only if possible
  BUG/MEDIUM: channel: Be aware of SHUTW_NOW flag when output data are 
peeked
  BUG/MINOR: tcp-rules: Set the inspect-delay when a tcp-response action 
yields
  BUG/MEDIUM: map/lua: Return an error if a map is loaded during runtime
  BUG/MINOR: lua: Check argument type to convert it to IPv4/IPv6 arg 
validation
  BUG/MINOR: lua: Check argument type to convert it to IP mask in arg 
validation
  BUG/MEDIUM: pattern: Renew the pattern expression revision when it is 
pruned
  MINOR: hlua: Display debug messages on stderr only in debug mode
  BUG/MINOR: http-fetch: Fix calls w/o parentheses of the cookie sample 
fetches
  BUG/MINOR: tools: make parse_time_err() more strict on the timer validity
  BUG/MINOR: tools: Reject size format not starting by a digit
  BUG/MINOR: server: Fix server-state-file-name directive
  CLEANUP: deinit: release global and per-proxy server-state variables on 
deinit
  BUG/MINOR: server: Don't call fopen() with server-state filepath set to 
NULL
  BUG/MINOR: sample: Always consider zero size string samples as unsafe
  BUG/MINOR: server: Init params before parsing a new server-state line
  BUG/MINOR: http-ana: Only consider dst address to process originalto 
option
  BUG/MINOR: connection: Use the client's dst family for adressless servers

David Carlier (1):
  DOC: email change of the DeviceAtlas maintainer

Dragan Dosen (1):
  BUG/MEDIUM: pattern: fix memory leak in regex pattern functions

Emeric Brun (1):
  CLEANUP: channel: fix comment in ci_putblk.

Emmanuel Hocdet (1):
  BUG/MINOR: ssl: fix crt-list neg filter for openssl < 1.1.1

Jerome Magnin (2):
  BUG/MINOR: stream: don't mistake match rules for store-request rules
  BUG/MINOR: pattern: handle errors from fgets when trying to load patterns

Joao Morais (1):
  BUG/MINOR: config: Update cookie domain warn to RFC6265

Maciej Zdeb (1):
  BUG/MINOR: http-fetch: Extract cookie value even when no cookie name

Mathias Weiersmueller (1):
  DOC: clarify matching strings on binary fetches

Thierry Fournier (1):
  MINOR: common: mask conversion

Tim Duesterhus (4):
  BUG/MINOR: dns: Make dns_query_id_seed unsigned
  BUG/MAJOR: proxy_protocol: Properly validate TLV lengths
  BUG/MEDIUM: fetch: Fix hdr_ip misparsing IPv4 addresses due to missing NUL
  BUG/MINOR: http_act: don't check capture id in backend (2)

William Dauchy (3):
  BUG/MINOR: dns: allow 63 char in hostname
  BUG/MINOR: namespace: avoid closing fd when socket failed in my_socketat
  DOC: agent-check: fix typo in "fail" word expected reply

William Lallemand (2):
  BUG/MINOR: ssl: verifyhost is case sensitive
  DOC: ssl: crt-list negative filters are only a hint

Willy Tarreau (26):
  BUG/MINOR: listener: also clear the error flag on a pa

[ANNOUNCE] haproxy-1.7.13

2021-03-19 Thread Christopher Faulet

Hi,

HAProxy 1.7.13 was released on 2021/03/19. It added 114 new commits
after version 1.7.12.

The previous release is quite old now. So this one is emitted to flush the
pipe. Honestly it will be too long to remember the context of all fixes,
most of them quite old now. Refer to the following changelog to see if you
may be hit by a known and fixed bug. However, if you are still running on
the 1.7.12, it is probably not the case.

The EOL for the 1.7 is planned at the end of this year. It may be a good
idea to think to upgrade. If you decide to take the plunge, it could be
better to directly move on the 2.0 or the 2.2.

Thanks everyone for you help and your contributions !

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/1.7/src/
   Git repository   : http://git.haproxy.org/git/haproxy-1.7.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-1.7.git
   Changelog: http://www.haproxy.org/download/1.7/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Amaury Denoyelle (2):
  BUG/MINOR: config: Fix memory leak on config parse listen
  MINOR: counters: fix a typo in comment

Baptiste Assmann (1):
  BUG/MINOR: http_act: don't check capture id in backend

Christopher Faulet (35):
  BUG/MEDIUM: stream: Be sure to release allocated captures for TCP streams
  BUG/MINOR: http-rules: Remove buggy deinit functions for HTTP rules
  BUG/MINOR: stick-table: Use MAX_SESS_STKCTR as the max track ID during 
parsing
  BUG/MINOR: tcp-rules: Fix memory releases on error path during action 
parsing
  MINOR: proxy/http-ana: Add support of extra attributes for the cookie 
directive
  BUG/MINOR: rules: Preserve FLT_END analyzers on silent-drop action
  BUG/MINOR: rules: Increment be_counters if backend is assigned for a 
silent-drop
  MINOR: http-rules: Add a flag on redirect rules to know the rule direction
  MINOR: http-rules: Handle the rule direction when a redirect is evaluated
  BUG/MINOR: http-ana: Reset request analysers on error when waiting for 
response
  BUG/MEDIUM: lua: Reset analyse expiration timeout before executing a lua 
action
  BUG/MEDIUM: pattern: Add a trailing \0 to match strings only if possible
  BUG/MEDIUM: channel: Be aware of SHUTW_NOW flag when output data are 
peeked
  BUG/MINOR: tcp-rules: Set the inspect-delay when a tcp-response action 
yields
  BUG/MEDIUM: map/lua: Return an error if a map is loaded during runtime
  BUG/MINOR: lua: Check argument type to convert it to IPv4/IPv6 arg 
validation
  BUG/MINOR: lua: Check argument type to convert it to IP mask in arg 
validation
  BUG/MEDIUM: pattern: Renew the pattern expression revision when it is 
pruned
  MINOR: hlua: Display debug messages on stderr only in debug mode
  BUG/MEDIUM: filters: Don't try to init filters for disabled proxies
  BUG/MINOR: filters: Skip disabled proxies during startup only
  BUG/MINOR: http-fetch: Fix calls w/o parentheses of the cookie sample 
fetches
  BUG/MINOR: tools: make parse_time_err() more strict on the timer validity
  BUG/MINOR: tools: Reject size format not starting by a digit
  BUG/MINOR: stick-table: Always call smp_fetch_src() with a valid arg list
  BUG/MINOR: server: Fix server-state-file-name directive
  CLEANUP: deinit: release global and per-proxy server-state variables on 
deinit
  BUG/MINOR: server: Don't call fopen() with server-state filepath set to 
NULL
  BUG/MINOR: sample: Always consider zero size string samples as unsafe
  BUG/MINOR: server: Init params before parsing a new server-state line
  BUG/MINOR: http-ana: Only consider dst address to process originalto 
option
  BUG/MINOR: tcp-act: Don't forget to set the original port for IPv4 
set-dst rule
  BUG/MINOR: connection: Use the client's dst family for adressless servers
  BUG/MINOR: hlua: Don't strip last non-LWS char in 
hlua_pushstrippedstring()
  BUG/MEDIUM: filters: Set CF_FL_ANALYZE on channels when filters are 
attached

Daniel Corbett (1):
  BUG/MINOR: stats: Fix color of draining servers on stats page

David Carlier (1):
  DOC: email change of the DeviceAtlas maintainer

Dragan Dosen (1):
  BUG/MEDIUM: pattern: fix memory leak in regex pattern functions

Emeric Brun (1):
  CLEANUP: channel: fix comment in ci_putblk.

Emmanuel Hocdet (1):
  BUG/MINOR: ssl: fix crt-list neg filter for openssl < 1.1.1

Jerome Magnin (3):
  BUG/MINOR: stream: don't mistake match rules for store-request rules
  BUG/MINOR: pattern: handle errors from fgets when trying to load patterns
  BUG/MI

Re: [2.2.9] 100% CPU usage

2021-03-19 Thread Christopher Faulet

Le 16/03/2021 à 13:46, Maciej Zdeb a écrit :
Sorry for spam. In the last message I said that the old process (after reload) 
is consuming cpu for lua processing and that's not true, it is processing other 
things also.


I'll take a break. ;) Then I'll verify if the issue exists on 2.3 and maybe 2.4 
branch. For each version I need a week or two to be sure the issue does not 
occur. :(


If 2.3 and 2.4 behaves the same way the 2.2 does, I'll try to confirm if there 
is any relation between infinite loops and custom configuration:

- lua scripts (mainly used for header generation/manipulation),
- spoe (used for sending metadata about each request to external service),
- peers (we have a cluster of 12 HAProxy servers connected to each other).


Hi Maciej,

I've read more carefully your backtraces, and indeed, it seems to be related to 
lua processing. I don't know if the watchdog is triggered because of the lua or 
if it is just a side-effect. But the lua execution is interrupted inside the 
memory allocator. And malloc/realloc are not async-signal-safe. Unfortunately, 
when the lua stack is dumped, the same allocator is also used. At this stage, 
because a lock was not properly released, HAProxy enter in a deadlock.


On other threads, we loop in the watchdog, waiting for the hand to dump the 
thread information and that explains the 100% CPU usage you observe.


So, to prevent this situation, the lua stack must not be dumped if it was 
interrupted inside an unsafe part. It is the easiest way we found to workaround 
this bug. And because it is pretty rare, it should be good enough.


However, I'm unable to reproduce the bug. Could you test attached patches please 
? I attached patched for the 2.4, 2.3 and 2.2. Because you experienced this bug 
on the 2.2, it is probably easier to test patches for this version.


If fixed, it could be good to figure out why the watchdog is triggered on your 
old processes.


--
Christopher Faulet
>From a61789a1d62fd71c751189faf5371740dd375f33 Mon Sep 17 00:00:00 2001
From: Christopher Faulet 
Date: Fri, 19 Mar 2021 15:16:28 +0100
Subject: [PATCH 1/2] MEDIUM: lua: Use a per-thread counter to track some
 non-reentrant parts of lua

Some parts of the Lua are non-reentrant. We must be sure to carefully track
these parts to not dump the lua stack when it is interrupted inside such
parts. For now, we only identified the custom lua allocator. If the thread
is interrupted during the memory allocation, we must not try to print the
lua stack wich also allocate memory. Indeed, realloc() is not
async-signal-safe.

In this patch we introduce a thread-local counter. It is incremented before
entering in a non-reentrant part and decremented when exiting. It is only
performed in hlua_alloc() for now.
---
 include/haproxy/hlua.h |  1 +
 src/hlua.c | 13 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/haproxy/hlua.h b/include/haproxy/hlua.h
index 6aca17f39..36629e637 100644
--- a/include/haproxy/hlua.h
+++ b/include/haproxy/hlua.h
@@ -50,6 +50,7 @@ void hlua_applet_tcp_fct(struct appctx *ctx);
 void hlua_applet_http_fct(struct appctx *ctx);
 struct task *hlua_process_task(struct task *task, void *context, unsigned int state);
 
+extern THREAD_LOCAL unsigned int hlua_not_dumpable;
 #else /* USE_LUA */
 
 / For use when Lua is disabled /
diff --git a/src/hlua.c b/src/hlua.c
index 42d5be2a5..962195a60 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -274,6 +274,9 @@ struct hlua_mem_allocator {
 
 static struct hlua_mem_allocator hlua_global_allocator THREAD_ALIGNED(64);
 
+ /* > 0 if lua is in a non-rentrant part, thus with a non-dumpable stack */
+THREAD_LOCAL unsigned int hlua_not_dumpable = 0;
+
 /* These functions converts types between HAProxy internal args or
  * sample and LUA types. Another function permits to check if the
  * LUA stack contains arguments according with an required ARG_T
@@ -8634,8 +8637,12 @@ static void *hlua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
 	/* a limit of ~0 means unlimited and boot complete, so there's no need
 	 * for accounting anymore.
 	 */
-	if (likely(~zone->limit == 0))
-		return realloc(ptr, nsize);
+	if (likely(~zone->limit == 0)) {
+		hlua_not_dumpable++;
+		ptr = realloc(ptr, nsize);
+		hlua_not_dumpable--;
+		return ptr;
+	}
 
 	if (!ptr)
 		osize = 0;
@@ -8649,7 +8656,9 @@ static void *hlua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
 			return NULL;
 	} while (!_HA_ATOMIC_CAS(>allocated, , new));
 
+	hlua_not_dumpable++;
 	ptr = realloc(ptr, nsize);
+	hlua_not_dumpable--;
 
 	if (unlikely(!ptr && nsize)) // failed
 		_HA_ATOMIC_SUB(>allocated, nsize - osize);
-- 
2.30.2

>From 83926a04febe94f332c6f45c98773286678346d3 Mon Sep 17 00:00:00 2001
From: Christopher Faulet 
Date: Fri, 19 Mar 2021 15:41:08 +0100
Subject: [PATCH 2/2] BUG/MEDIUM: debug/lua: Don't dump the lua stack if not
 dumpable

When we t

[ANNOUNCE] haproxy-2.0.21

2021-03-18 Thread Christopher Faulet
eading one byte after the
buffer.

  * Address assignment for address-less servers was fixed to be able to
handle IPv4 and IPv6 by setting the right address family.

  * The set-dst action was buggy. The original port was not set for IPv4
address. It was a problem for UNIX sockets, the port was not set to 0
and had an undefined value.

  * The processing of the HTTP originalto action was fixed to only consider
the destination address. The address family of the source address was
tested and not the destination one.

  * A memory leak was fixed by Rémi on the error path of the sample
expression parser and allocation failures are now handled when the
concat() converter is called.

  * An issue in filters (compression, spoe, etc) could block response
headers in empty responses with no content-length.

  * There was a risk of temporary CLOSE_WAIT on aborted H2 connections since
the recent fixes for truncated responses. Note that these ones would
vanish on timeout anyway, hence it was more annoying than dramatic.

  * Errors on connections would not prevent SSL handshake from being
performed, leading to wasted CPU cycles that could sometimes maintain
the load artificially high during contention.

  * An alignment issue in the XXHash code affecting ARMv6/v7 running in
32-bit mode on 64-bit kernels was addressed ; it could cause bus errors
and crashes in 32-bit chroots or containers when using the pattern LRU
cache.

  * Lua's core.get_info() got broken in previous version due to the missing
definition of INF_BUILD_INFO in stats.

  * a few other really minor issues were addressed

Thanks everyone for your help and your contributions !

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.0/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.0.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.0.git
   Changelog: http://www.haproxy.org/download/2.0/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Adis Nezirovic (1):
  BUG/MEDIUM: stats: add missing INF_BUILD_INFO definition

Amaury Denoyelle (2):
  BUG/MINOR: config: fix leak on proxy.conn_src.bind_hdr_name
  BUG/MINOR: backend: hold correctly lock when killing idle conn

Baptiste Assmann (1):
  BUG/MINOR: resolvers: new callback to properly handle SRV record errors

Bertrand Jacquin (2):
  BUG/MINOR: mworker: define _GNU_SOURCE for strsignal()
  BUILD/MINOR: lua: define _GNU_SOURCE for LLONG_MAX

Christopher Faulet (30):
  BUG/MINOR: init: Use a dynamic buffer to set HAPROXY_CFGFILES env variable
  BUG/MEDIUM: filters/htx: Fix data forwarding when payload length is 
unknown
  BUG/MINOR: stick-table: Always call smp_fetch_src() with a valid arg list
  BUG/MEDIUM: mux-h2: Be sure to enter in demux loop even if dbuf is empty
  BUG/MEDIUM: mux-h1: Always set CS_FL_EOI for response in MSG_DONE state
  BUG/MINOR: tools: Fix a memory leak on error path in parse_dotted_uints()
  BUG/MINOR: server: Fix server-state-file-name directive
  CLEANUP: deinit: release global and per-proxy server-state variables on 
deinit
  BUG/MINOR: server: Don't call fopen() with server-state filepath set to 
NULL
  BUG/MINOR: server: Remove RMAINT from admin state when loading server 
state
  BUG/MINOR: sample: Always consider zero size string samples as unsafe
  BUG/MINOR: server: Init params before parsing a new server-state line
  BUG/MINOR: server: Be sure to cut the last parsed field of a server-state 
line
  BUG/MEDIUM: mux-h1: Fix handling of responses to CONNECT other than 200-ok
  BUG/MEDIUM: resolvers: Reset server address and port for obselete SRV 
records
  BUG/MEDIUM: resolvers: Reset address for unresolved servers
  BUG/MINOR: mux-h1: Immediately report H1C errors from h1_snd_buf()
  BUG/MINOR: http-ana: Only consider dst address to process originalto 
option
  BUG/MINOR: tcp-act: Don't forget to set the original port for IPv4 
set-dst rule
  BUG/MINOR: connection: Use the client's dst family for adressless servers
  BUG/MEDIUM: spoe: Kill applets if there are pending connections and nbthread 
> 1
  DOC: spoe: Add a note about fragmentation support in HAProxy
  BUG/MINOR: http-ana: Don't increment HTTP error counter on read 
error/timeout
  BUG/MINOR: hlua: Don't strip last non-LWS char in 
hlua_pushstrippedstring()
  BUG/MEDIUM: filters: Set CF_FL_ANALYZE on channels when filters are 
attached
  BUG/MINOR: proxy/session: Be sure to have a listener to increment its 
counters
  BUG/MINOR: se

[ANNOUNCE] haproxy-2.1.12

2021-03-18 Thread Christopher Faulet
ed. Since the beginning, in
contrary to what the documentation said, this directive was not able to
be used with no parameter. To use the backend name as file name, the
undocumented parameter "use-backend-name" had to be used instead. Now,
both modes are supported. Another issue about the server-state file was
fixed. If the name of local server-state file was too long, the fopen()
function was called with a NULL file name. Finally, the RMAINT admin
mode is now ignored when a server state is loaded. Before, when this
admin mode was set, an error was reported, preventing the server state
to be loaded.

  * The tracked sessions counter was not atomically incremented, resulting
in occasional slightly off values.

  * The smp_is_safe() function was fixed. Zero size string samples were not
systematically considered as unsafe. In some circumstances, it was
possible to consider such samples as safe by reading one byte after the
buffer.

  * Address assignment for address-less servers was fixed to be able to
handle IPv4 and IPv6 by setting the right address family.

  * The set-dst action was buggy. The original port was not set for IPv4
address. It was a problem for UNIX sockets, the port was not set to 0
and had an undefined value.

  * The processing of the HTTP originalto action was fixed to only consider
the destination address. The address family of the source address was
tested and not the destination one.

  * Jérôme fixed a bug about the "strict-limits" global option when it is
used in conjunction with master-worker. With this option, when a
setrlimit fails, HAproxy must always exit. But it was only done when the
master-worker mode wasn't used. It now works as expected.

  * A memory leak was fixed by Rémi on the error path of the sample
expression parser and allocation failures are now handled when the
concat() converter is called.

  * An issue in filters (compression, spoe, etc) could block response
headers in empty responses with no content-length.

  * There was a risk of temporary CLOSE_WAIT on aborted H2 connections since
the recent fixes for truncated responses. Note that these ones would
vanish on timeout anyway, hence it was more annoying than dramatic.

  * The CLI's "abort ssl cert" would purge the old instead of new SSL info.

  * Errors on connections would not prevent SSL handshake from being
performed, leading to wasted CPU cycles that could sometimes maintain
the load artificially high during contention.

  * An alignment issue in the XXHash code affecting ARMv6/v7 running in
32-bit mode on 64-bit kernels was addressed ; it could cause bus errors
and crashes in 32-bit chroots or containers when using the pattern LRU
cache.

  * Lua's core.get_info() got broken in previous version due to the missing
 definition of INF_BUILD_INFO in stats.

  * a few other really minor issues were addressed

Thanks everyone for your help and your contributions !

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.1/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.1.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.1.git
   Changelog: http://www.haproxy.org/download/2.1/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Adis Nezirovic (1):
  BUG/MEDIUM: stats: add missing INF_BUILD_INFO definition

Amaury Denoyelle (2):
  BUG/MINOR: config: fix leak on proxy.conn_src.bind_hdr_name
  BUG/MINOR: backend: hold correctly lock when killing idle conn

Baptiste Assmann (1):
  BUG/MINOR: resolvers: new callback to properly handle SRV record errors

Bertrand Jacquin (2):
  BUG/MINOR: mworker: define _GNU_SOURCE for strsignal()
  BUILD/MINOR: lua: define _GNU_SOURCE for LLONG_MAX

Christopher Faulet (31):
  BUG/MINOR: init: Use a dynamic buffer to set HAPROXY_CFGFILES env variable
  BUG/MEDIUM: filters/htx: Fix data forwarding when payload length is 
unknown
  BUG/MINOR: stick-table: Always call smp_fetch_src() with a valid arg list
  BUG/MINOR: http-ana: Don't increment HTTP error counter on internal errors
  BUG/MEDIUM: mux-h1: Always set CS_FL_EOI for response in MSG_DONE state
  BUG/MINOR: tools: Fix a memory leak on error path in parse_dotted_uints()
  BUG/MINOR: server: Fix server-state-file-name directive
  CLEANUP: deinit: release global and per-proxy server-state variables on 
deinit
  BUG/MINOR: server: Don't call fopen() with server-state filepath set to 
NULL
  BUG/MINOR: server: Remove RMAINT from 

[ANNOUNCE] haproxy-2.2.11

2021-03-18 Thread Christopher Faulet



Hi,

HAProxy 2.2.11 was released on 2021/03/18. It added 31 new commits
after version 2.2.10.

This release comes few days after the 2.3.7 with the same aim, fix the
resolvers part. In the 2.2.10, some fixes about the cached records
expiration revealed design problems leading to pretty annoying bugs around
the SRV records. Servers based on SRV resolution may enter in a flapping
loop, switching them from UP state to DOWN and UP again at each
resolution. The current design has clearly shown its limitations and it is
already planned to refactor it. However, this will not take part of the
stable releases. Thus several patches were pushed to fix all known issues,
based on the current design. First the detection of cached additional
records was fixed, just like the record expiration loop, fixing this way the
flapping servers bug. The triggering of extra DNS resolutions to overcome
the lack of some additional records in a SRV response was improved, as the
ability to stop them. And finally, the time unit for cached records is now
the millisecond. It should mitigate some edge cases with short-live records
and heavy resolution frequencies. Thanks to the feedback of the community on
the 2.3.7, this all seems to be behind us now.

However note there remains performance issues because of the current design
that may trigger the watchdog if you use huge number of servers in a
server-template declaration. It is hard to say from how many servers a crash
may occur. This will be addressed with the resolvers refactoring. If
possible, some fixes will be backported in stable versions. But it might
required too huge change to be feasible. It is too early to know.

Note also there is still an identified bug in the listeners part, but not
fixed yet. It is a AB/BA deadlock problem that may occur when HAProxy is
stopped. Maciej also reports on the mailing list a 100% CPU usage bug for few
seconds on old processes after releads. Not sure it is the same bug. We are
still investigating.

Here are the list of other fixes :

  * An issue leading to possible infinite loops because of a double locking
effect in the mt lists was fixed by Olivier. If MT_LIST_TRY_ADDQ()
macro, it was possible to try to lock twice the same element, making the
second lock attempt to fail in loop. It happened when there is exactly
one element in the mt-list and we tried to add it again into the same
list.

  * The filters part was fixed to be sure the end analyzer (flt_end_analyse)
is always called for the request and the response, especially when the
request analysis is finished before the response start.

  * William fixed possible bugs about the listeners. Listeners are not
necessarily present when the client is an applet (peers, spoe, Lua) and
we need to be careful when updating counters. It was too hard to say
whether those could be triggered but there was at least one way
consisting in adding TCP rules to an SPOE backend.

  * Two bugs in the tcpchecks were fixed. The first one was about the
agent-checks when mixed with the health-checks. When a agent was setting
the server state, instead of updating the agent's .health threshold, the
health-check's one was updated. Thus the agent was competing with the
health-check and might mark a DOWN server as UP, while the precedence
should be on the health-checks. The second bug was a double free error
on invalid tcp-check/http-check rule.

  * Emeric fixed a ref counting bug into the stick-tables. Setting multiple
http track-scX rules generated never expiring entries. This bug was
introduced in the 2.2, when the http actions management was refactored.

  * Willy fixed a bug in the frequency counters because they were using the
thread's own time as the start of the current period leading to
non-monotonic updates in case of contention. See the commit message for
details. Now, freq counters rely on a global monotonic time.

Thanks to everyone for this release !

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.2/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.2.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.2.git
   Changelog: http://www.haproxy.org/download/2.2/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Baptiste Assmann (1):
  MINOR: resolvers: new function find_srvrq_answer_record()

Christopher Faulet (23):
  BUG/MINOR: hlua: Don't strip last non-LWS char in 
hlua_pushstrippedstring()
  BUG/MEDIUM: filters: Set CF_FL_ANALYZE on channels when filters are 
attached
  BUG/MINOR: tcpcheck: Update .health

Re: [2.2.9] 100% CPU usage

2021-03-17 Thread Christopher Faulet

Le 16/03/2021 à 13:46, Maciej Zdeb a écrit :
Sorry for spam. In the last message I said that the old process (after reload) 
is consuming cpu for lua processing and that's not true, it is processing other 
things also.


I'll take a break. ;) Then I'll verify if the issue exists on 2.3 and maybe 2.4 
branch. For each version I need a week or two to be sure the issue does not 
occur. :(


If 2.3 and 2.4 behaves the same way the 2.2 does, I'll try to confirm if there 
is any relation between infinite loops and custom configuration:

- lua scripts (mainly used for header generation/manipulation),
- spoe (used for sending metadata about each request to external service),
- peers (we have a cluster of 12 HAProxy servers connected to each other).


Hi Maciej,

Just to let you know we identified a potential AB/BA lock problem when HAProxy 
is stopping a listener. At least on the 2.2. The listeners code has changed in 
the 2.3 but we have no idea if the bug can be triggered too on 2.3 or 2.4. 
However, we have not found the time to investigate more for now. So, stay tuned.


--
Christopher Faulet



[ANNOUNCE] haproxy-2.3.7

2021-03-16 Thread Christopher Faulet

Hi,

HAProxy 2.3.7 was released on 2021/03/16. It added 62 new commits
after version 2.3.6.

This release is mainly about two subjects : The fix of bugs into the
resolvers part, mainly revealed since the last release and several
scalability improvements backported from the development version.

In the 2.3.6, some fixes in the resolvers part about the cached records
expiration revealed design problems leading to pretty annoying bugs around
the SRV records. Servers based on SRV resolution might enter in a flapping
loop, switching them from UP state to DOWN and UP again at each
resolution. The current design has clearly shown its limitations and it is
already planned to refactor it. However, this will not take part of the
stable releases. Thus several patches were pushed to (let's hope so) fix all
known issues, based on the current design. First the detection of cached
additional records was fixed, just like the record expiration loop, fixing
this way the flapping servers bug. The triggering of extra DNS resolutions
to overcome the lack of some additional records in a SRV response was
improved, as the ability to stop them. And finally, the time unit for cached
records is now the millisecond. It should mitigate some edge cases with
short-live records and heavy resolution frequencies. Note that all of this
is also true for the 2.2 and a new release will be emitted soon, once this
one will have proved it reliability.

A few of the scalability fixes related to issues 922 and 958 that arrived
into 2.4-dev10 and 2.4-dev11 were now backported, as planned. These roughly
fall into 3 main categories :

  * architecture-specific atomic operations: support for LSE instructions on
armv8.1-a that has been in testing since November and found to be
necessary to avoid crashes on 16+ cores on arm64 machines

  * general locking cost reduction: some global lists were still heavily
updated (like streams, actconns, buffers), often causing excessive
contention and even rare panics under too many threads. These lists were
however never used in any particular order and could be were split by
thread.

  * SSL batching limits to tag heavy handshakes and avoid them excessively
delaying all other processing (mixed traffic is much smoother now)

While not as good as 2.4, this managed to substantially stabilize 2.3 for
large CPUs. After a few weeks, we'll also consider some of these patches for
2.2 so that one LTS version can work better on large machines.

Of course, as usual, there are also some other fixes here and there. Mainly :

 * The filters part was fixed to be sure the end analyzer (flt_end_analyse) is
   always called for the request and the response, especially when the request
   analysis is finished before the response start.

 * William fixed possible bugs about the listeners. Listeners are not
   necessarily present when the client is an applet (peers, spoe, Lua) and
   we need to be careful when updating counters. It was too hard to say
   whether those could be triggered but there was at least one way
   consisting in adding TCP rules to an SPOE backend.

 * Two bugs in the tcpchecks were fixed. The first one was about the
   agent-checks when mixed with the health-checks. When a agent was setting the
   server state, instead of updating the agent's .health threshold, the
   health-check's one was updated. Thus the agent was competing with the
   health-check and might mark a DOWN server as UP, while the precedence should
   be on the health-checks. The second bug was a double free error on invalid
   tcp-check/http-check rule.

 * Emeric fixed a ref counting bug into the stick-tables. Setting multiple
   http track-scX rules generated never expiring entries. This bug was
   introduced in the 2.2, when the http actions management was refactored.


Thanks to everyone for this release !


Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.3/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.3.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.3.git
   Changelog: http://www.haproxy.org/download/2.3/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Amaury Denoyelle (1):
  BUG/MINOR: backend: fix condition for reuse on mode HTTP

Baptiste Assmann (1):
  MINOR: resolvers: new function find_srvrq_answer_record()

Christopher Faulet (23):
  BUG/MINOR: hlua: Don't strip last non-LWS char in 
hlua_pushstrippedstring()
  BUG/MEDIUM: filters: Set CF_FL_ANALYZE on channels when filters are 
attached
  BUG/MINOR: tcpcheck: Update .health threshold of agent inside an 
agent-check
  BUG/MINOR

Re: [2.2.9] 100% CPU usage

2021-03-05 Thread Christopher Faulet

Le 05/03/2021 à 11:35, Maciej Zdeb a écrit :

Hi Christopher,

Thanks, I'll check but it'll take a couple days because the issue is quite rare. 
I'll return with feedback!


Maybe the patch is not backported to 2.2 because of commit message that states 
only 2.3 branch?




That's it. And it was finally backported in 2.2 and 2.1.

--
Christopher Faulet



Re: [2.2.9] 100% CPU usage

2021-03-04 Thread Christopher Faulet

Le 04/03/2021 à 14:01, Maciej Zdeb a écrit :

Hi,

Sometimes after HAProxy reload it starts to loop infinitely, for example 9 of 10 
threads using 100% CPU (gdb sessions attached). I've also dumped the core file 
from gdb.



Hi Maciej,

The 2.2.1O is out. But I'm afraid that a fix is missing. Could you test with the 
attached patch please ? On top of the 2.2.9 or 2.2.10, as you want.


Thanks,
--
Christopher Faulet
>From 6406528c25ec73ff0a47696ac2decde95867fdda Mon Sep 17 00:00:00 2001
From: Olivier Houchard 
Date: Thu, 18 Feb 2021 23:55:30 +0100
Subject: [PATCH] BUG/MEDIUM: lists: Avoid an infinite loop in
 MT_LIST_TRY_ADDQ().

In MT_LIST_TRY_ADDQ(), deal with the "prev" field of the element before the
"next". If the element is the first in the list, then its next will
already have been locked when we locked list->prev->next, so locking it
again will fail, and we'll start over and over.

This should be backported to 2.3.

(cherry picked from commit 5567f41d0ab61dd6843535edc8081407d599024d)
Signed-off-by: Christopher Faulet 
(cherry picked from commit 6f682bea6ab08830d17ef3e973be6cc4d2474e69)
Signed-off-by: Christopher Faulet 
---
 include/haproxy/list.h | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/haproxy/list.h b/include/haproxy/list.h
index 548c9b0f52..6233b2c79f 100644
--- a/include/haproxy/list.h
+++ b/include/haproxy/list.h
@@ -300,28 +300,28 @@
 			__ha_barrier_store();  \
 			continue;  \
 		}  \
-		n2 = _HA_ATOMIC_XCHG(>next, MT_LIST_BUSY); \
-		if (n2 != el) { /* element already linked */   \
-			if (n2 != MT_LIST_BUSY)\
-el->next = n2; \
+		p2 = _HA_ATOMIC_XCHG(>prev, MT_LIST_BUSY); \
+		if (p2 != el) {\
+			if (p2 != MT_LIST_BUSY)\
+el->prev = p2; \
 			p->next = n;   \
 			__ha_barrier_store();  \
 			lh->prev = p;  \
 			__ha_barrier_store();  \
-			if (n2 == MT_LIST_BUSY)\
+			if (p2 == MT_LIST_BUSY)\
 continue;  \
 			break; \
 		}  \
-		p2 = _HA_ATOMIC_XCHG(>prev, MT_LIST_BUSY); \
-		if (p2 != el) {\
-			if (p2 != MT_LIST_BUSY)\
-el->prev = p2; \
+		n2 = _HA_ATOMIC_XCHG(>next, MT_LIST_BUSY); \
+		if (n2 != el) { /* element already linked */   \
+			if (n2 != MT_LIST_BUSY)\
+el->next = n2; \
 			p->next = n;   \
-			el->next = el; \
+			el->prev = el; \
 			__ha_barrier_store();  \
 			lh->prev = p;  \
 			__ha_barrier_store();  \
-			if (p2 == MT_LIST_BUSY)\
+			if (n2 == MT_LIST_BUSY)\
 continue;  \
 			break; \
 		}  \
-- 
2.29.2



[ANNOUNCE] haproxy-2.2.10

2021-03-03 Thread Christopher Faulet
e to consider such samples as safe by reading one byte after the
 buffer.

   - The HTTP return action was fixed when configured on the response
 side. The server response status code, if any, was used instead of the
 one of the HTTP return action.

   - Address assignment for address-less servers was fixed to be able to
 handle IPv4 and IPv6 by setting the right address family.

   - The set-dst action was buggy. The original port was not set for IPv4
 address. It was a problem for UNIX sockets, the port was not set to 0
 and had an undefined value.

   - The processing of the HTTP originalto action was fixed to only consider
 the destination address. The address family of the source address was
 tested and not the destination one.

   - Not a fix but an improvement. Thanks to Amaury, a connection header may
 now be specified on "http-check send" lines. This way, it is possible to
 implement a websocket upgrade check.

Thanks to everyone for this release !

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.2/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.2.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.2.git
   Changelog: http://www.haproxy.org/download/2.2/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Amaury Denoyelle (2):
  MINOR: check: do not ignore a connection header for http-check send
  BUG/MINOR: backend: hold correctly lock when killing idle conn

Baptiste Assmann (1):
  BUG/MINOR: resolvers: new callback to properly handle SRV record errors

Christopher Faulet (25):
  BUG/MINOR: http-ana: Don't increment HTTP error counter on internal errors
  BUG/MEDIUM: mux-h1: Always set CS_FL_EOI for response in MSG_DONE state
  BUG/MINOR: tools: Fix a memory leak on error path in parse_dotted_uints()
  BUG/MINOR: server: Fix server-state-file-name directive
  CLEANUP: deinit: release global and per-proxy server-state variables on 
deinit
  BUG/MINOR: server: Don't call fopen() with server-state filepath set to 
NULL
  BUG/MINOR: server: Remove RMAINT from admin state when loading server 
state
  BUG/MINOR: sample: Always consider zero size string samples as unsafe
  BUG/MEDIUM: spoe: Resolve the sink if a SPOE logs in a ring buffer
  BUG/MINOR: http-rules: Always replace the response status on a return 
action
  BUG/MINOR: server: Init params before parsing a new server-state line
  BUG/MINOR: server: Be sure to cut the last parsed field of a server-state 
line
  BUG/MEDIUM: mux-h1: Fix handling of responses to CONNECT other than 200-ok
  BUG/MINOR: resolvers: Fix condition to release received ARs if not 
assigned
  BUG/MINOR: resolvers: Only renew TTL for SRV records with an additional 
record
  BUG/MEDIUM: resolvers: Reset server address and port for obselete SRV 
records
  BUG/MEDIUM: resolvers: Reset address for unresolved servers
  CLEANUP: muxes: Remove useless if condition in show_fd function
  BUG/MINOR: mux-h1: Immediately report H1C errors from h1_snd_buf()
  BUG/MINOR: http-ana: Only consider dst address to process originalto 
option
  BUG/MINOR: tcp-act: Don't forget to set the original port for IPv4 
set-dst rule
  BUG/MINOR: connection: Use the client's dst family for adressless servers
  BUG/MEDIUM: spoe: Kill applets if there are pending connections and nbthread 
> 1
  DOC: spoe: Add a note about fragmentation support in HAProxy
  BUG/MINOR: http-ana: Don't increment HTTP error counter on read 
error/timeout

Dragan Dosen (2):
  BUG/MINOR: sample: secure convs that accept base64 string and var name as 
args
  BUG/MEDIUM: vars: make functions vars_get_by_{name,desc} thread-safe

Emeric Brun (1):
  CLEANUP: channel: fix comment in ci_putblk.

Eric Salama (1):
  BUG/MINOR: ssl: potential null pointer dereference in ckchs_dup()

Ilya Shipitsin (4):
  BUILD: ssl: fix typo in HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT macro
  BUILD: ssl: guard SSL_CTX_add_server_custom_ext with special macro
  BUILD: ssl: guard SSL_CTX_set_msg_callback with SSL_CTRL_SET_MSG_CALLBACK 
macro
  BUILD: ssl: introduce fine guard for OpenSSL specific SCTL functions

Tim Duesterhus (2):
  MINOR: Configure the `cpp` userdiff driver for *.[ch] in .gitattributes
  BUG/MINOR: mux-h2: Fix typo in scheme adjustment

William Dauchy (2):
  BUG/MINOR: server: re-align state file fields number
  DOC: tune: explain the origin of block size for ssl.cachesize

William Lallemand (1):
  BUG/MINOR: ssl/cli: potential null pointer d

"[ANNOUNCE] haproxy-2.3.6

2021-03-03 Thread Christopher Faulet
incremented, resulting
in occasional slightly off values.

  - The smp_is_safe() function was fixed. Zero size string samples were not
systematically considered as unsafe. In some circumstances, it was
possible to consider such samples as safe by reading one byte after the
buffer.

  - The HTTP return action was fixed when configured on the response
side. The server response status code, if any, was used instead of the
one of the HTTP return action.

  - The matching on the "no-maint" option in the stats page was buggy,
hiding maintenance servers when "norefresh" option was used.

  - Address assignment for address-less servers was fixed to be able to
handle IPv4 and IPv6 by setting the right address family.

  - The set-dst action was buggy. The original port was not set for IPv4
address. It was a problem for UNIX sockets, the port was not set to 0
and had an undefined value.

  - The processing of the HTTP originalto action was fixed to only consider
the destination address. The address family of the source address was
tested and not the destination one.

  - Not a fix but an improvement. Thanks to Amaury, a connection header may
now be specified on "http-check send" lines. This way, it is possible to
implement a websocket upgrade check.

Thanks to everyone for this release !

Please find the usual URLs below :
   Site index   : http://www.haproxy.org/
   Discourse: http://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Wiki : https://github.com/haproxy/wiki/wiki
   Sources  : http://www.haproxy.org/download/2.3/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.3.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.3.git
   Changelog: http://www.haproxy.org/download/2.3/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Amaury Denoyelle (3):
  MINOR: check: do not ignore a connection header for http-check send
  BUG/MINOR: backend: hold correctly lock when killing idle conn
  BUG/MINOR: stats: fix compare of no-maint url suffix

Baptiste Assmann (1):
  BUG/MINOR: resolvers: new callback to properly handle SRV record errors

Christopher Faulet (25):
  BUG/MINOR: http-ana: Don't increment HTTP error counter on internal errors
  BUG/MEDIUM: mux-h1: Always set CS_FL_EOI for response in MSG_DONE state
  BUG/MINOR: tools: Fix a memory leak on error path in parse_dotted_uints()
  BUG/MINOR: server: Fix server-state-file-name directive
  CLEANUP: deinit: release global and per-proxy server-state variables on 
deinit
  BUG/MINOR: server: Don't call fopen() with server-state filepath set to 
NULL
  BUG/MINOR: server: Remove RMAINT from admin state when loading server 
state
  BUG/MINOR: sample: Always consider zero size string samples as unsafe
  BUG/MEDIUM: spoe: Resolve the sink if a SPOE logs in a ring buffer
  BUG/MINOR: http-rules: Always replace the response status on a return 
action
  BUG/MINOR: server: Init params before parsing a new server-state line
  BUG/MINOR: server: Be sure to cut the last parsed field of a server-state 
line
  BUG/MEDIUM: mux-h1: Fix handling of responses to CONNECT other than 200-ok
  BUG/MINOR: resolvers: Fix condition to release received ARs if not 
assigned
  BUG/MINOR: resolvers: Only renew TTL for SRV records with an additional 
record
  BUG/MEDIUM: resolvers: Reset server address and port for obselete SRV 
records
  BUG/MEDIUM: resolvers: Reset address for unresolved servers
  CLEANUP: muxes: Remove useless if condition in show_fd function
  BUG/MINOR: mux-h1: Immediately report H1C errors from h1_snd_buf()
  BUG/MINOR: http-ana: Only consider dst address to process originalto 
option
  BUG/MINOR: tcp-act: Don't forget to set the original port for IPv4 
set-dst rule
  BUG/MINOR: connection: Use the client's dst family for adressless servers
  BUG/MEDIUM: spoe: Kill applets if there are pending connections and nbthread 
> 1
  DOC: spoe: Add a note about fragmentation support in HAProxy
  BUG/MINOR: http-ana: Don't increment HTTP error counter on read 
error/timeout

Dragan Dosen (2):
  BUG/MINOR: sample: secure convs that accept base64 string and var name as 
args
  BUG/MEDIUM: vars: make functions vars_get_by_{name,desc} thread-safe

Emeric Brun (1):
  CLEANUP: channel: fix comment in ci_putblk.

Eric Salama (1):
  BUG/MINOR: ssl: potential null pointer dereference in ckchs_dup()

Ilya Shipitsin (4):
  BUILD: ssl: fix typo in HAVE_SSL_CTX_ADD_SERVER_CUSTOM_EXT macro
  BUILD: ssl: guard SSL_CTX_add_server_custom_ext with special macro
  BUILD: ssl: guard SSL_CTX_set_msg_callback with SSL_CTRL_SET_MSG_CALLBACK 
macro
  BUILD: ssl: introduce fine gu

Re: [PATCH 1/2] CLEANUP: Use ist2(const void*, size_t) whenever possible

2021-03-01 Thread Christopher Faulet

Le 28/02/2021 à 16:11, Tim Duesterhus a écrit :

Refactoring performed with the following Coccinelle patch:

 @@
 struct ist i;
 expression p, l;
 @@

 - i.ptr = p;
 - i.len = l;
 + i = ist2(p, l);


Both merged, thanks !

--
Christopher Faulet



Re: [PATCH] BUG/MEDIUM: contrib/prometheus-exporter: fix segfault in listener name dump

2021-02-24 Thread Christopher Faulet

Le 25/02/2021 à 00:53, William Dauchy a écrit :

We need to check whether listener is empty before doing anything; in
that case, we were trying to dump listerner name while name is null. So
simply move the counters check above, which validate all possible cases
when the listener is empty. This is very similar to what is done in
stats.c

see also the trace:

   Thread 1 "haproxy" received signal SIGSEGV, Segmentation fault.
   __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
   120     ../sysdeps/x86_64/multiarch/../strlen.S: No such file or directory.
   (gdb) bt
   #0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:120
   #1  0x555b716b in promex_dump_listener_metrics (htx=0x558fadf0, 
appctx=0x55926070) at contrib/prometheus-exporter/service-prometheus.c:722
   #2  promex_dump_metrics (htx=0x558fadf0, si=0x55925920, 
appctx=0x55926070) at contrib/prometheus-exporter/service-prometheus.c:1200
   #3  promex_appctx_handle_io (appctx=0x55926070) at 
contrib/prometheus-exporter/service-prometheus.c:1477
   #4  0x556f0c94 in task_run_applet (t=0x55926180, 
context=0x55926070, state=) at src/applet.c:88
   #5  0x556bc6d8 in run_tasks_from_lists 
(budgets=budgets@entry=0x7fffe374) at src/task.c:548
   #6  0x556bd1a0 in process_runnable_tasks () at src/task.c:750
   #7  0x55696cdd in run_poll_loop () at src/haproxy.c:2870
   #8  0x55697025 in run_thread_poll_loop (data=data@entry=0x0) at 
src/haproxy.c:3035
   #9  0x55596c90 in main (argc=, argv=0x7fffe818) 
at src/haproxy.c:3723
   quit)

this bug was introduced by commit
e3f7bd5ae9e969cbfe87e4130d06bff7a3e814c6 ("MEDIUM:
contrib/prometheus-exporter: add listen stats"), which is present for
2.4 only, so no backport needed.



Thanks William, now merged !

--
Christopher Faulet



Re: [PATCH 1/3] DOC: contrib/prometheus-exporter: remove htx reference

2021-02-19 Thread Christopher Faulet

Le 18/02/2021 à 23:05, William Dauchy a écrit :

now that htx is the default everywhere, we can remove the need to put
htx as a mandatory option to setup prometheus.



Thanks William, the series is now merged !

--
Christopher Faulet



Re: ring buffer + spoe = segfault

2021-02-19 Thread Christopher Faulet

Le 19/02/2021 à 09:31, Maciej Zdeb a écrit :

Hi,

By accident I discovered that when enabling logging with line "log 
ring@spoe-ring local2" in SPOE immediate segmentation fault occurs.


Steps to reproduce (configuration files attached):

# execute spoa-server from contrib directory:
./spoa -d -p 8080

# execute haproxy:
haproxy -f haproxy.cfg

# trigger:
curl http://localhost:8000 <http://localhost:8000>


# haproxy -v
HA-Proxy version 2.2.5-34b2b10 2020/11/05 - https://haproxy.org/ 
<https://haproxy.org/>

Status: long-term supported branch - will stop receiving fixes around Q2 2025.
Known bugs: http://www.haproxy.org/bugs/bugs-2.2.5.html 
<http://www.haproxy.org/bugs/bugs-2.2.5.html>
Running on: Linux 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 
x86_64


Kind regards,
Maciej


Maciej,

I pushed a fix (https://github.com/haproxy/haproxy/commit/1d7d0f8). Not 
backported yet.


Thanks !
--
Christopher Faulet



Re: ring buffer + spoe = segfault

2021-02-19 Thread Christopher Faulet

Le 19/02/2021 à 09:31, Maciej Zdeb a écrit :

Hi,

By accident I discovered that when enabling logging with line "log 
ring@spoe-ring local2" in SPOE immediate segmentation fault occurs.


Steps to reproduce (configuration files attached):

# execute spoa-server from contrib directory:
./spoa -d -p 8080

# execute haproxy:
haproxy -f haproxy.cfg

# trigger:
curl http://localhost:8000 <http://localhost:8000>


# haproxy -v
HA-Proxy version 2.2.5-34b2b10 2020/11/05 - https://haproxy.org/ 
<https://haproxy.org/>

Status: long-term supported branch - will stop receiving fixes around Q2 2025.
Known bugs: http://www.haproxy.org/bugs/bugs-2.2.5.html 
<http://www.haproxy.org/bugs/bugs-2.2.5.html>
Running on: Linux 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 
x86_64


Kind regards,
Maciej


Maciej,

Thanks for the report. Indeed, there is no post parsing resolution in the SPOE 
to find the sink corresponding to the ring buffer. It also fails on the 2.4-dev. 
I'll fix it soon.


--
Christopher Faulet



Re: [PATCH] MINOR: cli: add missing agent commands for set server

2021-02-18 Thread Christopher Faulet

Le 15/02/2021 à 17:22, William Dauchy a écrit :

we previously forgot to add `agent-*` commands.
Take this opportunity to rewrite the help string in a simpler way for
readability (mainly removing simple quotes)



Now merged, thanks !

--
Christopher Faulet



Re: [PATCH]: BUILD/MEDIUM da pcre2 support

2021-02-18 Thread Christopher Faulet

Le 16/02/2021 à 12:47, David Carlier a écrit :

Hi,

Here a patch proposal to update the DeviceAtlas Detection build in order to 
support the pcre2 library as option.


Would be nice if backported back to the supported stable releases.



Now merged, thanks !

--
Christopher Faulet



Re: [PATCH 0/3] prometheus: add listen stats

2021-02-15 Thread Christopher Faulet

Le 14/02/2021 à 23:22, William Dauchy a écrit :

Hello Christopher,

I know I'm a bit late regarding the merge window; this is however the
logical followup of my prometheus work, now adding listen stats.
patch 2/3 is a proposition but I'm ok to revisit.

William Dauchy (3):
   MEDIUM: stats: allow to select one field in `stats_fill_li_stats`
   MINOR: stats: add helper to get status string
   MEDIUM: contrib/prometheus-exporter: add listen stats

  contrib/prometheus-exporter/README|  24 +-
  .../prometheus-exporter/service-prometheus.c  | 274 --
  include/haproxy/listener-t.h  |   9 +
  include/haproxy/listener.h|   3 +
  include/haproxy/stats.h   |   2 +-
  reg-tests/contrib/prometheus.vtc  |   4 +
  src/hlua_fcn.c|   3 +-
  src/listener.c|  18 ++
  src/stats.c   | 166 +++
  9 files changed, 365 insertions(+), 138 deletions(-)



Thanks William, now merged, including the changes discussed offlist.

--
Christopher Faulet



Re: [PATCH 1/2] CLEANUP: check: fix get_check_status_info declaration

2021-02-15 Thread Christopher Faulet

Le 14/02/2021 à 22:26, William Dauchy a écrit :

we always put a \n between function name and `{`

Signed-off-by: William Dauchy 
---
  src/check.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/check.c b/src/check.c
index d37a55201..d17f747fe 100644
--- a/src/check.c
+++ b/src/check.c
@@ -178,8 +178,8 @@ const char *get_check_status_description(short 
check_status) {
  }
  
  /* Converts check_status code to short info */

-const char *get_check_status_info(short check_status) {
-
+const char *get_check_status_info(short check_status)
+{
const char *info;
  
  	if (check_status < HCHK_STATUS_SIZE)




Thanks William, both merged !

--
Christopher Faulet



Re: [PATCH] fix freebsd ci, update freebsd image

2021-02-12 Thread Christopher Faulet

Le 11/02/2021 à 19:35, Илья Шипицин a écrit :

Hello,

attached patch fix freebsd builds.

Ilya


Thanks, now merged !

--
Christopher Faulet



Re: [PATCH v3 0/5] cli commands for checks and agent

2021-02-12 Thread Christopher Faulet

Le 11/02/2021 à 22:51, William Dauchy a écrit :

Hello Christopher,

Here is some work to finish you week.
I believe I addressed all the points raised:
- warning are no longer emitted when we have "0" or "-" values
- I enhanced the warning output as well, and understood my mistake for
   my previous CLEANUP patch removing a space... so I removed this patch
   as well.
- Fixed all the chunk_appendf issues.
- I was a bit lazy to address the partial vs complete failure in parsing
   as I was a bit puzzled about the approach to take. I think it would be
   sad to duplicate the code for pre validation, but on the other hand I
   agree it was clear to assume the whole line was not applied at all. I
   however concluded it was acceptable to simply know the line was not
   fully applied. I believe in that case the user should worry. We can
   probably add a way to show where it stopped, but I felt discouraged
   with the srv resolution, in the middle of srv port, where it is harder
   to have a kinda generic way to know where we stopped.

William Dauchy (5):
   MEDIUM: cli: add check-addr command
   MEDIUM: cli: add agent-port command
   MEDIUM: server: add server-states version 2
   MEDIUM: server: support {check,agent}_addr, agent_port in server state
   MINOR: server: enhance error precision when applying server state



Ok, it is good for me. I will push it soon. Thanks William! And don't be too 
worry about the loading of server-state files. This part is a mess and should be 
refactored, at least to be readable and also to fix bugs on error path. I'm 
tempted to do so and a bit afraid too.


I just slightly amended the 3rd patch to handle the v2 in apply_server_state(). 
There is a test on the version when a state-file is local to a proxy. Just a 
minor change. And in the last one, I removed the "chunk_appendf(msg, "\n");" to 
move the LF in ha_warning() calls.


--
Christopher Faulet



Re: [PATCH] cleanup: remove unused variable assignment

2021-02-11 Thread Christopher Faulet

Le 10/02/2021 à 09:09, Илья Шипицин a écrit :

Hello,

this is pure cleanup.
should fix #1048



Thanks Ilya, now merged !

--
Christopher Faulet



Re: Issues with d13afbcce5e664f9cfe797eee8c527e5fa947f1b (haproxy-2.2) "mux-h1: Don't set CS_FL_EOI too early for protocol upgrade requests"

2021-02-10 Thread Christopher Faulet

Le 08/02/2021 à 14:31, Christian Ruppert a écrit :

Hi list, Christopher,

we're having issues with the mentioned commit / patch:
d13afbcce5e664f9cfe797eee8c527e5fa947f1b
https://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=d13afbcce5e664f9cfe797eee8c527e5fa947f1b

I can also reproduce it with 2.2.9 as well as 2.3.5. I don't have any
useful details yet, just the our Jira fails to load.
A curl against the site seams to work fine while browser requests
(chrome / firefox) seem to timeout or at least some.

See the attached log. The first 3 requests seem to be fine so far. Then,
much later, there's a 504 between more 200s.
I'm not sure yet why the other 200s there seem to wait / are logged
after the actual timeout happens. According to chrome's F12 there are
more requests still pending.
Ignore the 503 there. That seems to be an unrelated problem, since this
also happends with a working HAProxy.

Much later, the site loaded, sometimes broken though.

I'll try to prepare a config snipped if required.

Is there anything know already?



Hi,

Thanks to information that Christian provided me offlist, I've finally found and 
fixed the bug. The corresponding commit is :


commit a22782b597ee9a3bfecb18a66e29633c8e814216
Author: Christopher Faulet 
Date:   Mon Feb 8 17:18:01 2021 +0100

BUG/MEDIUM: mux-h1: Always set CS_FL_EOI for response in MSG_DONE state

During the message parsing, if in MSG_DONE state, the CS_FL_EOI flag must
always be set on the conn-stream if following conditions are met :

  * It is a response or
  * It is a request but not a protocol upgrade nor a CONNECT.

For now, there is no test on the message type (request or response). Thus
the CS_FL_EOI flag is not set for a response with a "Connection: upgrade"
header but not a 101 response.

This bug was introduced by the commit 3e1748bbf ("BUG/MINOR: mux-h1: Don't
set CS_FL_EOI too early for protocol upgrade requests"). It was backported
as far as 2.0. Thus, this patch must also be backported as far as 2.0.

However, it is not backported yet. Thanks Christian !
--
Christopher Faulet



Re: [PATCH v2 0/6] cli commands for checks and agent

2021-02-10 Thread Christopher Faulet

Le 08/02/2021 à 23:53, William Dauchy a écrit :

Hello Christopher,

Here is the v2 addressing the points raised yesterday.
The patch 4/6 clearly looks scary but I made sure to not change anything
crazy apart from adding support for a version 2. I will probably start
to dream about a server-state-file burning every night.
I hope this will be good enough, but otherwise, don't hesitate to put
more comments, I will come back with a v3.

William Dauchy (6):
   MEDIUM: cli: add check-addr command
   MEDIUM: cli: add agent-port command
   BUG/MEDIUM: server: re-align state file fields number
   MEDIUM: server: add server-states version 2
   MEDIUM: server: support {check,agent}_addr, agent_port in server state
   CLEANUP: server: add missing space in server-state error output

  doc/management.txt|  15 +-
  include/haproxy/server-t.h|  16 +-
  .../checks/1be_40srv_odd_health_checks.vtc|   2 +-
  .../checks/40be_2srv_odd_health_checks.vtc|   2 +-
  reg-tests/checks/4be_1srv_health_checks.vtc   |   6 +-
  src/proxy.c   |  41 +-
  src/server.c  | 929 ++
  7 files changed, 573 insertions(+), 438 deletions(-)



William,

Thanks for this new series. There are problems in the 5th patch. When a
server-state file is loaded, if health-check or agent-check are disabled for
a server, a warning is always emitted, independently on the parameters.
Especially if the parameters are both unset (NULL or "-" for addresses and
NULL and "0" for port). The format of this warning is also not really
convenient. While the warning emitted on the CLI is good, when the
server-state file is loaded, we have such warning :

[WARNING] 040/150958 (3623566) : server-state application failed for server 
'back-http/www' health checks are not enabled on this server.
agent checks are not enabled on this server.

I suppose it also explains your last patch ("CLEANUP: server: add missing
space in server-state error output").

In addition, the warning is a bit misleading because most of the server
state is loaded at this stage. Note this comment is also true for the
warnings about the SRV resolution. I guess we should instead emit a
different warning depending the server state is not loaded at all or
partially loaded with the list of skipped part. But at least if the warning
is well formatted, it should be good for this patch. Having 2 types of
warnings may be handled in another patch.

Finally, there is a more annoying bug, but pretty easy to fix. You must be
careful to not use chunk_appendf() with a variable format string (2nd arg).
Just like with the printf family functions, you must not call it with a
non-constant format string. Here, there is a problem if I manually update
the server-state file to set "%s%s%s%s%s%s" as health-check or agent-check
address.

Thus replace this line :

chunk_appendf(msg, warning);

by

chunk_appendf(msg, "%s", warning);

or better :

chunk_strcat(msg, warning);

To conclude, I may merge the first 4 patches if you want, but I guess you
will have to adapt the first ones to produce better warnings. So I will
wait your confirmation to proceed. However, I will merge the bug fix.
There is no reason to not take it.

--
Christopher Faulet



Re: Issues with d13afbcce5e664f9cfe797eee8c527e5fa947f1b (haproxy-2.2) "mux-h1: Don't set CS_FL_EOI too early for protocol upgrade requests"

2021-02-08 Thread Christopher Faulet

Le 08/02/2021 à 15:03, Christian Ruppert a écrit :

On 2021-02-08 14:46, Christopher Faulet wrote:

Le 08/02/2021 à 14:31, Christian Ruppert a écrit :

Hi list, Christopher,

we're having issues with the mentioned commit / patch:
d13afbcce5e664f9cfe797eee8c527e5fa947f1b
https://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=d13afbcce5e664f9cfe797eee8c527e5fa947f1b

I can also reproduce it with 2.2.9 as well as 2.3.5. I don't have any
useful details yet, just the our Jira fails to load.
A curl against the site seams to work fine while browser requests
(chrome / firefox) seem to timeout or at least some.

See the attached log. The first 3 requests seem to be fine so far.
Then,
much later, there's a 504 between more 200s.
I'm not sure yet why the other 200s there seem to wait / are logged
after the actual timeout happens. According to chrome's F12 there are
more requests still pending.
Ignore the 503 there. That seems to be an unrelated problem, since
this
also happends with a working HAProxy.

Much later, the site loaded, sometimes broken though.

I'll try to prepare a config snipped if required.

Is there anything know already?



Thanks Christian,  I'll take a look. Could you confirm or inform it
happens only with requests with a "Connection: upgrade" header ?


This frontend doesn't have H2 enabled explicit. I'm not really sure but
it looks like some of those delayed requests don't have the upgrade
header.



It is not related to the H2. The mentioned commit is about HTTP/1.1 protocol
upgrades, like WebsScket. But if some requests are delayed without any
"connection: upgrade" header, it is a bit strange.

If it is easily reproducible, it may be useful to enable the H1 mux traces.
To enable it, just use the following commands on the CLI socket :

echo "trace h1 sink buf0; trace h1 level developer; trace h1 verbosity complete; 
trace h1 start now" | socat - /path/to/cli-socket

This will capture detailed info on the H1 mux internals. Then you can later
check for captured events using "show events buf0". If you want to collect
them as they are logged, and archive them, you can use the following command :

(echo "show events buf0 -w -n" ; cat ) | socat - /path/to/cli-socket > file.log

If you do so, please send me privately the generated log file. In the mean
time, the config may help.

Thanks,
--
Christopher Faulet



Re: Issues with d13afbcce5e664f9cfe797eee8c527e5fa947f1b (haproxy-2.2) "mux-h1: Don't set CS_FL_EOI too early for protocol upgrade requests"

2021-02-08 Thread Christopher Faulet

Le 08/02/2021 à 14:31, Christian Ruppert a écrit :

Hi list, Christopher,

we're having issues with the mentioned commit / patch:
d13afbcce5e664f9cfe797eee8c527e5fa947f1b
https://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=d13afbcce5e664f9cfe797eee8c527e5fa947f1b

I can also reproduce it with 2.2.9 as well as 2.3.5. I don't have any
useful details yet, just the our Jira fails to load.
A curl against the site seams to work fine while browser requests
(chrome / firefox) seem to timeout or at least some.

See the attached log. The first 3 requests seem to be fine so far. Then,
much later, there's a 504 between more 200s.
I'm not sure yet why the other 200s there seem to wait / are logged
after the actual timeout happens. According to chrome's F12 there are
more requests still pending.
Ignore the 503 there. That seems to be an unrelated problem, since this
also happends with a working HAProxy.

Much later, the site loaded, sometimes broken though.

I'll try to prepare a config snipped if required.

Is there anything know already?



Thanks Christian,  I'll take a look. Could you confirm or inform it happens only 
with requests with a "Connection: upgrade" header ?


--
Christopher Faulet



Re: [PATCH 0/6] cli commands coherency

2021-02-08 Thread Christopher Faulet

Le 06/02/2021 à 20:47, William Dauchy a écrit :

Hello,

This is a followup from last week cleaning regarding check and agent
check. This patch series brings some more coherency on the CLI side. I
also put some minor cleaning.

William Dauchy (6):
   CLEANUP: check: fix some typo in comments
   CLEANUP: tools: typo in `strl2irc` mention
   MEDIUM: cli: add check-addr command
   MEDIUM: cli: add agent-port command
   MEDIUM: server: support {check,agent}_addr, agent_port in server state
   CLEANUP: server: add missing space in server-state error output

  doc/management.txt|  15 +-
  include/haproxy/server-t.h|   9 +-
  .../checks/1be_40srv_odd_health_checks.vtc|   2 +-
  .../checks/40be_2srv_odd_health_checks.vtc|   2 +-
  reg-tests/checks/4be_1srv_health_checks.vtc   |   6 +-
  src/check.c   |  18 +-
  src/proxy.c   |  41 ++--
  src/server.c  | 192 ++
  src/tools.c   |   2 +-
  9 files changed, 213 insertions(+), 74 deletions(-)



Thanks William ! I merged the first two patches. For the others, I've some 
questions/comments.


First, there is a test to be sure the agent-check is enabled before updating the 
agent address and/or port. Do you think it should also be done for the 
health-check? Because, for now, it is possible to set an address on a disabled 
(or even not configured) health-check. But it is not possible for an 
agent-check. Given that it is not possible to enable/disable an agent-check or 
an health-check from the CLI, I guess the warning for both makes sense.


Then, there are some problems when the server-state file is loaded. The trash 
chunk used by srv_update_state() is overwritten when 
update_server_agent_addr_port() is called. It is not obvious, but the 
get_trash_chunk() function returns cyclically one of two trash chunks. 
srv_update_state() uses a trash chunk. When called, 
update_server_check_addr_port() uses the other one. So, 
update_server_agent_addr_port() re-uses the first trash chunk, the same than 
srv_update_state(). A solution may be to allocate a dedicated chunk in 
srv_update_state(), using alloc_trash_chunk().


A more annoying problem happens when an old state file is loaded (prior these 
changes). Last parameters are not defined, params[18] (check-addr), params[19] 
(agent-addr) and params[20] (agent-port) are set to NULL. Thus, HAProxy crashes 
when the health-check address is compared to "-". In fact, this comes from the 
SRV_STATE_FILE_NB_FIELDS_VERSION_1 macro. It should be set to 24. You added 3 
new fields but just incremented it by 1. Hum... No, there is also another 
problem. In srv_state_parse_line(). SRV_STATE_FILE_NB_FIELDS_VERSION_1 must be 
equal to SRV_STATE_FILE_MAX_FIELDS. Or better, it should really reflect the 
argument number of the version 1, so 21. And it must be compared to srv_arg 
instead of arg. It is a bug since the 1.8 (commit 316947196).


However, this means with these changes, a cold restart is required. In the 2.4, 
5 new fields were added to the server-state file. Is it expected to perform a 
cold restart when upgrading to the 2.4 from a prior major version ? If so, I'm 
ok. But, it may be a good idea to change the state file version then to not 
silently ignore the state file. Otherwise, I guess it is possible to make these 
5 new fields optional, isn't it ?


William, I'm sorry if I'm not really clear but the subject is a bit foggy for me 
:) Do you feel confident to handle all the changes ?


--
Christopher Faulet



Re: [PATCH 1/2] MINOR: contrib/prometheus-exporter: use stats desc when possible followup

2021-02-08 Thread Christopher Faulet

Le 07/02/2021 à 20:42, William Dauchy a écrit :

Remove remaining descrition which are common to stats.c.

This patch is a followup of commit
82b2ce2f967d967139adb7afab064416fadad615 ("MINOR:
contrib/prometheus-exporter: use stats desc when possible"). I probably
messed up with one of my rebase because I'm pretty sure I removed them
at some point, but who knows what happened.

Signed-off-by: William Dauchy 
---
  .../prometheus-exporter/service-prometheus.c  | 35 ---
  1 file changed, 35 deletions(-)

diff --git a/contrib/prometheus-exporter/service-prometheus.c 
b/contrib/prometheus-exporter/service-prometheus.c
index 126962f5e..769389735 100644
--- a/contrib/prometheus-exporter/service-prometheus.c
+++ b/contrib/prometheus-exporter/service-prometheus.c
@@ -284,42 +284,7 @@ const struct promex_metric 
promex_st_metrics[ST_F_TOTAL_FIELDS] = {
  
  /* Description of overriden stats fields */

  const struct ist promex_st_metric_desc[ST_F_TOTAL_FIELDS] = {
-   [ST_F_PXNAME] = IST("The proxy name."),
-   [ST_F_SVNAME] = IST("The service name (FRONTEND for frontend, 
BACKEND for backend, any name for server/listener)."),
-   [ST_F_QCUR]   = IST("Current number of queued requests."),
-   [ST_F_QMAX]   = IST("Maximum observed number of queued 
requests."),
-   [ST_F_SCUR]   = IST("Current number of active sessions."),
-   [ST_F_SMAX]   = IST("Maximum observed number of active 
sessions."),
-   [ST_F_SLIM]   = IST("Configured session limit."),
-   [ST_F_STOT]   = IST("Total number of sessions."),
-   [ST_F_BIN]= IST("Current total of incoming bytes."),
-   [ST_F_BOUT]   = IST("Current total of outgoing bytes."),
-   [ST_F_DREQ]   = IST("Total number of denied requests."),
-   [ST_F_DRESP]  = IST("Total number of denied responses."),
-   [ST_F_EREQ]   = IST("Total number of request errors."),
-   [ST_F_ECON]   = IST("Total number of connection errors."),
-   [ST_F_ERESP]  = IST("Total number of response errors."),
-   [ST_F_WRETR]  = IST("Total number of retry warnings."),
-   [ST_F_WREDIS] = IST("Total number of redispatch warnings."),
[ST_F_STATUS] = IST("Current status of the service, per state label 
value."),
-   [ST_F_WEIGHT] = IST("Service weight."),
-   [ST_F_ACT]= IST("Current number of active servers."),
-   [ST_F_BCK]= IST("Current number of backup servers."),
-   [ST_F_CHKFAIL]= IST("Total number of failed check (Only counts 
checks failed when the server is up)."),
-   [ST_F_CHKDOWN]= IST("Total number of UP->DOWN transitions."),
-   [ST_F_LASTCHG]= IST("Number of seconds since the last UP<->DOWN 
transition."),
-   [ST_F_DOWNTIME]   = IST("Total downtime (in seconds) for the 
service."),
-   [ST_F_QLIMIT] = IST("Configured maxqueue for the server (0 meaning 
no limit)."),
-   [ST_F_PID]= IST("Process id (0 for first instance, 1 for second, 
...)"),
-   [ST_F_IID]= IST("Unique proxy id."),
-   [ST_F_SID]= IST("Server id (unique inside a proxy)."),
-   [ST_F_THROTTLE]   = IST("Current throttle percentage for the server, 
when slowstart is active, or no value if not in slowstart."),
-   [ST_F_LBTOT]  = IST("Total number of times a service was selected, 
either for new sessions, or when redispatching."),
-   [ST_F_TRACKED]= IST("Id of proxy/server if tracking is 
enabled."),
-   [ST_F_TYPE]   = IST("Service type (0=frontend, 1=backend, 2=server, 
3=socket/listener)."),
-   [ST_F_RATE]   = IST("Current number of sessions per second over last 
elapsed second."),
-   [ST_F_RATE_LIM]   = IST("Configured limit on new sessions per 
second."),
-   [ST_F_RATE_MAX]   = IST("Maximum observed number of sessions per 
second."),
[ST_F_CHECK_STATUS]   = IST("Status of last health check, per state label 
value."),
[ST_F_CHECK_CODE] = IST("layer5-7 code, if available of the last health 
check."),
[ST_F_CHECK_DURATION] = IST("Total duration of the latest server health 
check, in seconds."),



Both merged, thanks William !

--
Christopher Faulet



Re: [PATCH] typo fixes

2021-02-08 Thread Christopher Faulet

Le 06/02/2021 à 18:30, Илья Шипицин a écrit :

Hello,

another cleanup.



Now merged. Thanks !

--
Christopher Faulet



Re: [PATCH v2 0/5] fix check port/addr consistency

2021-02-04 Thread Christopher Faulet

Le 03/02/2021 à 12:19, William Dauchy a écrit :

On Wed, Feb 3, 2021 at 9:59 AM Christopher Faulet  wrote:

At first glance, I'm just a bit annoyed with the patch 5. In the documentation,
it is stated that "addr" option will be used for agent-check too. And there is
no info about interactions between "addr" and "agent-addr" options when both are
configured. For me, for an agent-check, the "agent-addr" option must be used in
priority, regardless the options order. If not defined, the "addr" option must
be used, if defined. And at the end, we use the server address by default if
none is specified.


ok I missed that part. it is a bit sad honestly, it makes things less explicit.


There is another problem with "port" and "agent-port" options. It is mentioned
in the documentation that "agent-port" is required to perform agent-checks. And
"port" option is not used for agent-check. It is not really consistent. I
propose to deal with port/agent-port options in the same way than
addr/agent-addr ones. And we keep the test to be sure to have a dedicated port
for the agent-check to not use the server's one. This way, we keep backward
compatibility and improve consistency.
Thus, if you agree with these changes, I guess we should keep SRV_F_AGENTADDR
flag and add SRV_F_AGENTPORT.


ok I will come back with a v3. I honestly don't like to have port/addr
used for agent, as stated previously because it is creating some non
explicit things. But indeed as we need to keep backward compat, I will
fix that.


About the CLI. I agree, the "check-addr" parameter on the "set server" command
must be added. And the "agent-port" parameter must bee added too. For me, it is
a consistency matter. But I understand it is also mandatory for dynamic
environments.


thanks, let me come back with a v3, so we can move forward for the
next improvements.



For the record, the series was finally fixed and pushed. Thanks William !

--
Christopher Faulet



Re: [PATCH v3 0/5] fix check port/addr consistency

2021-02-04 Thread Christopher Faulet

Le 03/02/2021 à 22:30, William Dauchy a écrit :

Hello Christopher,

Here is my last update on port/addr consistency. I addressed all the
point you mentioned. I hope I did not forgot anything.  I will come back
with `check-addr` and `agent-port` on the cli once those patches are
accepted.

William Dauchy (5):
   BUG/MINOR: cli: fix set server addr/port coherency with health checks
   MEDIUM: server: adding support for check_port in server state
   MEDIUM: check: remove checkport checkaddr flag
   BUG/MINOR: check: consitent way to set agentaddr
   MEDIUM: check: align agentaddr and agentport behaviour



Thanks William, it seems good. I've just a question (sorry :). When the server 
state file is loaded, why the check port is set only if there is a DNS 
resolution ? I know it was done this way before the support of check_port 
parameter in the server state. But I suppose that now we support it, it should 
always be set, isn't it ?


And is there any usage to add "agent-addr" in the server state file? Because it 
can also be set on the CLI. And later, same question will appear for 
"check-addr" and "agent-port".


I don't want to bother you again. So, I propose you to merge the first patch and 
to add a new one to not set the check port when the server state file is loaded. 
Then I can merge the third patch and amend the second one to move the check port 
assignment before merging it. And finally I can merge the fourth and fifth patches.


--
Christopher Faulet



Re: [PATCH v2 0/5] fix check port/addr consistency

2021-02-03 Thread Christopher Faulet

Le 02/02/2021 à 22:56, William Dauchy a écrit :

Hello Christopher,

As discussed, I revisited my previous series regarding check addr and
port consistency. I don't think I missed anything.

I won't hide my aim here, I would like to add support to set
`check.addr` on the cli like it is possible for the service port. More
and more people have a setup with containers having their own IP, but
with a health check on the host (e.g. on the consul side for example).
So it becomes more and more needed.

That's mostly why I wanted to clarify the situation first with the
different things I've seen while revisting this part of the code.



Thanks William, I will review the series soon.

At first glance, I'm just a bit annoyed with the patch 5. In the documentation, 
it is stated that "addr" option will be used for agent-check too. And there is 
no info about interactions between "addr" and "agent-addr" options when both are 
configured. For me, for an agent-check, the "agent-addr" option must be used in 
priority, regardless the options order. If not defined, the "addr" option must 
be used, if defined. And at the end, we use the server address by default if 
none is specified.


There is another problem with "port" and "agent-port" options. It is mentioned 
in the documentation that "agent-port" is required to perform agent-checks. And 
"port" option is not used for agent-check. It is not really consistent. I 
propose to deal with port/agent-port options in the same way than 
addr/agent-addr ones. And we keep the test to be sure to have a dedicated port 
for the agent-check to not use the server's one. This way, we keep backward 
compatibility and improve consistency.


Thus, if you agree with these changes, I guess we should keep SRV_F_AGENTADDR 
flag and add SRV_F_AGENTPORT.


About the CLI. I agree, the "check-addr" parameter on the "set server" command 
must be added. And the "agent-port" parameter must bee added too. For me, it is 
a consistency matter. But I understand it is also mandatory for dynamic 
environments.


Once that said, I don't really know how all of this is interacting with the 
servers state file. I don't really how this part works, so I may missed something.


--
Christopher Faulet



  1   2   3   4   5   6   >