Re: Clients occasionally see truncated responses
Thanks, Willy. I appreciate you taking the time to send a patch. Unfortunately, I'm still seeing the errors and multiple records being sent back to the clients.

When you say "I've just rechecked in the code and indeed, by default it will not limit", does that mean tune.idletimer allows the unlimited buffer size by default? The way I read the docs, there must be at least 1ms of idle time before the larger buffers are available (http://cbonte.github.io/haproxy-dconv/2.2/configuration.html#tune.idletimer). I'm curious whether there are some cases where they aren't getting used, although it doesn't seem likely.

Thanks for the info on streams.

On Fri, Apr 9, 2021 at 1:53 AM Willy Tarreau wrote:
> Hi Nathan,
>
> On Thu, Apr 08, 2021 at 06:15:10PM -0700, Nathan Konopinski wrote:
> > One other behavior I've observed, nginx has an ssl_buffer_size
> > (http://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_buffer_size)
> > option that defaults to 16kb. As I decrease its value, I start seeing the
> > same body size != content length errors from clients.
>
> Interesting, really, as it definitely confirms an issue with the way the
> client handles TLS.
>
> > I tried experimenting with tune.idletimer to force the use of large ssl
> > buffers (tune.ssl.maxrecord's default of no limit seems greater than
> > 16kb so I didn't bother adjusting it) but I'm still seeing errors.
>
> I've just rechecked in the code and indeed, by default it will not
> limit.
>
> > From my reading of the docs, a stream must be idle for 1s by default
> > before the larger ssl buffers are used. It also seems a stream must be
> > idle for at least 1ms before the large buffers are available. Is that
> > correct?
>
> I don't remember these details but the maxrecord is only an upper bound
> so if you didn't set it it won't have any effect.
>
> > Is there a way to always use the larger ssl buffers without this
> > detection?
>
> Based on your observation, I suspect that for whatever reason, the client
> is bogus and expects certain data to be located within a single SSL record,
> which is a violation of the standard, but bugs are present :-/
>
> What might very likely happen here is that some records are shorter
> than others just because the buffer wraps, and the openssl API doesn't
> provide a vector-based send function. So you can't say "send 1kB till
> the end of the buffer and loop back to the beginning for 3 extra kB".
> Instead you have to perform one 1kB write and a second 3kB write,
> resulting in two records.
>
> One workaround could be the following patch which always unwraps the
> buffer before sending, always resulting in the largest possible record:
>
> diff --git a/src/ssl_sock.c b/src/ssl_sock.c
> index f1b0668ab..2b47e6b10 100644
> --- a/src/ssl_sock.c
> +++ b/src/ssl_sock.c
> @@ -6071,6 +6071,7 @@ static size_t ssl_sock_to_buf(struct connection *conn, void *xprt_ctx, struct bu
>  static size_t ssl_sock_from_buf(struct connection *conn, void *xprt_ctx,
>                                  const struct buffer *buf, size_t count, int flags)
>  {
>  	struct ssl_sock_ctx *ctx = xprt_ctx;
> +	struct buffer aligned_buf = *buf;
>  	ssize_t ret;
>  	size_t try, done;
>
> @@ -6093,6 +6094,13 @@ static size_t ssl_sock_from_buf(struct connection *conn, void *xprt_ctx, const s
>  #endif
>
>  		try = b_contig_data(buf, done);
> +
> +		if (try < (b_data(buf) - done) && try < count) {
> +			b_slow_realign(&aligned_buf, trash.area, done);
> +			try = b_contig_data(&aligned_buf, done);
> +			buf = &aligned_buf;
> +		}
> +
>  		if (try > count)
>  			try = count;
>
> It's not pretty but it should do the job by making certain that every
> time the record size is limited by wrapping, the buffer is first
> unwrapped and the record is maximized. I'm also attaching it to help
> you apply it. It was made for 2.4-dev but I verified that it still
> applies (with an offset) on 2.2. I'd be interested in knowing if that
> addresses your problem.
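To make the wrap effect concrete, here is a small standalone C sketch of a ring buffer. The struct and function names below are simplified, hypothetical stand-ins for haproxy's struct buffer, b_contig_data() and b_slow_realign(); they only illustrate why wrapped data forces two writes (hence two SSL records) and how realigning first allows one maximal write.

```c
#include <string.h>

/* Simplified stand-in for haproxy's struct buffer: 'data' bytes stored
 * starting at offset 'head', wrapping around at 'size'. */
struct ring {
    char   *area;
    size_t  size;  /* total capacity */
    size_t  head;  /* offset of the first stored byte */
    size_t  data;  /* number of bytes stored */
};

/* Longest contiguous readable chunk -- the role b_contig_data() plays.
 * When the stored bytes wrap past the end of 'area', this is shorter
 * than 'data', and a send loop must issue two writes (two records). */
size_t ring_contig(const struct ring *r)
{
    size_t to_end = r->size - r->head;
    return r->data < to_end ? r->data : to_end;
}

/* Unwrap the buffer through a scratch area so everything is contiguous,
 * analogous to what the patch achieves with b_slow_realign() and the
 * trash chunk: afterwards one write can emit a single maximal record. */
void ring_realign(struct ring *r, char *scratch)
{
    size_t first = ring_contig(r);

    memcpy(scratch, r->area + r->head, first);          /* tail part    */
    memcpy(scratch + first, r->area, r->data - first);  /* wrapped part */
    memcpy(r->area, scratch, r->data);
    r->head = 0;
}
```

With an 8-byte buffer holding 5 bytes starting at offset 6, ring_contig() reports only 2 contiguous bytes before the realign and all 5 after it, matching the one-record-instead-of-two behavior the patch aims for.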
>
> > I'm not sure what a stream represents in this context.
>
> It's a bidirectional communication between a client and a server in
> the context of a single HTTP transaction. You can see this as the
> application level in haproxy, which deals with request/response
> forwarding between the two sides and which executes in turn all the
> analysers and filters corresponding to the rules in your config.
> You can have a look at doc/internals/muxes.png for a quick overview
> (the stream is the entity at the top which carries the request buffer).
>
> Willy
Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)
On Fri, Apr 09, 2021 at 10:14:26PM +0200, Christopher Faulet wrote:
> It seems you have a blocking call in one of your lua scripts. The threads
> dump shows many threads blocked in hlua_ctx_init. Many others are
> executing lua. Unfortunately, for an unknown reason, there is no stack
> traceback.

All of our Lua is string handling: parsing headers, building TXN variables, and a set of applets that return responses in cases where we don't want to go to a backend server because the response is simple enough to generate inside the LB.

> For the 2.3 and prior, the lua scripts are executed under a global lock.
> Thus blocking calls in a lua script are awful, because they do not block
> only one thread but all the others too. I guess the same issue exists on
> the 1.8, but there is no watchdog on this version. Thus, from time to time
> HAProxy hangs and may report huge latencies but, at the end, it recovers
> and continues to process data. It is exactly the purpose of the watchdog,
> reporting hidden bugs related to spinning loops and deadlocks.

Nothing in this Lua code should have blocking calls at all. The Lua code has zero calls to external services, no sockets, no sleeps, no print, no Map.new (a single call at Lua startup, not inside any applet or fetch), no usage of other packages, no file handling, no other IO. I'm hoping I can get $work to agree to fully open-source the Lua, so you can see this fact and review the code to confirm that it SHOULD be non-blocking.

> However, I may be wrong. It may be just a contention problem because you
> are executing lua with 64 threads and a huge workload. In this case, you
> may give a try to the 2.4 (under development). There is a way to have a
> separate lua context for each thread loading the scripts with the
> "lua-load-per-thread" directive. Out of curiosity, on the 1.8, are you
> running HAProxy with several threads or are you spawning several
> processes?
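For readers following along, the difference between the two Lua loading modes can be sketched in a global section. The directive names are taken from Christopher's description of 2.4; the script path is a placeholder:

```
global
    nbthread 64

    # pre-2.4 style: one shared Lua context for all threads, serialized
    # by a single global Lua lock
    # lua-load /etc/haproxy/request-handlers.lua

    # 2.4+ style: each thread loads its own copy of the script into an
    # independent context, so threads no longer contend on one lock
    lua-load-per-thread /etc/haproxy/request-handlers.lua
```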
nbthread=64, nbproc=1 on both 1.8/2.x.

Yes, we're hoping to try 2.4.x, just working on some parts to get there.

-- 
Robin Hugh Johnson
E-Mail     : robb...@orbis-terrarum.net
Home Page  : http://www.orbis-terrarum.net/?l=people.robbat2
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
Re: [PATCH] MINOR: sample: add json_string
Tim,

On 09.04.21 18:55, Tim Düsterhus wrote:
> Aleks,
>
> I have taken a first look. Find my remarks below. Please note that for
> the actual source code there might be further remarks by Willy (put in
> CC) or so. I might have missed something or might have told you something
> incorrect. So maybe before making changes wait for their opinion.

Thank you for your feedback.

> Generally I must say that I don't like the mjson library, because it uses
> 'int' for sizes. It doesn't really bring the point home that it is a safe
> library. This one looks much better to me:
> https://github.com/FreeRTOS/coreJSON. It does not support JSON path,
> though. Not sure how much of an issue that would be?

Well, I have created an issue in coreJSON about how to handle the "." in the key: https://github.com/FreeRTOS/coreJSON/issues/92

I chose the mjson library because it was small and offers the JSON path feature.

> On 4/8/21 10:21 PM, Aleksandar Lazic wrote:
> > From 7ecb80b1dfe37c013cf79bc5b5b1caa3c0112a6a Mon Sep 17 00:00:00 2001
> > From: Alekesandar Lazic
> > Date: Thu, 8 Apr 2021 21:42:00 +0200
> > Subject: [PATCH] MINOR: sample: add json_string
>
> I'd add 'converter' to the subject line to make it clear that this is a
> converter.
>
> > This sample get's the value of a JSON key
>
> Typo: It should be 'gets'.
>
> >  Makefile                                 |    3 +-
> >  doc/configuration.txt                    |   15 +
> >  include/import/mjson.h                   |  213 +
> >  reg-tests/sample_fetches/json_string.vtc |   25 +
> >  src/mjson.c                              | 1052 ++
> >  src/sample.c                             |   63 ++
> >  6 files changed, 1370 insertions(+), 1 deletion(-)
> >  create mode 100644 include/import/mjson.h
> >  create mode 100644 reg-tests/sample_fetches/json_string.vtc
> >  create mode 100644 src/mjson.c
> >
> > diff --git a/doc/configuration.txt b/doc/configuration.txt
> > index 01a01eccc..7f2732668 100644
> > --- a/doc/configuration.txt
> > +++ b/doc/configuration.txt
> > @@ -19043,6 +19043,21 @@ http_first_req : boolean
>
> This is the 'fetch' section. Move the documentation to the 'converter'
> section.
> >  from some requests when a request is not the first one, or to help
> >  grouping requests in the logs.
> >
> > +json_string(<json_path>) : string
>
> I don't like the name. A few suggestions:
> - json_query
> - json_get
> - json_decode

Maybe json_get_string, because there could be some more getters like bool, int, ...

> > +  Returns the string value of the given json path.
>
> It should be "JSON" (in uppercase) here and everywhere else.

Okay, and agreed.

> > +  The <json_path> is required.
> > +  This sample uses the mjson library https://github.com/cesanta/mjson
> > +  The json path syntax is defined in this repo
> > +  https://github.com/json-path/JsonPath
>
> Overall the description of the converter does not read nicely / feels
> inconsistent compared to other converters / uses colloquial language.
> Let me suggest something like:
>
>   Extracts the value located at <json_path> from the JSON string in the
>   input value. <json_path> must be a valid JsonPath string as defined at
>   https://goessner.net/articles/JsonPath/
>
> I changed the link, because that appears to be the canonical reference.

Okay.

> > +  Example :
>
> No space in front of the colon.
>
> > +      # get the value from the key kubernetes.io/serviceaccount/namespace
> > +      # => openshift-logging
> > +      http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
> > +
> > +      # get the value from the key iss
> > +      # => kubernetes/serviceaccount
> > +      http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.iss')
>
> I don't like that the example is that specific for Kubernetes usage. A
> more general example would be preferred, because it makes it easier to
> understand the concept.

The '$.iss' is the generic JWT field: https://tools.ietf.org/html/rfc7519#section-4.1, '"iss" (Issuer) Claim'. But maybe I could look for a "normal" JSON string and not only JWT.

> >  method : integer + string
> >    Returns an integer value corresponding to the method in the HTTP
> >    request. For example, "GET" equals 1 (check sources to establish the
> >    matching).
Value 9.

> > diff --git a/include/import/mjson.h b/include/import/mjson.h
> > new file mode 100644
> > index 0..ff46e7950
> > --- /dev/null
> > +++ b/include/import/mjson.h
> > @@ -0,0 +1,213 @@
> [...]
> > +// Aleksandar Lazic
> > +// git clone from 2021-08-04 because of this fix
> > +// https://github.com/cesanta/mjson/commit/7d8daa8586d2bfd599775f049f26d2645c25a8ee
>
> Please don't edit third party libraries, even if it is just a comment.
> This makes updating hard.

Okay.

> > diff --git a/reg-tests/sample_fetches/json_string.vtc b/reg-tests/sample_fetches/json_string.vtc
> > new file mode 100644
> > index 0..fc387519b
> > --- /dev/null
> > +++ b/reg-tests/sample_fetches/json_string.vtc
>
> Again, this is a converter. Move the test into the appropriate folder.
> And please make sure you understand the difference between fetches and
> converters.

Yeah, the difference between fetchers and converters in no
Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)
On 09/04/2021 at 19:26, Robin H. Johnson wrote:
> Hi,
>
> Maciej had said they were going to create a new thread, but I didn't see
> one yet.
>
> I want to start by noting the problem was much worse on 2.2.8 & 2.2.9,
> and that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a big
> thanks for that initial work in fixing the issue.
>
> As I mentioned in my other mail asking for a 1.8.30 release, we're
> experiencing this problem in DigitalOcean's HAProxy instances used to run
> the Spaces product.
>
> I've been trying to dig out deeper detail as well with a debug threads
> version, but I have the baseline error output from 2.3.9 here to share,
> after passing redaction review. This dump was generated with vbernat's
> PPA version of 2.3.9, not any internal builds.
>
> We have struggled to reproduce the problem in testing environments; it
> only turns up at the biggest regions, and plotting occurrences of the
> issue over the time dimension suggests that it might have some partial
> correlation with a weird workload input.
>
> The dumps do suggest Lua is implicated as well, and we've got some
> extensive Lua code, so it's impossible to rule it out as contributing to
> the problem (we have been discussing plans to move it to SPOA instead).
> The Lua code in question hasn't changed significantly in nearly 6 months,
> and it was problem-free on the 1.8 series (having a test suite for the
> Lua code has been invaluable).

Hi,

It seems you have a blocking call in one of your lua scripts. The threads dump shows many threads blocked in hlua_ctx_init. Many others are executing lua. Unfortunately, for an unknown reason, there is no stack traceback.

For the 2.3 and prior, the lua scripts are executed under a global lock. Thus blocking calls in a lua script are awful, because they do not block only one thread but all the others too. I guess the same issue exists on the 1.8, but there is no watchdog on this version. Thus, from time to time HAProxy hangs and may report huge latencies but, at the end, it recovers and continues to process data.
It is exactly the purpose of the watchdog, reporting hidden bugs related to spinning loops and deadlocks.

However, I may be wrong. It may be just a contention problem because you are executing lua with 64 threads and a huge workload. In this case, you may give a try to the 2.4 (under development). There is a way to have a separate lua context for each thread loading the scripts with the "lua-load-per-thread" directive. Out of curiosity, on the 1.8, are you running HAProxy with several threads or are you spawning several processes?

-- 
Christopher Faulet
Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)
Hi Robin,

On Fri, 9.04.2021 at 19:26, Robin H. Johnson wrote:
> Maciej had said they were going to create a new thread, but I didn't see
> one yet.
>
> I want to start by noting the problem was much worse on 2.2.8 & 2.2.9,
> and that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a big
> thanks for that initial work in fixing the issue.

After Christopher's patch using another function (one that does not allocate memory) to dump the Lua stack trace, the situation is much better. It used 100% CPU only once, and only one thread; however, HAProxy didn't kill itself like in your situation. Since then the issue has not occurred again, so I don't have anything useful to share. :(

I'm using 2.2.11 HAProxy but will upgrade next week to 2.3; I'll report if anything interesting pops out.

> We have struggled to reproduce the problem in testing environments; it
> only turns up at the biggest regions, and plotting occurrences of the
> issue over the time dimension suggests that it might have some partial
> correlation with a weird workload input.

On our side the issue occurred on a machine with an unusual workload: hundreds of thousands of TCP connections and very few HTTP requests/responses. We also use Lua.

Kind regards,
Maciej
Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)
Hi,

Maciej had said they were going to create a new thread, but I didn't see one yet.

I want to start by noting the problem was much worse on 2.2.8 & 2.2.9, and that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a big thanks for that initial work in fixing the issue.

As I mentioned in my other mail asking for a 1.8.30 release, we're experiencing this problem in DigitalOcean's HAProxy instances used to run the Spaces product.

I've been trying to dig out deeper detail as well with a debug threads version, but I have the baseline error output from 2.3.9 here to share, after passing redaction review. This dump was generated with vbernat's PPA version of 2.3.9, not any internal builds.

We have struggled to reproduce the problem in testing environments; it only turns up at the biggest regions, and plotting occurrences of the issue over the time dimension suggests that it might have some partial correlation with a weird workload input.

The dumps do suggest Lua is implicated as well, and we've got some extensive Lua code, so it's impossible to rule it out as contributing to the problem (we have been discussing plans to move it to SPOA instead). The Lua code in question hasn't changed significantly in nearly 6 months, and it was problem-free on the 1.8 series (having a test suite for the Lua code has been invaluable).

-- 
Robin Hugh Johnson
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85

Attachment: 20210409_haproxy-crashlogs.REDACTED.txt.gz (binary data)
Re: Request for new 1.8.x release due to freq counter bug
"Robin H. Johnson" wrote: > Hi, > Wondering if you could make a new 1.8.x release to get the fix for the > freq counter bug into more systems? > dde80e111 BUG/MEDIUM: time: make sure to always initialize the global tick > 491b86ed0 BUG/MEDIUM: freq_ctr/threads: use the global_now_ms variable > It's a problem for anybody using counters for ratelimiting purposes, who > can't yet upgrade to 2.x series due to other issues encountered there > (the 100% cpu bug even on the latest 2.3.9, that I have logs I'm hoping > to post when they are approved by $work) Hi, I will try to emit new releases next week for 2.0 and 1.8 which are impacted by the bug you mentionned. -- Amaury Denoyelle
Re: [PATCH] MINOR: sample: add json_string
Aleks,

I have taken a first look. Find my remarks below. Please note that for the actual source code there might be further remarks by Willy (put in CC) or so. I might have missed something or might have told you something incorrect. So maybe before making changes wait for their opinion.

Generally I must say that I don't like the mjson library, because it uses 'int' for sizes. It doesn't really bring the point home that it is a safe library. This one looks much better to me: https://github.com/FreeRTOS/coreJSON. It does not support JSON path, though. Not sure how much of an issue that would be?

On 4/8/21 10:21 PM, Aleksandar Lazic wrote:
> From 7ecb80b1dfe37c013cf79bc5b5b1caa3c0112a6a Mon Sep 17 00:00:00 2001
> From: Alekesandar Lazic
> Date: Thu, 8 Apr 2021 21:42:00 +0200
> Subject: [PATCH] MINOR: sample: add json_string

I'd add 'converter' to the subject line to make it clear that this is a converter.

> This sample get's the value of a JSON key

Typo: It should be 'gets'.

>  Makefile                                 |    3 +-
>  doc/configuration.txt                    |   15 +
>  include/import/mjson.h                   |  213 +
>  reg-tests/sample_fetches/json_string.vtc |   25 +
>  src/mjson.c                              | 1052 ++
>  src/sample.c                             |   63 ++
>  6 files changed, 1370 insertions(+), 1 deletion(-)
>  create mode 100644 include/import/mjson.h
>  create mode 100644 reg-tests/sample_fetches/json_string.vtc
>  create mode 100644 src/mjson.c
>
> diff --git a/doc/configuration.txt b/doc/configuration.txt
> index 01a01eccc..7f2732668 100644
> --- a/doc/configuration.txt
> +++ b/doc/configuration.txt
> @@ -19043,6 +19043,21 @@ http_first_req : boolean

This is the 'fetch' section. Move the documentation to the 'converter' section.

>    from some requests when a request is not the first one, or to help
>    grouping requests in the logs.
>
> +json_string(<json_path>) : string

I don't like the name. A few suggestions:
- json_query
- json_get
- json_decode

> +  Returns the string value of the given json path.

It should be "JSON" (in uppercase) here and everywhere else.

> +  The <json_path> is required.
> +  This sample uses the mjson library https://github.com/cesanta/mjson
> +  The json path syntax is defined in this repo
> +  https://github.com/json-path/JsonPath

Overall the description of the converter does not read nicely / feels inconsistent compared to other converters / uses colloquial language. Let me suggest something like:

  Extracts the value located at <json_path> from the JSON string in the
  input value. <json_path> must be a valid JsonPath string as defined at
  https://goessner.net/articles/JsonPath/

I changed the link, because that appears to be the canonical reference.

> +  Example :

No space in front of the colon.

> +      # get the value from the key kubernetes.io/serviceaccount/namespace
> +      # => openshift-logging
> +      http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
> +
> +      # get the value from the key iss
> +      # => kubernetes/serviceaccount
> +      http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.iss')

I don't like that the example is that specific for Kubernetes usage. A more general example would be preferred, because it makes it easier to understand the concept.

>  method : integer + string
>    Returns an integer value corresponding to the method in the HTTP
>    request. For example, "GET" equals 1 (check sources to establish the
>    matching). Value 9
>
> diff --git a/include/import/mjson.h b/include/import/mjson.h
> new file mode 100644
> index 0..ff46e7950
> --- /dev/null
> +++ b/include/import/mjson.h
> @@ -0,0 +1,213 @@
[...]
> +// Aleksandar Lazic
> +// git clone from 2021-08-04 because of this fix
> +// https://github.com/cesanta/mjson/commit/7d8daa8586d2bfd599775f049f26d2645c25a8ee

Please don't edit third party libraries, even if it is just a comment. This makes updating hard.

> diff --git a/reg-tests/sample_fetches/json_string.vtc b/reg-tests/sample_fetches/json_string.vtc
> new file mode 100644
> index 0..fc387519b
> --- /dev/null
> +++ b/reg-tests/sample_fetches/json_string.vtc

Again, this is a converter. Move the test into the appropriate folder. And please make sure you understand the difference between fetches and converters.

> @@ -0,0 +1,25 @@
> +varnishtest "Test to get value from json sting"
> +#REQUIRE_VERSION=2.4
> +
> +feature ignore_unknown_macro
> +
> +haproxy h1 -conf {
> +    defaults
> +        mode http
> +        timeout connect 1s
> +        timeout client 1s
> +        timeout server 1s
> +
> +    frontend fe1
> +        bind "fd@${fe1}"
> +        http-request set-var(sess.iss) req.hdr(Authorization),b64dec,json_string('$.iss')
> +        http-request return status 200 hdr x-var "json-value=%[var(sess.iss)]"
> +} -start
> +
> +client c1 -connect ${h1_fe1_sock} {
> +    txreq -req GET -url "/" \
> +        -hdr "Authorization: eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwi
[ANNOUNCE] haproxy-2.4-dev16
Hi,

HAProxy 2.4-dev16 was released on 2021/04/09. It added 37 new commits after version 2.4-dev15.

This one is particularly calm, I even hesitated between making it or not. But there are a few updates that may affect configuration so I figured it was better to emit a new one.

A large part of the patch is essentially caused by the renaming of a few atomic ops that were causing some confusion. Now HA_ATOMIC_FOO() will be a void statement, so that if you want to read from the location you use either HA_ATOMIC_FETCH_FOO() or HA_ATOMIC_FOO_FETCH() (pre- or post-fetch). The output code shouldn't change however, and given that it was essentially sed-work, as soon as it started to work I was confident in it.

A few changes in the FD code to clean up that messy fdtab structure cause another noticeable part of the diff. I obviously managed to break something once (hence the BUG/MAJOR fix) but now it's OK. Mistakes at this level are never forgiving anyway: either it fully works or it fully fails.

The nice part that makes me think we're progressively approaching what could look like the release is that Emeric finally performed the few changes in the DNS and log code. For the DNS, the TCP servers in the "resolvers" section do not need to be referred to as "server" anymore; they are "nameserver" just like the other ones, except that you can mention "tcp@" in front of their address to force them to be TCP nameservers. No more configuration mixing there. And for the log servers, similarly, now you can specify "tcp@" in front of an address on a "log" statement, and it will automatically create the ring for you with default settings. Previously it was still required to manually declare the ring; I found this too cumbersome, and Emeric figured how to handle this.

The rest is essentially small bug fixes and code cleanups from a bunch of contributors.
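The pre-/post-fetch distinction described above mirrors the standard compiler builtins. Here is a minimal standalone C sketch using the GCC/Clang __atomic builtins; the mapping to the HA_ATOMIC_* macro names in the comments is my reading of the announcement, not haproxy code:

```c
static int counter = 40;

/* Returns the value *before* the addition, i.e. the "pre-fetch" form
 * that HA_ATOMIC_FETCH_ADD() is described as providing. */
int fetch_then_add(int v)
{
    return __atomic_fetch_add(&counter, v, __ATOMIC_SEQ_CST);
}

/* Returns the value *after* the addition, i.e. the "post-fetch" form
 * corresponding to HA_ATOMIC_ADD_FETCH(). */
int add_then_fetch(int v)
{
    return __atomic_add_fetch(&counter, v, __ATOMIC_SEQ_CST);
}

/* The void-statement form, as the plain HA_ATOMIC_ADD() is said to
 * become: the updated value is simply discarded. */
void add_only(int v)
{
    (void)__atomic_add_fetch(&counter, v, __ATOMIC_SEQ_CST);
}
```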
Now speaking about the pending stuff I'm still aware of:

- I'd like to rename LIST_ADD/LIST_ADDQ to LIST_INSERT/LIST_APPEND (or
  maybe LIST_ADD_HEAD/LIST_ADD_TAIL?). I've already been trapped into
  using LIST_ADD() when I wanted to append, and I know the confusion is
  easy. That's just a quick "sed" once we know what we want.

- I identified some undesired cache line sharing around some pointers
  that slows down FD operations and possibly other stuff. I see how to
  arrange this, it just needs to be done (and tested on MacOS, since we
  noticed once that section names differ there).

- we've had a recent discussion about the opportunity to import SLZ and
  to always enable compression by default using it if no other lib is
  specified. I think it could be useful for some users (especially those
  for whom memory usage is important). I'll have a look at this, maybe
  next week; that's only two files to include.

- regarding the quick discussion two weeks ago about optimization for
  modern ARM CPUs, I saw that one solution could be to build using gcc
  10.2+, which outlines their atomics into functions. That's ugly but
  the performance impact is small (about 3% in my tests), while it
  provides a tremendous improvement for many-core machines. But if we
  rely on this I'll probably add two new CPU entries to force the use of
  only an old one (v8.0) or only a modern one (v8.2), so that those who
  build themselves can shave the last few percent.

- no progress made on the default-path, but we'll have to reread the
  issue to figure the best choice. I'd like to see it done for the
  release to ease config packaging and deployments.

- I'd like to add a warning when we detect that nbthread is forced to a
  higher value than the number of bound CPUs. It's not the first time
  that I see this in a config, and the result is catastrophic, so a
  warning is deserved. It just needs to be set at the right place.
- the shared memory pools cleanup must be completed before the release,
  as the situation is not good for those with a slow malloc() function
  at the moment. I know what to do, I just need to stay focused on it.

- the date & time cleanups would be nice to have for the long term and
  are not particularly hard to do, so better finish them before 2.4.

- Tim sent a patch series to implement URI normalization. That's
  something I'd like to see merged if possible, as it may improve
  security for some users, and at least improve reliability for others.

- I also noticed Alek's mjson import with a new converter, but didn't
  have a look yet. Maybe it will open opportunities for more converters;
  that's definitely something which deserves being considered before the
  release.

- Amaury has almost finished some work to improve auto-binding on NUMA
  systems. Right now using a dual-socket machine is common and painful
  at the same time. Having threads enabled by default doesn't improve
  the situation there at all due to the slow communication between the
  CPUs. In his tests he could
Re: [RFC PATCH 0/8] URI normalization / Issue #714
On 08/04/2021 at 20:59, Tim Duesterhus wrote:
> Willy, Christopher,
>
> Not sure which of you is better suited to review this series, so I'm
> adding both of you :-)
>
> I'm tagging this as RFC, because it's large and quite a bit outside of my
> comfort zone. However the patches are as clean as possible. They include
> full documentation, and each normalizer comes with a bunch of reg-tests
> ensuring they behave like I expect them to behave. So if you have nothing
> to complain about, then this series is in a mergeable state.
>
> I plan to add a few more normalizers after this passes an initial review.
> I'll add additional remarks to each patch, explaining my decisions in
> more detail.

Thanks Tim, I'll try to review your patches next week.

-- 
Christopher Faulet
Re: haproxy 2.4 opentracing using x-b3-traceid header
I've tried to add a header carrying the context name, but with no luck (I think it's done after OT reads the request).

Here is an example of the headers that I receive:

X-B3-TraceId: 48485a3953bb6124
X-B3-ParentSpanId: 48485a3953bb6124
X-B3-SpanId: 42e1e27066118383
X-B3-Sampled: 1
X-B3-Flags: 0

Here is the documentation about propagation (there are more formats for propagation, but this is the most common one): https://github.com/openzipkin/b3-propagation

Thanks

On Fri, Apr 9, 2021 at 13:52, Miroslav Zagorac <miros...@zagorac.name> wrote:
> On 04/09/2021 01:40 PM, Andrea Bonini wrote:
> > Hi Zaga,
> > thank you for the explanation. My problem is that the external service
> > sends the span info in headers in the x-b3-* format (the opentracing
> > default one). For reading it, which extract context must I configure?
> > If I don't understand wrong, if I want to read the x-b3-* headers I
> > must use a context with an empty name, which is not allowed.
>
> Hello Andrea,
>
> in this case the ability to accept a context with an empty name should
> be added to the OpenTracing filter. I will think about it. If this is
> not a problem, send the header content for one such example (only those
> headers related to OpenTracing, with complete name and content).
>
> Thank you for the question and comment.
>
> Best regards,
>
> --
> Zaga
>
> What can change the nature of a man?

-- 
Bonini Andrea
Re: haproxy 2.4 opentracing using x-b3-traceid header
On 04/09/2021 01:40 PM, Andrea Bonini wrote:
> Hi Zaga,
> thank you for the explanation. My problem is that the external service
> sends the span info in headers in the x-b3-* format (the opentracing
> default one). For reading it, which extract context must I configure? If
> I don't understand wrong, if I want to read the x-b3-* headers I must use
> a context with an empty name, which is not allowed.

Hello Andrea,

in this case the ability to accept a context with an empty name should be added to the OpenTracing filter. I will think about it. If this is not a problem, send the header content for one such example (only those headers related to OpenTracing, with complete name and content).

Thank you for the question and comment.

Best regards,

-- 
Zaga

What can change the nature of a man?
Re: haproxy 2.4 opentracing using x-b3-traceid header
Hi Zaga,

thank you for the explanation. My problem is that the external service sends the span info in headers in the x-b3-* format (the opentracing default one). For reading it, which extract context must I configure? If I don't understand wrong, if I want to read the x-b3-* headers I must use a context with an empty name, which is not allowed.

Thanks again

On Fri, Apr 9, 2021 at 10:43, Miroslav Zagorac <miros...@zagorac.name> wrote:
> On 04/08/2021 01:48 PM, Andrea Bonini wrote:
> > Hi everyone,
> >
> > i have a question about the opentracing addon. Is there a way to use
> > header info about a trace created by an outside service?
> >
> > What I want is to use the x-b3-traceid header from an incoming request
> > as the traceid, and use the spanid as the parentid.
> >
> > Thanks
>
> Hello Andrea,
>
> you are interested in propagating the span context from one service to
> another. This can be done by setting up additional http headers in the
> service that created the tracing process that contain the span context,
> which is then read in another process that wants to add its data. This
> can be done via the 'inject' and 'extract' keywords in the opentracing
> filter configuration. In this way they can automatically set up and
> read the contents of the headers over which the span context is
> transmitted.
>
> In the test directory there is one example that uses such a mode of
> operation, where two haproxy processes are started, the first of which
> passes the span context to the second. The example is run via the shell
> script test/run-fe-be.sh and thus starts the frontend haproxy process
> (fe) which passes the span context to the backend haproxy process (be).
> Configurations are located in the test/fe and test/be directories.
>
> For more explanations read the README file from the ot directory.
>
> Best regards,
>
> --
> Zaga
>
> What can change the nature of a man?

-- 
Bonini Andrea
Request for new 1.8.x release due to freq counter bug
Hi,

Wondering if you could make a new 1.8.x release to get the fix for the
freq counter bug into more systems?

dde80e111 BUG/MEDIUM: time: make sure to always initialize the global tick
491b86ed0 BUG/MEDIUM: freq_ctr/threads: use the global_now_ms variable

It's a problem for anybody using counters for rate-limiting purposes
who can't yet upgrade to the 2.x series due to other issues encountered
there (e.g. the 100% CPU bug, seen even on the latest 2.3.9, for which
I have logs I'm hoping to post once they are approved by $work).

-- 
Robin Hugh Johnson
E-Mail : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85

signature.asc
Description: PGP signature
Re: Clients occasionally see truncated responses
Hi Nathan,

On Thu, Apr 08, 2021 at 06:15:10PM -0700, Nathan Konopinski wrote:
> One other behavior I've observed, nginx has an ssl_buffer_size (
> http://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_buffer_size)
> option that defaults to 16kb. As I decrease its value, I start seeing the
> same body size != content length errors from clients.

Interesting, really, as it definitely confirms an issue with the way the
client handles TLS.

> I tried experimenting with tune.idletimer to force the use of large ssl
> buffers (tune.ssl.maxrecord's default of no limit seems > than 16kb so I
> didn't bother adjusting it) but I'm still seeing errors.

I've just rechecked in the code and indeed, by default it will not
limit.

> From my reading of the docs, a stream must be idle for 1s by default
> before the larger ssl buffers are used. It also seems a stream must be
> idle for at least 1ms before the large buffers are available. Is that
> correct?

I don't remember these details, but maxrecord is only an upper bound, so
if you didn't set it, it won't have any effect.

> Is there a way to always use the larger ssl buffers without this
> detection?

Based on your observation, I suspect that for whatever reason the client
is bogus and expects certain data to be located within a single SSL
record, which is a violation of the standard, but bugs do happen :-/

What very likely happens here is that some records are shorter than
others simply because the buffer wraps, and the OpenSSL API doesn't
provide a vector-based send function. So you can't say "send 1kB till
the end of the buffer and loop back to the beginning for 3 extra kB".
Instead you have to perform one 1kB write and a second 3kB write,
resulting in two records.
One workaround could be the following patch, which always unwraps the
buffer before sending, resulting in the largest possible record:

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index f1b0668ab..2b47e6b10 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -6071,6 +6071,7 @@ static size_t ssl_sock_to_buf(struct connection *conn, void *xprt_ctx, struct bu
 static size_t ssl_sock_from_buf(struct connection *conn, void *xprt_ctx, const struct buffer *buf, size_t count, int flags)
 {
 	struct ssl_sock_ctx *ctx = xprt_ctx;
+	struct buffer aligned_buf = *buf;
 	ssize_t ret;
 	size_t try, done;
 
@@ -6093,6 +6094,13 @@ static size_t ssl_sock_from_buf(struct connection *conn, void *xprt_ctx, const s
 #endif
 
 	try = b_contig_data(buf, done);
+
+	if (try < (b_data(buf) - done) && try < count) {
+		b_slow_realign(&aligned_buf, trash.area, done);
+		try = b_contig_data(&aligned_buf, done);
+		buf = &aligned_buf;
+	}
+
 	if (try > count)
 		try = count;

It's not pretty, but it should do the job by making certain that every
time the record size is limited by wrapping, the buffer is first
unwrapped and the record is maximized. I'm also attaching it to help you
apply it. It was made for 2.4-dev, but I verified that it still applies
(with an offset) on 2.2. I'd be interested in knowing whether it
addresses your problem.

> I'm not sure what a stream represents in this context.

It's a bidirectional communication between a client and a server in the
context of a single HTTP transaction. You can see it as the application
level in haproxy, which deals with request/response forwarding between
the two sides and which executes in turn all the analysers and filters
corresponding to the rules in your config. You can have a look at
doc/internals/muxes.png for a quick overview (the stream is the entity
at the top which carries the request buffer).
Willy
Re: haproxy 2.4 opentracing using x-b3-traceid header
On 04/08/2021 01:48 PM, Andrea Bonini wrote:
> Hi everyone,
>
> i have a question about opentracing addon. Is there a way for using
> headers info about a trace created from an outside service?
>
> what i want is using x-b3-traceid header from an incoming request as
> traceid and use spanid as parentid
>
> Thanks

Hello Andrea,

you are interested in propagating the span context from one service to
another. This can be done by having the service that created the tracing
process set up additional HTTP headers that contain the span context,
which are then read by another process that wants to add its own data.
This is done via the ‘inject’ and ‘extract’ keywords in the opentracing
filter configuration; they automatically set up and read the contents of
the headers over which the span context is transmitted.

In the test directory there is one example that uses such a mode of
operation, where two haproxy processes are started and the first passes
the span context to the second. The example is run via the shell script
test/run-fe-be.sh, which starts the frontend haproxy process (fe) that
passes the span context to the backend haproxy process (be).
Configurations are located in the test/fe and test/be directories.

For more explanations, read the README file from the ot directory.

Best regards,

-- 
Zaga

What can change the nature of a man?
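[Editorial note: for illustration, the ‘inject’/‘extract’ pairing
described above might look roughly like the following in an OpenTracing
filter configuration file. This is a hypothetical sketch based only on
the keywords named in this message; the scope names and the context name
"ctx" are invented, and the exact syntax should be checked against the
README in the ot directory and the examples under test/fe and test/be.]

```
# fe process: create a span and inject its context into HTTP headers
ot-scope frontend_request
    span "HTTP request" root
    inject "ctx" use-headers

# be process: extract the context from the headers added by fe and
# attach a child span to it
ot-scope backend_request
    extract "ctx" use-headers
    span "HTTP request" child-of "ctx"
```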