Re: OCSP update restarts all proxies
On 04/10/2023 03:05, Shawn Heisey wrote:

On 10/3/23 01:33, Remi Tricot-Le Breton wrote:

This command relies on the same task that performs the automatic update. What it does is basically add the certificate at the top of the task's update list and wake it up. The update is asynchronous so we can't return a status to the CLI command. In order to check whether the update was successful you can display the contents of the updated OCSP response via the "show ssl ocsp-response" command. If the response you updated is also set to be updated automatically, you can also use the "show ssl ocsp-updates" command, which gives the update success and failure counts as well as the last update status for all the responses registered in the auto-update list.

I have no idea how to get an interactive session going on the stats socket so that I can see whatever response a command generates. The only command I know for the socket is for the old-style OCSP update where the OCSP response is obtained with openssl, converted to base64, and sent to the socket. No response comes back when using socat in this way.

You just have to run the following command:

$ echo "update ssl ocsp-response <certfile>" | socat /path_to_socket/haproxy.sock -

This command does not return anything (which is normal, because the update is performed asynchronously).
You could see some log lines emitted by your haproxy (depending on your log configuration), such as the following:

<134>Oct 4 13:28:49 haproxy[14955]: -:- [04/Oct/2023:13:28:49.910] /path_to_cert/server_ocsp.pem.rsa 1 "Update successful" 0 1

You can then either check the contents of your OCSP response:

$ echo "show ssl ocsp-response <certid>" | socat /path_to_socket/haproxy.sock -

or, if your tested OCSP response had the 'ocsp-update' option set to 'on', you can dump the contents of the update tree with the "show ssl ocsp-updates" command:

$ echo "show ssl ocsp-updates" | socat /tmp/haproxy.sock -
OCSP Certid | Path | Next Update | Last Update | Successes | Failures | Last Update Status | Last Update Status (str)
303b300906052b0e03021a050004148a83e0060faff709ca7e9b95522a2e81635fda0a0414f652b0e435d5ea923851508f0adbe92d85de007a02021015 | /path_to_cert/server_ocsp.pem.rsa | 04/Oct/2023:14:28:49 +0200 | 04/Oct/2023:13:28:49 +0200 | 1 | 0 | 1 | Update successful

The "Successes" and "Failures" counters should change when you call the "update ssl ocsp-response" CLI command.

Here is my old script for OCSP updates, which I stopped using once I learned how to set up haproxy to do it automatically: https://paste.elyograg.org/view/5e88c914

Rémi LB
Re: OCSP update restarts all proxies
On 30/09/2023 09:20, Shawn Heisey wrote:

On 9/28/23 02:29, Remi Tricot-Le Breton wrote:

That's really strange, the OCSP update mechanism does not have anything to do with proxies. Are you sure you did not have a crash and autorestart of your haproxy?

I did not think that I had autorestart for haproxy, but it turns out that the service file created by the systemd stuff in the source repo DOES have "Restart=always". After I changed that to never and did systemctl daemon-reload, I discovered that at the top of the hour, something caused systemd to reload the service. From systemctl status haproxy:

Sep 30 01:00:02 smeagol haproxy[234282]: [WARNING] (234282) : Proxy be_gitlab_8881 stopped (cumulated conns: FE: 0, BE: 0).
Sep 30 01:00:02 smeagol haproxy[234282]: [WARNING] (234282) : Proxy be_gitlab2_8881 stopped (cumulated conns: FE: 0, BE: 0).
Sep 30 01:00:02 smeagol haproxy[234282]: [WARNING] (234282) : Proxy be_artifactory_8082 stopped (cumulated conns: FE: 0, BE: 0).
Sep 30 01:00:02 smeagol haproxy[234282]: [WARNING] (234282) : Proxy be_zabbix_81 stopped (cumulated conns: FE: 0, BE: 0).
Sep 30 01:00:02 smeagol haproxy[234279]: [NOTICE] (234279) : New worker (236124) forked
Sep 30 01:00:02 smeagol haproxy[234279]: [NOTICE] (234279) : Loading success.
Sep 30 01:00:02 smeagol systemd[1]: Reloaded HAProxy Load Balancer.
Sep 30 01:00:02 smeagol haproxy[234279]: [NOTICE] (234279) : haproxy version is 2.8.3-0499db-3
Sep 30 01:00:02 smeagol haproxy[234279]: [NOTICE] (234279) : path to executable is /usr/local/sbin/haproxy
Sep 30 01:00:02 smeagol haproxy[234279]: [WARNING] (234279) : Former worker (234282) exited with code 0 (Exit)

There are no relevant systemd timers, nothing in user crontabs, nothing in the various cron.* directories that could cause this. I did compile haproxy with systemd support ... can haproxy itself ask systemd for a reload?
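To rule out an external trigger for the hourly reload, the usual suspects can be checked with a session like the following sketch (paths are the common defaults and may differ on your system):

```
# Any systemd timer firing at the top of the hour?
systemctl list-timers --all
# Anything in cron that mentions haproxy?
grep -ri haproxy /etc/cron* /var/spool/cron 2>/dev/null
# Who asked systemd to reload the unit?
journalctl -u haproxy.service --since today | grep -i reload
```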
A way to check for a possible crash in the OCSP update code would be to use the "update ssl ocsp-response <certfile>" command from the CLI as well. It would use most of the OCSP update code, so if a crash were to happen you might see it this way.

Can you explain to me how to do this and see any output? I tried piping the command to socat talking to the stats proxy socket, and got no response. I think I do not know how to use socat correctly for this.

This command relies on the same task that performs the automatic update. What it does is basically add the certificate at the top of the task's update list and wake it up. The update is asynchronous so we can't return a status to the CLI command. In order to check whether the update was successful you can display the contents of the updated OCSP response via the "show ssl ocsp-response" command. If the response you updated is also set to be updated automatically, you can also use the "show ssl ocsp-updates" command, which gives the update success and failure counts as well as the last update status for all the responses registered in the auto-update list.

Rémi LB
Re: OCSP update restarts all proxies
Hello,

On 28/09/2023 00:30, Shawn Heisey wrote:

The haproxy -vv output is at the end of this message.

I got the built-in OCSP updating mechanism working. Works beautifully. Today I discovered that once an hour when the OCSP gets updated, haproxy stops all its proxies and starts them back up. syslog:

That's really strange, the OCSP update mechanism does not have anything to do with proxies. Are you sure you did not have a crash and autorestart of your haproxy?

Sep 27 15:00:01 - haproxy[3520801] Proxy web80 stopped (cumulated conns: FE: 42, BE: 0).
Sep 27 15:00:01 - haproxy[3520801] Proxy web stopped (cumulated conns: FE: 1403, BE: 0).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_deny stopped (cumulated conns: FE: 0, BE: 122).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_raspi1_81 stopped (cumulated conns: FE: 0, BE: 0).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_raspi2_81 stopped (cumulated conns: FE: 0, BE: 0).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_raspi3_81 stopped (cumulated conns: FE: 0, BE: 0).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_smeagol_81 stopped (cumulated conns: FE: 0, BE: 700).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_plex_32400_tls stopped (cumulated conns: FE: 0, BE: 0).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_gitlab_8881 stopped (cumulated conns: FE: 0, BE: 235).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_gitlab2_8881 stopped (cumulated conns: FE: 0, BE: 180).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_artifactory_8082 stopped (cumulated conns: FE: 0, BE: 0).
Sep 27 15:00:01 - haproxy[3520801] Proxy be_zabbix_81 stopped (cumulated conns: FE: 0, BE: 969).
Sep 27 15:00:01 - haproxy[3545799] -:- [27/Sep/2023:15:00:01.668] /etc/ssl/certs/local/REDACTED_org.wildcards.combined.pem 1 "Update successful" 0 1
Sep 27 15:00:01 - haproxy[3545799] -:- [27/Sep/2023:15:00:01.795] /etc/ssl/certs/local/REDACTED2.com.wildcards.combined.pem 1 "Update successful" 0 1
Sep 27 15:00:01 - haproxy[3520801] -:- [27/Sep/2023:15:00:01.944] /etc/ssl/certs/local/REDACTED_org.wildcards.combined.pem 1 "Update successful" 0 2
Sep 27 15:00:02 - haproxy[3520801] -:- [27/Sep/2023:15:00:01.998] /etc/ssl/certs/local/REDACTED2.com.wildcards.combined.pem 1 "Update successful" 0 2

The really irritating effect is that once an hour, my Zabbix server records an event saying haproxy has been restarted: https://imgur.com/a/WPkKoFa (imgur will claim the image has mature content. It doesn't.)

It looks like the only thing that resets back to zero on the stats page is the uptime in the "status" column for each backend. That's good news, but I would hope for none of the data to be reset. I have one big concern, which may be unfounded: I'm worried that the proxies going down will mean that in-flight connections will be terminated. I'm guessing that the work for seamless reloads will ensure that doesn't happen, I just want to be sure.

Not knowing a lot about how haproxy is architected, I do not know if there is some reason that the backends have to be cycled. Seems like only frontends that listen with TLS would need that. I would hope it would be possible to even avoid that ... maybe have OCSP data be copied from a certain memory location every time a frontend needs it, and when OCSP gets updated, overwrite the data in that memory location in a thread-safe way. I know a fair amount about thread safety in Java, but nothing about it in C.

That's what's supposed to happen, no listener should have to be restarted :) That's why it looks more like a crash to me.

Final questions for today:

1) Can the OCSP update interval be changed?
I don't recall exactly what the validity for a LetsEncrypt OCSP response is, but I know it was at least 24 hours, and I think it might have even been as long as a week. I would like to increase the interval to 8-12 hours if I can.

The "tune.ssl.ocsp-update.maxdelay" and "tune.ssl.ocsp-update.mindelay" global options can change the default update interval. The idea behind the hourly update is that we can't just rely on the OCSP response's expiration time to determine when to perform an update, since it might be updated before its expiration date. We went for a 1 hour interval pretty arbitrarily, but the mentioned options allow you to adapt it to your use case. Just note that those options apply to all the OCSP updates; we do not have a per-certificate configuration mechanism yet.

2) There are two certs being used in my setup, and haproxy logs updates for both of them twice. I would have hoped for that to only happen once. I'm a bit mystified by the fact that it is done twice. I would have expected either one time or four times ... I have one frontend that listens with TLS, with four bind lines all using exactly the same certificate list. (one TCP, and three UDP)

Not sure why that's the case yet. You could use some CLI commands to check whether the OCSP update tree is built correctly (that should be done at
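For reference, a minimal sketch of the interval tuning mentioned above; the specific values are assumptions matching the 8-12 hour wish, and both options take a number of seconds in the global section:

```
global
    # Never refresh a given OCSP response more often than every 6 hours...
    tune.ssl.ocsp-update.mindelay 21600
    # ...and attempt a refresh at least every 12 hours.
    tune.ssl.ocsp-update.maxdelay 43200
```

Note that, as stated above, these apply globally to every response in the auto-update tree, not per certificate.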
Re: RE: RE: [PATCH] Added support for Arrays in sample_conv_json_query in sample.c
Hello,

On 25/09/2023 13:55, Jens Popp wrote:

Method now returns the content of Json Arrays, if it is specified in Json Path as String. The start and end character is a square bracket. Any complex object in the array is returned as Json, so that you might get Arrays of Array or objects. Only recommended for Arrays of simple types (e.g., String or int) which will be returned as CSV String. Also updated documentation and fixed issue with parenthesis
---
 doc/configuration.txt | 14 +++---
 src/sample.c          | 11 ++-
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index d49d359a2..785cbb77c 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -18356,10 +18356,18 @@ json([<input-code>])
      {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"}
 
 json_query(<json_path>,[<output_type>])
-  The json_query converter supports the JSON types string, boolean and
-  number. Floating point numbers will be returned as a string. By
+  The json_query converter supports the JSON types string, boolean, number
+  and array. Floating point numbers will be returned as a string. By
   specifying the output_type 'int' the value will be converted to an
-  Integer. If conversion is not possible the json_query converter fails.
+  Integer. Arrays will be returned as string, starting and ending with a
+  square brackets. The content is a CSV. Depending on the data type, the
+  array values might be quoted. If the array values are complex types,
+  the string contains the complete json representation of each value
+  separated by a comma. Example result for a roles query to a JWT:
+
+  ["manage-account","manage-account-links","view-profile"]
+
+  If conversion is not possible the json_query converter fails.
   <json_path> must be a valid JSON Path string as defined in
   https://datatracker.ietf.org/doc/draft-ietf-jsonpath-base/

diff --git a/src/sample.c b/src/sample.c
index 07c881dcf..2751ad3a0 100644
--- a/src/sample.c
+++ b/src/sample.c
@@ -4159,8 +4159,17 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo
 		return 1;
 	}
+	case MJSON_TOK_ARRAY: {
+		// We copy the complete array, including square brackets into the return buffer
+		// result looks like: ["manage-account","manage-account-links","view-profile"]
+		int len;
+		len = b_putblk(trash, token, token_size);
+		trash->data = len;
+		smp->data.u.str = *trash;
+		smp->data.type = SMP_T_STR;
+		return 1;
+	}
 	case MJSON_TOK_NULL:
-	case MJSON_TOK_ARRAY:
 	case MJSON_TOK_OBJECT:
 		/* We cannot handle these. */
 		return 0;
-- 
2.39.3

You don't need the intermediary 'len' variable (which is not of the same type as what b_putblk returns). You can simply replace it with trash->data directly. Apart from that, the patch looks good to me.

Rémi LB
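As a usage sketch of the new array support (the header layout, variable names, and role value are illustrative assumptions, not taken from the patch), the converter could be applied to a JWT payload like this:

```
# Decode the JWT payload (second dot-separated token of the Authorization
# header) and extract the "roles" array; the result is a string such as
# ["manage-account","manage-account-links","view-profile"]
http-request set-var(txn.roles) req.hdr(Authorization),word(2,.),ub64dec,json_query('$.roles')
http-request deny unless { var(txn.roles) -m sub manage-account }
```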
Re: [PATCH] Added support for Arrays in sample_conv_json_query in sample.c
Hello,

On 18/09/2023 10:27, Jens Popp wrote:

Method now returns the content of Json Arrays, if it is specified in Json Path as String. The start and end character is a square bracket. Any complex object in the array is returned as Json, so that you might get Arrays of Array or objects. Only recommended for Arrays of simple types (e.g., String or int) which will be returned as CSV String.
---
 src/sample.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/sample.c b/src/sample.c
index 07c881dcf..ecc4a961d 100644
--- a/src/sample.c
+++ b/src/sample.c
@@ -4159,8 +4159,16 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo
 		return 1;
 	}
-	case MJSON_TOK_NULL:
 	case MJSON_TOK_ARRAY:
+		// We copy the complete array, including square brackets into the return buffer
+		// result looks like: ["manage-account","manage-account-links","view-profile"]
+		strncpy( trash->area, token, token_size);
+		trash->data = token_size;
+		trash->size = token_size;
+		smp->data.u.str = *trash;
+		smp->data.type = SMP_T_STR;
+		return 1;
+	case MJSON_TOK_NULL:
 	case MJSON_TOK_OBJECT:
 		/* We cannot handle these. */
 		return 0;
-- 
2.39.3

You might want to use the b_putblk function instead of strncpy, since you don't manage the case where you don't have enough space in the trash buffer. The trash->size member should never be changed either, since it represents the buffer size, not the number of bytes written to it. Please update the "json_query" documentation as well.

Thanks
Rémi LB
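To illustrate the review comment, here is a minimal standalone sketch; `struct sbuf` and `sbuf_putblk` are simplified stand-ins invented for this example, not HAProxy's real `struct buffer`/`b_putblk` API:

```c
#include <string.h>

/* Simplified stand-in for HAProxy's trash buffer. */
struct sbuf {
	char *area;   /* storage */
	size_t size;  /* capacity, fixed at allocation time - never rewritten */
	size_t data;  /* number of bytes currently stored */
};

/* Bounded copy in the spirit of b_putblk(): unlike the strncpy() call in
 * the patch, it can never write past the end of the destination, it
 * records how many bytes were actually stored in the 'data' field, and
 * it leaves the capacity field untouched. Returns the copied length. */
size_t sbuf_putblk(struct sbuf *b, const char *blk, size_t len)
{
	if (len > b->size)
		len = b->size; /* truncate instead of overflowing */
	memcpy(b->area, blk, len);
	b->data = len;
	return len;
}
```

The key design point is that the buffer's `size` (capacity) and `data` (fill level) are distinct, which is exactly the distinction the original patch blurred by assigning `token_size` to both.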
Re: Old style OCSP not working anymore?
Hello,

On 21/07/2023 14:40, Remi Tricot-Le Breton wrote:

Hello,

On 21/07/2023 11:51, Jarno Huuskonen wrote:

Hi,

On Thu, 2023-07-20 at 20:27 +0200, Sander Klein wrote:

The best thing to do is to test with `openssl s_client -showcerts -connect some.hostname.nl:443` with both your versions to identify what changed.

I've tested with 'openssl s_client -showcerts -connect mydomain.com:443 -servername mydomain.com -status -tlsextdebug'

Does 2.8.1 send an ocsp response if you connect with the ipv4 address: openssl s_client -showcerts -connect ipaddress:443 ... (with or without -servername)

On 2.6.14 I get an OCSP response, on 2.8.1 I get: "OCSP response: no response sent". It really looks like HAProxy doesn't want to send the response coming from the file. Is there any more information I can gather?

I get the same result as Sander (2.6.x sends ocsp and 2.8.1 doesn't). I've ipv4 and ipv6 binds, and for an ipv4 connection haproxy (2.8.1) sends ocsp and for ipv6 it doesn't.

bind ipv4@:443 name v4ssl ssl crt-list /etc/haproxy/ssl/example.crtlist
bind ipv6@:::443 name v6ssl ssl crt-list /etc/haproxy/ssl/example.crtlist

(And example.crtlist: /etc/haproxy/ssl/somecertfile.pem.ecdsa [alpn h2,http/1.1] ) (and somecertfile.pem.ecdsa.ocsp in /etc/haproxy/ssl)

If I change the order of the ipv4 / ipv6 binds (so bind ipv6@:::443 name v6ssl... is first) then haproxy (2.8.1) sends ocsp with an ipv6 connection and not with ipv4.

I found the faulty commit for Jarno's issue ("cc346678d MEDIUM: ssl: Add ocsp_certid in ckch structure and discard ocsp buffer early"). Here's a patch that should fix it. If you want to try it with your setups, be my guests; otherwise it should be merged soon if William is OK with the patch. Sorry for the inconvenience, and thanks again for the reproducer, Jarno.
Rémi

From 52057496684b3803732a2e263d21f75d71964d3c Mon Sep 17 00:00:00 2001
From: Remi Tricot-Le Breton
Date: Fri, 21 Jul 2023 17:21:15 +0200
Subject: [PATCH] BUG/MINOR: ssl: OCSP callback only registered for first SSL_CTX

If multiple SSL_CTXs use the same certificate that has an OCSP response
file on the filesystem, only the first one will have the OCSP callback
set. This bug was introduced by "cc346678d MEDIUM: ssl: Add ocsp_certid
in ckch structure and discard ocsp buffer early" which cleared the
ocsp_response from the ckch_data after it was inserted in the tree,
which prevented subsequent contexts from having the callback registered.

This patch should be backported to 2.8.
---
 src/ssl_sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index 9f48483d9..42a76fbb1 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -1132,7 +1132,7 @@
 	/* In case of ocsp update mode set to 'on', this function might be
 	 * called with no known ocsp response. If no ocsp uri can be found in
 	 * the certificate, nothing needs to be done here. */
-	if (!data->ocsp_response) {
+	if (!data->ocsp_response && !data->ocsp_cid) {
 		if (data->ocsp_update_mode != SSL_SOCK_OCSP_UPDATE_ON || b_data(ocsp_uri) == 0) {
 			ret = 0;
 			goto out;
-- 
2.34.1
Re: Old style OCSP not working anymore?
Hello,

On 21/07/2023 11:51, Jarno Huuskonen wrote:

Hi,

On Thu, 2023-07-20 at 20:27 +0200, Sander Klein wrote:

The best thing to do is to test with `openssl s_client -showcerts -connect some.hostname.nl:443` with both your versions to identify what changed.

I've tested with 'openssl s_client -showcerts -connect mydomain.com:443 -servername mydomain.com -status -tlsextdebug'

Does 2.8.1 send an ocsp response if you connect with the ipv4 address: openssl s_client -showcerts -connect ipaddress:443 ... (with or without -servername)

On 2.6.14 I get an OCSP response, on 2.8.1 I get: "OCSP response: no response sent". It really looks like HAProxy doesn't want to send the response coming from the file. Is there any more information I can gather?

I get the same result as Sander (2.6.x sends ocsp and 2.8.1 doesn't). I've ipv4 and ipv6 binds, and for an ipv4 connection haproxy (2.8.1) sends ocsp and for ipv6 it doesn't.

bind ipv4@:443 name v4ssl ssl crt-list /etc/haproxy/ssl/example.crtlist
bind ipv6@:::443 name v6ssl ssl crt-list /etc/haproxy/ssl/example.crtlist

(And example.crtlist: /etc/haproxy/ssl/somecertfile.pem.ecdsa [alpn h2,http/1.1] ) (and somecertfile.pem.ecdsa.ocsp in /etc/haproxy/ssl)

If I change the order of the ipv4 / ipv6 binds (so bind ipv6@:::443 name v6ssl... is first) then haproxy (2.8.1) sends ocsp with an ipv6 connection and not with ipv4.

Thanks for the extra info, I seem to have a reproducer. I'll look into it, and hopefully fixing it will fix Sander's issue as well.

Rémi
Re: Weird issue with OCSP updating
On 11/07/2023 22:22, Shawn Heisey wrote:

On 7/11/23 01:30, Remi Tricot-Le Breton wrote:

The OCSP update mechanism uses the internal http_client which then uses the resolvers. The only time when I had some strange resolver-related issues is when the name resolution returned IPv6 addresses which were not properly managed on my machine. The workaround I had in this case was to add "httpclient.resolvers.prefer ipv4" in the global section.

That directive didn't work in "global" but it was accepted when I moved it to "defaults". But it didn't change the behavior. IPv6 is completely disabled on the server.

Didn't work as in an error was raised? I have a local configuration file with this option in the global section and it seems to work fine.

You could also try to perform the same kind of request using the http_client directly from the CLI.

Can you explain how to do this? When I make the request with curl, it works, but I don't know how to do what you are saying here.

You can use the "httpclient" CLI command this way:

echo "expert-mode on; httpclient GET http://r3.o.lencr.org/MFMwUTBPME0wSzAJBgUrDgMCGgUABBRI2smg%2ByvTLU%2Fw3mjS9We3NfmzxAQUFC6zF7dYVsuuUAlA5h%2BvnYsUwsYCEgOq9K0xVAXkgj8X4cNGeMutQw%3D%3D; | socat -

It probably won't work either, but at least you will have an easier way of testing the resolvers outside of the OCSP update. Apart from a specific configuration on your system, I don't really see a reason why your tests fail somewhere and work correctly on another machine.

Everything works on another server running a newer version of Ubuntu. That uses a newer version of gnu libc, which affects pretty much everything on the system, and a large number of other libraries are newer as well.

Rémi
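The socat invocation above appears truncated in the archive; a complete session might look like the following sketch, where the socket path is an assumption and the OCSP request URL is shortened for readability:

```
# /var/run/haproxy/admin.sock is a placeholder for your actual stats socket;
# "expert-mode on" is required before "httpclient" is accepted.
echo "expert-mode on; httpclient GET http://r3.o.lencr.org/<base64-ocsp-request>" | \
    socat /var/run/haproxy/admin.sock -
```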
Re: Weird issue with OCSP updating
On 10/07/2023 22:41, Shawn Heisey wrote:

On 7/8/23 21:33, Shawn Heisey wrote:

Here's the very weird part. It seems that haproxy is sending the OCSP request to localhost, not the http://r3.o.lencr.org URL that it SHOULD be sending it to. Right before the above log entry is this one:

Jul 8 21:15:38 - haproxy[4075] 127.0.0.1:57696 [08/Jul/2023:21:15:38.447] web80 web80/ 0/-1/-1/-1/0 302 230 - - LR-- 1/1/0/0/0 0/0 "GET /MFMwUTBPME0wSzAJBgUrDgMCGgUABBRI2smg%2ByvTLU%2Fw3mjS9We3NfmzxAQUFC6zF7dYVsuuUAlA5h%2BvnYsUwsYCEgOq9K0xVAXkgj8X4cNGeMutQw%3D%3D HTTP/1.1"

Anyone have any idea why haproxy is sending the ocsp request to 127.0.0.1 when it should be going to a public address obtained from the dns name r3.o.lencr.org? If I do this command on the same machine, it works correctly:

curl -v -o response.ocsp "http://r3.o.lencr.org/MFMwUTBPME0wSzAJBgUrDgMCGgUABBRI2smg%2ByvTLU%2Fw3mjS9We3NfmzxAQUFC6zF7dYVsuuUAlA5h%2BvnYsUwsYCEgOq9K0xVAXkgj8X4cNGeMutQw%3D%3D"

The OCSP update mechanism uses the internal http_client which then uses the resolvers. The only time when I had some strange resolver-related issues is when the name resolution returned IPv6 addresses which were not properly managed on my machine. The workaround I had in this case was to add "httpclient.resolvers.prefer ipv4" in the global section. You could also try to perform the same kind of request using the http_client directly from the CLI.

Rémi
Re: OCSP update mechanism startup
On 07/07/2023 18:24, Willy Tarreau wrote:

On Fri, Jul 07, 2023 at 03:42:58PM +, Tristan wrote:

Also personally I have never understood the point of default server certs... besides getting unwanted attention from censys/shodan/etc...

I remember some users who were hosting many applications from internal subsidiaries wanted to make sure that unhandled names were always served on the corporate site, at least to say something along the lines of "this site is no longer here" or just to direct users to the list of available sites. In the past it was common as well on hosting providers that any unhandled site was handled by a default page hosted by that provider. However it seems it's less common nowadays with SSL (though I could be wrong). Hmmm, just testing right now seems to indicate it's still the case for some:

$ openssl s_client -servername blah.com -connect www.cloudflare.com:443 </dev/null 2>/dev/null | grep -m1 CN
 0 s:CN = ssl383624.cloudflaressl.com
$ openssl s_client -servername blah.com -connect www.fastly.com:443 </dev/null 2>/dev/null | grep -m1 CN
 0 s:CN = www.fastly.com

So it seems to indicate that such infrastructures which are designed to host hundreds of thousands of domains still prefer to be able to deliver something if the user is willing to click on "accept the risk".

Hmm, I understand why it was decided to go that route, and indeed that is probably not too too hard to do... Though I can't help wondering if an opt-in mechanism similar to the server state file would make sense for it?

I seem to remember that it's possible to issue a "show ocsp" on the CLI and direct it to a file, but I could be wrong. That's the same way the server-state works by the way.

I'll take a look at it, though the main advantage would be to keep it all in HAProxy and to avoid dirty cron jobs. But I appreciate that's me asking for HAProxy to do everything and more beyond its job... :)

For the server state, it was performed at reload time, no need for a cron job. I don't see why it would have to be different here.
It might be a bit more complicated than for the server state file, since we could have OCSP responses that don't correspond to any file on the filesystem. We can't just assume that the foo.pem.ocsp file does not exist and create a file at this location, since it might exist but not have been loaded by haproxy (through an ssl-load-extra-files none, for instance). And updating only OCSP responses that were loaded at build time and for which we have an actual path would not be that satisfying. It would still require an extra step for new responses. We could add an extra global parameter that defines a directory in which those responses are written, but reusing them for a new haproxy process would require that those responses be moved alongside their corresponding certificate, which could be a pain as well. But I fully agree that it would help to be able to perform this operation from haproxy directly. I just don't see a simple solution yet.

Rémi
Re: OCSP update mechanism startup
Hello Tristan,

On 06/07/2023 13:24, Tristan wrote:

Hello,

I'm trying to make use of the new ocsp-update mechanism, and finding no success (yet). I've migrated my crt bind arguments to a crt-list argument (+ relevant file) and that loads in and gets used fine, but despite having "ocsp-update on" on the lines, I'm not seeing any ocsp update being triggered (and naturally there are no ocsp responses being returned).

Relevant outputs:

$ echo "show ssl crt-list" | socat - /var/run/haproxy/admin.sock
/etc/haproxy/ssl/crtlist-mdex-public-direct.conf
...
$ echo "show ssl crt-list /etc/haproxy/ssl/crtlist-mdex-public-direct.conf" | socat - /var/run/haproxy/admin.sock
/etc/haproxy/ssl/mangadex.dev.pem ocsp-update on mangadex.dev *.mangadex.dev

The ocsp-update option should be between brackets:

/etc/haproxy/ssl/mangadex.dev.pem [ocsp-update on] mangadex.dev *.mangadex.dev

$ echo "show ssl ocsp-updates" | socat - /var/run/haproxy/admin.sock
OCSP Certid | Path | Next Update | Last Update | Successes | Failures | Last Update Status | Last Update Status (str)

One potential hint I find is when using "update ssl ocsp-response <certfile>":

$ echo "update ssl ocsp-response /etc/haproxy/ssl/mangadex.dev.pem" | socat - /var/run/haproxy/admin.sock
'update ssl ocsp-response' only works on certificates that already have a known OCSP response. Can't send ocsp request for /etc/haproxy/ssl/mangadex.dev.pem!

And indeed, the (cli command) docs mention:

> This command will only work for certificates that already had a stored OCSP response, either because it was provided during init or if it was previously set through the "set ssl cert" or "set ssl ocsp-response" commands

So then:

- does the ocsp-update feature also require an initial OCSP response to start at all, like the cli command?

No, it will try to get the missing OCSP responses right after init.

- if so, why? (and we should get that documented on the ocsp-update flag)
- does the OCSP update mechanism update the files on-disk?
No, we never write anything on disk.

- if not, what happens if upon restart/reload of HAProxy the .ocsp file is outdated? will HAProxy aggressively try to get an up-to-date response before starting its listeners, or will I be risking ssl issues by enabling OCSP must-staple with it?

The OCSP update mechanism will not block anything, it runs alongside all the "regular" HAProxy tasks. If I remember correctly, you cannot load outdated OCSP responses, so you should not face this particular problem. But if you have many certificates for which OCSP update was activated and no OCSP response was provided, fetching all the missing responses will indeed take some time and OCSP stapling will temporarily fail for the given server certificates.

Please note that 'show ssl ocsp-response' can now dump an OCSP response in base64 so that you can write the updated OCSP responses to your filesystem by yourself.

(Probably not super relevant, but this is with HAProxy 2.8.1 and QuicTLS 1.1.1t)

Thanks in advance,
Tristan

Rémi
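The base64 dump mentioned above could be used to persist an updated response before a reload, along these lines; the socket path, certificate path, and exact "base64" keyword placement are assumptions to be checked against your HAProxy version's CLI documentation:

```
# List the loaded responses, then dump one in base64 and decode it to the
# .ocsp file next to its certificate so the next process can load it.
echo "show ssl ocsp-response" | socat /var/run/haproxy/admin.sock -
echo "show ssl ocsp-response base64 <certid>" | socat /var/run/haproxy/admin.sock - \
    | base64 -d > /etc/haproxy/ssl/mangadex.dev.pem.ocsp
```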
Re: log settings have no effect in case of SSL handshake problems
Hello,

On 05/04/2023 09:39, Scharfenberg, Carsten wrote:

Hello,

I'm trying to set up haproxy RFC5424 logging to localhost and forwarding to a central log aggregator with rsyslog. Although this setup sounds quite straightforward and common to me, it's really hard to set up due to weak documentation of both haproxy and rsyslog in this context, and a lack of examples. Nevertheless I've succeeded after some hours of trial and error... Only my settings do not work in case of SSL handshake problems. In this case I still get standard log messages from haproxy. Is it possible to set up RFC5424 also for this case?

These are my settings:

global
    log localhost:1514 format rfc5424 local0
    log-send-hostname
    […]

defaults
    log global
    log-format-sd %{+E}o[my_sdid@12345\ client_ip=\"%ci\"\ client_port=\"%cp\"\ haproxy_frontend=\"%ft\"\ haproxy_backend=\"%b\"\ haproxy_server=\"%s\"\ haproxy_time_receive=\"%TR\"\ haproxy_time_queue=\"%Tc\"\ haproxy_time_response=\"%Tr\"\ haproxy_time_total=\"%Ta\"\ http_status_code=\"%ST\"\ bytes_read=\"%B\"\ haproxy_termination_state=\"%ts\"\ haproxy_total_connections=\"%ac\"\ haproxy_frontend_connections=\"%fc\"\ haproxy_backend_connections=\"%bc\"\ haproxy_server_connections=\"%sc\"\ haproxy_server_retries=\"%rc\"\ haproxy_server_queue=\"%sq\"\ haproxy_backend_queue=\"%bq\"\ http_request_headers=\"%hr\"\ http_response_headers=\"%hs\"\ http_request_method=\"%HM\"\ http_version=\"%HV\"\ http_request_path=\"%HPO\"\ http_request_query=\"%HQ\"]
    option httplog
    […]

frontend my_frontend
    mode http
    bind 1.2.3.4:443 ssl […]
    […]

backend my_backend
    […]

A "normal" log message looks like this:

<134>1 2023-04-05T09:00:14.893116+02:00 my_host haproxy 94107 - [my_sdid@12345 client_ip="4.3.2.1" client_port="65344" haproxy_frontend="my_frontend~" haproxy_backend="my_backend" haproxy_server="my_server01" haproxy_time_receive="0" haproxy_time_queue="1" haproxy_time_response="4" haproxy_time_total="5" http_status_code="200" bytes_read="168" haproxy_termination_state="--" haproxy_total_connections="1" haproxy_frontend_connections="1" haproxy_backend_connections="0" haproxy_server_connections="0" haproxy_server_retries="0" haproxy_server_queue="0" haproxy_backend_queue="0" http_request_headers="{my_user_agent}" http_response_headers="" http_request_method="GET" http_version="HTTP/1.1" http_request_path="/path" http_request_query="?query=foo"] 4.3.2.1:65344 [05/Apr/2023:09:00:14.887] my_frontend~ my_backend/my_server01 0/0/1/4/5 200 168 - - 1/1/0/0/0 0/0 {my_user_agent} "GET /path?query=foo HTTP/1.1"

In case the SSL handshake fails (e.g. because of a simple TCP connection check):

<134>1 2023-04-05T09:00:14.047002+02:00 my_host haproxy 94107 - - 4.3.2.1:65341 [05/Apr/2023:09:00:13.996] my_frontend/1: Connection closed during SSL handshake

You might want to have a look at the "error-log-format" option, which allows you to define a dedicated log format in case of SSL handshake errors (among others). The log line you have corresponds precisely to the legacy error log format that is used when no specific 'error-log-format' is defined.

Regards
Rémi Le Breton
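A minimal sketch of the suggested option; the exact format string here is an illustrative assumption (only connection-level fields are meaningful, since the handshake failed before any HTTP exchange):

```
defaults
    # Dedicated format for connection/SSL handshake errors; without this,
    # the legacy error line shown above is emitted instead.
    error-log-format "%ci:%cp [%tr] %ft %[fc_err_str]/%[ssl_fc_err_str]"
```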
Re: [PATCH] BUG/MINOR: ssl: Stop leaking `err` in ssl_sock_load_ocsp()
On 27/03/2023 15:31, Tim Düsterhus wrote: Hi On 3/19/23 16:07, Tim Duesterhus wrote: Previously performing a config check of `.github/h2spec.config` would report a 20 byte leak as reported in GitHub Issue #2082. The leak was introduced in a6c0a59e9af65180c3ff591b91855bea8d19b352, which is dev only. No backport needed. I believe you might've missed this patch. Best regards Tim Düsterhus Hi Tim, Sorry about that delay. The patch looks good to me. I'll let William merge it when he has the time. Rémi
Re: OpenSSL 1.1.1 vs 3.0 client cert verify "x509_strict" issues
Hello,

On 12/12/2022 16:45, Froehlich, Dominik wrote:

Hello HAproxy community! We’ve recently updated from OpenSSL 1.1.1 to OpenSSL 3.0 for our HAproxy deployment. We are now seeing some client certificates getting denied with these error messages: “SSL client CA chain cannot be verified” / “error:0A86:SSL routines::certificate verify failed” 30/0A86. We found out that for this CA certificate, the error was X509_V_ERR_MISSING_SUBJECT_KEY_IDENTIFIER. This error is only thrown if we run openssl verify with the “-x509_strict” option. The same call (even with the “-x509_strict” option) on OpenSSL 1.1.1 returned OK and verified.

Indeed, OpenSSL extended what the x509_strict option actually does in order to follow the requirements described in RFC 5280. OpenSSL's commit 0e071fbce4 gives a detailed list of the extra checks performed when x509_strict is set.

As this was a bit surprising to us and we now have a customer who can’t use their client certificate anymore, we wanted to ask for some details on the OpenSSL verify check in HAproxy:

* How does HAproxy call the “verify” command in OpenSSL?

Actual certificate and certificate chain verification is performed inside OpenSSL so any default behavior change in OpenSSL itself might have an impact on which certificate we reject or not.

* Does HAproxy use the “x509_strict” option programmatically?
* Is there a flag in HAproxy that would allow us to temporarily disable the “strict” setting so that the customer has time to update their PKI?

I did not try to reproduce the problem you encountered yet but you might have success with a proper crt-ignore-err and ca-ignore-err combination (on HAProxy's side). It does not disable strict checking per se but it could allow you to accept certificates that were otherwise rejected.

* If there is no flag, we could temporarily patch out the code that uses the flag, can you give us some pointers?

Thanks a lot for your help! Dominik Froehlich, SAP

Hope this helps.
Rémi LB
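A hedged illustration of the crt-ignore-err/ca-ignore-err workaround mentioned above — the paths are placeholders, and 'all' is the bluntest possible setting; depending on the HAProxy version a comma-separated list of verify error codes can be given instead of 'all' so every other check stays enforced:

```
frontend my_frontend
    # accept client certificates even when the CA chain fails the new
    # strict checks; prefer listing specific X509 verify error codes
    # over 'all' once the offending error is identified
    bind :443 ssl crt /path/to/server.pem verify required ca-file /path/to/ca.pem ca-ignore-err all crt-ignore-err all
```

This buys time for the customer to fix their PKI without patching HAProxy, at the cost of weakened verification on that bind line.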
Re: [EXTERNAL] Re: [PATCH] get BoringSSL back to the game
On 02/02/2022 17:49, William Lallemand wrote: Subject: [PATCH 2/7] BUILD: SSL: define X509_OBJECT for BoringSSL X509_OBJECT is opaque in BoringSSL, since we still use it, let us move it to openssl-compat.h from https://boringssl.googlesource.com/boringssl/+/refs/heads/2924/include/openssl/x509_vfy.h#120 I'm not really fond of this kind of declaration, most of the time we added helpers that were available in recent versions of OpenSSL in this file. But in this case, adding a whole structure that was removed, with no guarantee that this will continue to work, is not a good idea. From what I get they aligned the opaque structures with the OpenSSL API, so we will probably have the same problem with OpenSSL v3 without the obsolete API. And we are currently in the process of porting it to HAProxy. We probably need to change the code that uses X509_OBJECT. So I suppose it will start to work during this portage. X509_OBJECT and the APIs working on this structure were not marked as deprecated in OpenSSLv3, we are facing yet another place where BoringSSL seems a bit excessive in what they want to keep hidden. Managing BoringSSL would still be much more expensive than managing OpenSSLv3 if this kind of problem happens on many structures. Rémi
Re: [EXTERNAL] Re: [PATCH 1/2] MINOR: jwt: Make invalid static JWT algorithms an error in `jwt_verify` converter
Hello, On 05/11/2021 08:48, Willy Tarreau wrote: Hi Tim, On Thu, Nov 04, 2021 at 07:12:04PM +0100, Tim Düsterhus wrote: Your patch is already merged and the bug is fixed. However I'd like to comment on the reasons behind why I refactored the whole function to use the ist API: I *strongly* dislike code that just works because of some implicit assumptions that might or might not hold after future changes. This is C, every messed up boundary check can easily result in some major issues. In this specific case the implicit assumption is that all supported algorithms are exactly 5 characters long. This is true with the current implementation in HAProxy which just supports the HS, RS, ES, and PS families. However the JOSE specification [1] also specifies other values for the 'alg' field (though most of them for encrypted instead of just signed tokens) and they have differing lengths. If someone in the future wants to add support for a 6 character algorithm, then they need to remember to add the length check to not run into the same strncmp() issue like you did. My experience is that if a mistake is made once, it is going to be made again. The ist API already contains a large number of functions for *safe* handling of strings that are not NUL terminated. It is very hard to misuse them, because they all include the appropriate checks. If you see isteq(dynamic, ist("static")) then it is very clear that this will match the string "static" exactly. For strncmp you will need to perform the length check yourself. You also need to remember to manually subtract 1 for the trailing NUL when using sizeof, whereas this is done automatically by the ist() function. While I am not an expert in optimization and getting the last percent of performance out of a function, I don't think the ist API is going to be measurably slower than strncmp. It is used everywhere within HTX after all! For the JWT the OpenSSL crypto is going to take up the majority of the time anyway.
We are definitely not yet at a point where performance is the biggest issue and I did not code it with this as my main focus. I simply use standard APIs because I know them better than ist. Hope this helps to clarify why I'm strongly pushing the use of the ist API and why I've contributed a fair number of additional functions in the past when I encountered situations where the current functions were not ergonomic to use. One example is the istsplit() function which I added for uri_normalizer.c. It makes tokenization of strings very easy and safe. I get why you like this API, it is indeed there to avoid silly mistakes but in this case replacing all the standard calls by ist ones to fix a simple bug (fixed by a single line patch) seemed a bit overkill. I overall generally agree with your arguments above as I created ist exactly to address such trouble that always ends up with someone spending 3 days on a bug in someone else's code just to discover a logic mistake such as a string function called on a string that's not always nul-terminated, or reverse scanning that can occasionally continue before the beginning of the string etc. However, there is an important aspect to keep in mind, which is that Rémi is the maintainer of this part and he needs to use what he feels at ease with. I can easily understand that he's not much familiar with the usage of ist. That's exactly it. I'm not refusing to use this API, I just don't know it enough yet to use it spontaneously (especially when I don't interact with other parts of the code which might encourage me to use internal types). While the functions are correctly documented and generally useful, their coverage is not always complete (and you had to add new ones yourself a few times), and we don't have a simple index of what's available with a one-liner description, which doesn't help to get familiar with them. Massive +1 for this one.
Scrolling through the entire ist.h file to look for the function that answers to your need (if there is even one) is quite painful. In addition there are places where the input text is still made of nul terminated strings, that require an extra conversion that can also be error-prone, or just would add some code whose benefit is not immediately perceptible. I, too, hope that over time we make most of the code that deals with strings coming from user data evolve towards a more generalized use of ist for the reasons you mention. Safer, easier, friendlier API when you're in a code area that already makes use of them. I will just not force anyone to do that, though I can encourage by showing how to use it. However you can be sure that I'll complain very loudly if I have to debug something that was caused by unsafe code that could have been avoided this way. The performance aspect is indeed something that ought not be neglected in two domains: - CPU: the functions are made to be inlined, and as their end is known, usually this results in trivial operations that can be
[PATCH] BUG/MINOR: jwt: Fix jwt_parse_alg incorrectly returning JWS_ALG_NONE
jwt_parse_alg would mistakenly return JWS_ALG_NONE for algorithms "", "n", "no" and "non" because of a strncmp misuse. It now sees them as unknown algorithms.

No backport needed.
---
 src/jwt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/jwt.c b/src/jwt.c
index 94bfa5adb..8c4537542 100644
--- a/src/jwt.c
+++ b/src/jwt.c
@@ -34,7 +34,7 @@ enum jwt_alg jwt_parse_alg(const char *alg_str, unsigned int alg_len)
 	/* Algorithms are all 5 characters long apart from "none". */
 	if (alg_len < sizeof("HS256")-1) {
-		if (strncmp("none", alg_str, alg_len) == 0)
+		if (alg_len == sizeof("none")-1 && strcmp("none", alg_str) == 0)
 			alg = JWS_ALG_NONE;
 		return alg;
 	}
--
2.32.0
Re: [EXTERNAL] Re: [PATCH 1/2] MINOR: jwt: Make invalid static JWT algorithms an error in `jwt_verify` converter
Hello, On 02/11/2021 16:50, Willy Tarreau wrote: Tim, On Fri, Oct 29, 2021 at 06:06:55PM +0200, Tim Duesterhus wrote: It is not useful to start a configuration where an invalid static string is provided as the JWT algorithm. Better make the administrator aware of the suspected typo by failing to start. I'm hopeful that I can finally emit dev12 this evening as we think we've just nailed down the resolvers bugs. I'll wait for Rémi's ack for these ones as I'm really clueless about that area for now, but that will likely get merged in the next one as that seems to look fine at first glance. Regarding the question about "unlikely()" for the 2nd patch, it wouldn't change anything given that most of the cost is already spent in the comparisons, the assignment is totally benign. The first patch is ok, it even fixes a mistake I made in the error message, which did not tell which algorithm was wrong. As for the second one, it would have been easier to simply add a string length comparison before the strcmp in my opinion. We would have had a one line fix instead of a full conversion of strXXX calls into ist equivalents (most of which worked fine thanks to the constant algorithm string length). Rémi
Re: [EXTERNAL] Re: [PATCH] CLEANUP: http_fetch: Use ist helpers in smp_fetch_http_auth_bearer()
Hello Tim, On 29/10/2021 16:57, Tim Düsterhus wrote: Willy, On 10/29/21 8:50 AM, Willy Tarreau wrote: I don't see how this can ever match: - we search for a space in the first characters starting at - if we find one such space, we check if these characters are exactly equal to the string "Bearer" (modulo the case), and if so we take the value. => by definition this cannot match since there is no space in "Bearer" It made me doubt about strncasecmp() to the point that I had to write some code to verify I wasn't mistaken about how it worked. Rémi, am I missing something or is it just that this code snippet indeed has a bug that was not spotted by the regtests (which I'm fine with, they're regression tests, not unit tests seeking 100% coverage) ? You are not missing anything. This is indeed broken, because no reg-test covers calling the `http_auth_bearer` fetch with a string parameter. Thus the tests always go into the 'else' part where `get_http_auth()` handles the parsing. The correct fix would then of course be that the parsing logic is moved into a dedicated function (ideally based on the ist API) and not copied into several different functions. I retract my patch. Remi: Shall I file an issue to track the duplicated parsing logic or will you handle this based on this ML thread? Willy and I talked about this together but I forgot to add the list when I first replied to him. Some patches are waiting for a review already, including yours. Rémi
Re: Proposal about new default SSL log format
Hello, On 06/07/2021 14:55, Willy Tarreau wrote: On Tue, Jul 06, 2021 at 12:19:56PM +0200, Tim Düsterhus wrote: Willy, On 7/6/21 12:12 PM, Willy Tarreau wrote: A few points first, that are needed to address various concerns. The goal here is to define an HTTPS log format because that's what the vast majority of users are dealing with day-to-day. For specific usages, everyone already redefines log formats. But for a very basic setup, right now it's true that httplog is a bit limited, and all it shows to help guess there was SSL involved is to append this '~' after the frontend's name. It was not clear to me that this log format is meant to apply to HTTPS requests, because the example given by Remi includes neither the HTTP verb nor the request URI (unless I need glasses). I thought it was a format for TLS errors or something like this. At least it was my understanding that it was mostly aimed at HTTPS given that the discussion we had on the subject started around this :-) Is this a mistake in the examples? Or is HAProxy going to emit multiple log lines: One for the TLS connection and one for each HTTP request passing through this TLS connection? There was a discussion around logging at various layers, I don't remember if it was on the list or in a github issue, but I remember having been involved in this some time ago. I think that it makes sense to ultimately be able to emit error logs at various levels (i.e. when being certain that no other event will happen) but it's particularly confusing to log successes at various levels. Thus typically seeing TLS handshake failures is important but their success should not be logged as all the info will instead be available with the traffic logs.
There are already enough users complaining about noise caused by client-aborted connections :-) Willy

I was indeed more focused on logging SSL-related information only, with the idea that an SSL log line could be displayed after a completed handshake, hence the lack of upper-layer information in the log line. But it would indeed bring a new level of complexity to the logs because correlating the SSL and HTTP log lines of a specific session might be a pain. After talking with Willy, an HTTPS log was deemed more useful. It would only be an extension of the existing HTTP log with SSL-specific information added. This log format would concern every log line raised by the frontends using it (one line per HTTP response of the SSL session).

A quick recap of the topics raised by the multiple conversations in this thread:
- we won't use tabs as separators, in order to remain consistent with the existing log formats (especially since this new format will only be an extension of the HTTP one)
- the date format won't change right now, it is a whole new subject
- the logged lines will all have the same "date+process_name+pid" prefix as the already existing logs
- the HTTP log format won't be touched yet, changing it would be a whole new subject as well

The log would then be as follows:

>>> Jul 1 18:11:31 haproxy[143338]: 127.0.0.1:37740 [01/Jul/2021:18:11:31.517] \
    ssl_frontend~ ssl_frontend/s2 0/0/0/7/+7 200 2750 - - \
    1/1/1/1/0 0/0 {1wt.eu} {} "GET /index.html HTTP/1.1" \
    0/0/0/0 TLSv1.3/TLS_AES_256_GCM_SHA384

  Field  Format                                                        Extract from the example above
      1  process_name '[' pid ']:'                                     haproxy[143338]:
      2  client_ip ':' client_port                                     127.0.0.1:37740
      3  '[' request_date ']'                                          [01/Jul/2021:18:11:31.517]
      4  frontend_name                                                 ssl_frontend~
      5  backend_name '/' server_name                                  ssl_frontend/s2
      6  TR '/' Tw '/' Tc '/' Tr '/' Ta                                0/0/0/7/+7
      7  status_code                                                   200
      8  bytes_read                                                    2750
      9  captured_request_cookie                                       -
     10  captured_response_cookie                                      -
     11  termination_state
     12  actconn '/' feconn '/' beconn '/' srv_conn '/' retries        1/1/1/1/0
     13  srv_queue '/' backend_queue                                   0/0
     14  '{' captured_request_headers '}'                              {haproxy.1wt.eu}
     15  '{' captured_response_headers '}'                             {}
     16  '"' http_request '"'                                          "GET /index.html HTTP/1.1"
     17  conn_status '/' SSL fe hsk error '/' SSL vfy '/' SSL CA vfy   0/0/0/0
     18  ssl_version '/' ssl_ciphers                                   TLSv1.3/TLS_AES_256_GCM_SHA384

and the corresponding log-format:

    %ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC \
    %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r \
    %[conn_err_code]/%[ssl_fc_hsk_err]/%[ssl_c_err]/%[ssl_c_ca_err] \
    %sslv/%sslc

Thanks
Rémi
Re: Proposal about new default SSL log format
Hello Aleksandar,

On 03/07/2021 13:19, Aleksandar Lazic wrote:

Hi Remi. How about combining ssl_version/ssl_ciphers in one line?

Yes why not.

It would be helpful to see also the backend status. Maybe add a 14th and 15th line with the following fields:

backend_name '/' conn_status '/' SSL hsk error '/' SSL vfy '/' SSL CA vfy
backend_name '/' ssl_version '/' ssl_ciphers

The backend name is already included in the log line, I'll avoid duplicating information.

I had in the past several issues with the backend where the backend CA wasn't in the CA file, which was quite difficult to debug.

Backend-related errors won't be raised in the log line since it is emitted on the frontend side of the connection, but this backend error problem will also be treated after this.

+1 to the suggestion from Илья Шипицин to use iso8601 which is already in haproxy since 2019/10/01:2.1-dev2. I haven't found a sub-second format parameter in the strftime call, therefore I assume the strftime call has this ".00" as a fixed value.

```
strftime(iso_time_str, sizeof(iso_time_str), "%Y-%m-%dT%H:%M:%S.00+00:00", )
```

In the timeofday_as_iso_us function where the strftime call happens, the time zone and microsecond precision are set in the time string just after the strftime call (see the 'offset' variable and the utoa_pad call).

Maybe another option is to use TAI for timestamps.
https://en.wikipedia.org/wiki/International_Atomic_Time
https://cr.yp.to/proto/utctai.html
http://www.madore.org/~david/computers/unix-leap-seconds.html

Thanks
Rémi

Jm2c
Alex

Thanks
Rémi
Re: Proposal about new default SSL log format
Hello, On 02/07/2021 16:52, Илья Шипицин wrote: I worked with log formats a lot, couple of thoughts 1) tab separated is better for any log import tool (mixing spaces and "/" is terrible for import) I don't have any problems with that apart from inconsistency with the other default formats. If switching to tabs for this format only does not bother anyone I'll do it. 2) time should be iso8601 For now the "format" field of the "log" option only applies to the log line header (containing the timestamp and process_name/pid). It would be nice if it also applied to all time-related fields of the log format. I'll try to see if it is easily feasible. Thanks Rémi
Re: Proposal about new default SSL log format
Hello, On 02/07/2021 16:56, Илья Шипицин wrote: also, "process name" is something that is prior knowledge. no need to log it every time (for millions of requests) This process name part does not seem to come from the log format line, it is never mentioned in the HTTP log-format string. If it is a prefix appended to all the haproxy logs it will also be added to SSL ones, regardless of the new log format. Rémi
Re: Proposal about new default SSL log format
Hello Tim, On 02/07/2021 16:34, Tim Düsterhus wrote: Remi, On 7/2/21 4:26 PM, Remi Tricot-Le Breton wrote: But if anybody sees a missing information that could be beneficial for everybody, feel free to tell it, nothing is set in stone yet. […] Feel free to suggest any missing data, which could come from log-format specific fields or already existing sample fetches. Not sure whether the HTTP format will also be updated then, but: Put %ID somewhere by default, please. I won't touch the HTTP format yet because it would bring too much trouble for now since some people might already have dedicated parsers they can't change whenever they want but I can easily add this ID field to the SSL format. Thanks Rémi
Proposal about new default SSL log format
Hello list,

Some work is ongoing to ease connection error and SSL handshake error logging. This will rely on some new sample fetches that could be added to a custom log-format string. In order to ease SSL logging and debugging, we will also add a new default log format for SSL connections. Now is the right time to find the best format for everyone. The proposed format looks like the HTTP one, to which the SSL-specific information is added. But if anybody sees a missing piece of information that could be beneficial for everybody, feel free to mention it, nothing is set in stone yet. The format would look like this:

>>> Jul 1 18:11:31 haproxy[143338]: 127.0.0.1:37740 [01/Jul/2021:18:11:31.517] \
    ssl_frontend~ ssl_frontend/s2 0/0/0/7/+7 \
    0/0/0/0 2750 1/1/1/1/0 0/0 TLSv1.3 TLS_AES_256_GCM_SHA384

  Field  Format                                                        Extract from the example above
      1  process_name '[' pid ']:'                                     haproxy[143338]:
      2  client_ip ':' client_port                                     127.0.0.1:37740
      3  '[' request_date ']'                                          [01/Jul/2021:18:11:31.517]
      4  frontend_name                                                 ssl_frontend~
      5  backend_name '/' server_name                                  ssl_frontend/s2
      6  TR '/' Tw '/' Tc '/' Tr '/' Ta                                0/0/0/7/+7
      7  conn_status '/' SSL hsk error '/' SSL vfy '/' SSL CA vfy      0/0/0/0
      8  bytes_read                                                    2750
      9  termination_state
     10  actconn '/' feconn '/' beconn '/' srv_conn '/' retries        1/1/1/1/0
     11  srv_queue '/' backend_queue                                   0/0
     12  ssl_version                                                   TLSv1.3
     13  ssl_ciphers                                                   TLS_AES_256_GCM_SHA384

The equivalent log-format string would be the following:

    "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta \
    %[conn_err_code]/%[ssl_fc_hsk_err]/%[ssl_c_err]/%[ssl_c_ca_err] \
    %B %ts %ac/%fc/%bc/%sc/%rc %sq/%bq %sslv %sslc"

The SSL-specific fields (7, 12 and 13) and the status ones will come from not-yet-submitted code, so the names and format might slightly change. Feel free to suggest any missing data, which could come from log-format specific fields or already existing sample fetches.

Thanks
Rémi
Re: [PATCH] BUG/MINOR: cache: Correctly handle existing-but-empty 'accept-encoding' header
Hello Tim, On 18/06/2021 15:26, Tim Düsterhus, WoltLab GmbH wrote: Remi, please find a small fix for the 'Vary' support of the cache to correctly handle the difference between a missing 'accept-encoding' and an empty 'accept-encoding'. Best regards Tim Düsterhus Developer WoltLab GmbH Nice catch, I completely forgot about this case. @Willy, the patch looks good to me. Thanks Rémi
Re: Upgrading from 1.8 to 2.4, getting warning I can't figure out
Hello, On 07/06/2021 01:23, Shawn Heisey wrote: On 6/5/2021 10:47 PM, Shawn Heisey wrote: On 6/5/2021 9:30 PM, Shawn Heisey wrote: [WARNING] (81457) : Loading: OCSP response status not successful. Content will be ignored. This error message happens when a call to OpenSSL's OCSP_response_status function on your response returns anything other than OCSP_RESPONSE_STATUS_SUCCESSFUL which means that we won't be able to process your response. Another self-followup: Apparently that warning also happens with 1.8.22 ... I was unaware of this, as I haven't checked the config file manually for a very long time. root@smeagol:/etc/haproxy# haproxy -c -f /etc/haproxy/haproxy.cfg [WARNING] 156/172157 (328956) : Loading '/etc/ssl/certs/local/mainwildcards.pem.ocsp': OCSP response status not successful. Content will be ignored. Configuration file is valid The .ocsp file DOES contain a valid OCSP response. So ... I think I'm probably good to proceed with the upgrade. I know that on an older version of 1.8, no idea which one, this warning did not happen. Can this thread serve as a possible bug report? OCSP stapling won't work on any version that shows this warning (for this specific response). But apart from that, everything else should work fine, that's why you only get a warning when parsing the configuration file. If you are positive that your OCSP response is valid we may indeed have a bug on our side so you could open an issue on GitHub (https://github.com/haproxy/haproxy/issues). If we were to track a bug through the ML there is a high chance of it being lost pretty quickly. Thanks, Shawn Rémi
Re: [PATCH 1/3] MINOR: cache: Remove the `hash` part of the accept-encoding secondary key
Hello Tim, The patches look good to me. Just one point concerning the first one, you could have removed everything related to ACCEPT_ENCODING_MAX_ENTRIES since the limit was induced by the array that does not exist anymore. The comment of the accept_encoding_normalizer function does not match its behavior anymore either. Nothing to say apart from that, thanks for catching this. Rémi On 16/01/2021 15:07, Tim Duesterhus wrote: As of commit 6ca89162dc881df8fecd7713ca1fe5dbaa66b315 this hash no longer is required, because unknown encodings are not longer stored and known encodings do not use the cache. --- include/haproxy/http_ana-t.h | 2 +- src/cache.c | 80 2 files changed, 18 insertions(+), 64 deletions(-) diff --git a/include/haproxy/http_ana-t.h b/include/haproxy/http_ana-t.h index 63085a5b8..50fbd3de8 100644 --- a/include/haproxy/http_ana-t.h +++ b/include/haproxy/http_ana-t.h @@ -95,7 +95,7 @@ * request/response pairs, because they depend on the responses' optional Vary * header. The different sizes can be found in the vary_information object (see * cache.c).*/ -#define HTTP_CACHE_SEC_KEY_LEN (sizeof(int)+sizeof(int)+sizeof(int)) +#define HTTP_CACHE_SEC_KEY_LEN (sizeof(uint32_t)+sizeof(int)) /* Redirect flags */ diff --git a/src/cache.c b/src/cache.c index 32c99cdc4..965509871 100644 --- a/src/cache.c +++ b/src/cache.c @@ -100,11 +100,6 @@ struct vary_hashing_information { int(*cmp_fn)(const void *ref_hash, const void *new_hash, unsigned int hash_len); /* Comparison function, should return 0 if the hashes are alike */ }; -struct accept_encoding_hash { - unsigned int encoding_bitmap; - unsigned int hash; -} __attribute__((packed)); - static int http_request_prebuild_full_secondary_key(struct stream *s); static int http_request_build_secondary_key(struct stream *s, int vary_signature); static int http_request_reduce_secondary_key(unsigned int vary_signature, @@ -123,7 +118,7 @@ static int accept_encoding_hash_cmp(const void *ref_hash, const void *new_hash, /* 
Warning : do not forget to update HTTP_CACHE_SEC_KEY_LEN when new items are * added to this array. */ const struct vary_hashing_information vary_information[] = { - { IST("accept-encoding"), VARY_ACCEPT_ENCODING, sizeof(struct accept_encoding_hash), &accept_encoding_normalizer, &accept_encoding_hash_cmp }, + { IST("accept-encoding"), VARY_ACCEPT_ENCODING, sizeof(uint32_t), &accept_encoding_normalizer, &accept_encoding_hash_cmp }, { IST("referer"), VARY_REFERER, sizeof(int), &default_normalizer, NULL }, }; @@ -2151,19 +2146,6 @@ struct flt_ops cache_ops = { }; -int accept_encoding_cmp(const void *a, const void *b) -{ - unsigned int int_a = *(unsigned int*)a; - unsigned int int_b = *(unsigned int*)b; - - if (int_a < int_b) - return -1; - if (int_a > int_b) - return 1; - return 0; -} - - #define CHECK_ENCODING(str, encoding_name, encoding_value) \ ({ \ int retval = 0; \ @@ -2290,18 +2272,16 @@ static int parse_encoding_value(struct ist encoding, unsigned int *encoding_valu static int accept_encoding_normalizer(struct htx *htx, struct ist hdr_name, char *buf, unsigned int *buf_len) { - unsigned int values[ACCEPT_ENCODING_MAX_ENTRIES] = {}; size_t count = 0; - struct accept_encoding_hash hash = {}; + uint32_t encoding_bitmap = 0; unsigned int encoding_bmp_bl = -1; - unsigned int prev = 0, curr = 0; struct http_hdr_ctx ctx = { .blk = NULL }; unsigned int encoding_value; unsigned int rejected_encoding; /* A user agent always accepts an unencoded value unless it explicitly * refuses it through an "identity;q=0" accept-encoding value. */ - hash.encoding_bitmap |= VARY_ENCODING_IDENTITY; + encoding_bitmap |= VARY_ENCODING_IDENTITY; /* Iterate over all the ACCEPT_ENCODING_MAX_ENTRIES first accept-encoding * values that might span acrosse multiple accept-encoding headers. 
*/ @@ -2314,50 +2294,37 @@ static int accept_encoding_normalizer(struct htx *htx, struct ist hdr_name, if (rejected_encoding) encoding_bmp_bl &= ~encoding_value; else - hash.encoding_bitmap |= encoding_value; + encoding_bitmap |= encoding_value; } else { /* Unknown encoding */ - hash.encoding_bitmap |= VARY_ENCODING_OTHER; + encoding_bitmap |= VARY_ENCODING_OTHER; } - values[count++] = hash_crc32(istptr(ctx.value), istlen(ctx.value)); + count++; } /* If a "*" was found in the accepted encodings (without a null weight), * all the encoding
Re: [PATCH] BUG/MEDIUM: cache: Fix hash collision in `accept-encoding` handling for `Vary`
Hello Willy, Tim, On 29/12/2020 07:52, Willy Tarreau wrote: Hi Tim, On Mon, Dec 28, 2020 at 12:47:52PM +0100, Tim Duesterhus wrote: This patch fixes GitHub Issue #988. Commit ce9e7b25217c46db1ac636b2c885a05bf91ae57e was not sufficient, because it fell back to a hash comparison if the bitmap of known encodings was not acceptable instead of directly returning that the cached response is not compatible. This patch also extends the reg-test to test the hash collision that was mentioned in #988. Vary handling is 2.4, no backport needed. Thanks for this, but while your patch looks valid, the regtest fails for me:

c1http[ 0] |HTTP/1.1
c1http[ 1] |200
c1http[ 2] |OK
c1http[ 3] |content-type: text/plain
c1http[ 4] |vary: accept-encoding
c1http[ 5] |cache-control: max-age=5
c1http[ 6] |content-length: 48
c1http[ 7] |age: 0
c1http[ 8] |x-cache-hit: 1
c1c-l|!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNO
c1bodylen = 48
** c1=== expect resp.status == 200
c1EXPECT resp.status (200) == "200" match
** c1=== expect resp.bodylen == 45
c1EXPECT resp.bodylen (48) == "45" failed

I don't know if it's the code or the test, but we're having an issue here it seems. Note that this response has no content-encoding so there might be something caused by this. The specific accept-encoding test works but the regular vary.vtc does fail. But it might need a test change rather than a code change because it exhibits a use case we might not want to manage : If a server responds with a known encoding to a request that only has unknown encodings, we won't be able to answer directly from the cache, even if the accept-encoding header values match. This is caused by the "return (ref.encoding_bitmap & new.encoding_bitmap) != ref.encoding_bitmap" line in the accept_encoding_hash_cmp function. This is rather easy to do from a vtc but I don't know if this has a very high chance of happening on real networks. 
And considering that the alternative is a possible cache corruption because of hash collisions, we could probably add it as a known limitation. Concerning the vary.vtc, the new behavior can be explained as follows : Because of the absence of content-encoding in the second response of the vtc, its encoding is set to "identity", which is accepted by default by any client not rejecting it explicitly. So when looking into the cache for a response that might suit a request having an "Accept-Encoding: first_value" header, this specific one matches. On a related note, I think we're still having an issue that was revealed by your discovery of the bug and the fix. Indeed, initially there was only the hash for the list of values. The goal was to simply have an easy to compare value for a set of known encodings. Now for the known ones we have a bitmap. Thus we shouldn't let the known values contribute to the hash anymore. I remain convinced that now that we have the bitmap, we're probably deploying too many efforts to allow caching of unknown encodings with all the consequences it implies, and that just like in the past we would simply not cache when Vary was presented, not caching when an unknown encoding is presented would seem perfectly acceptable to me, especially given the already wide list of known encodings. And when I see the test using "Accept-Encoding: br,gzip,xxx,jdcqiab", this comforts me in this idea, because my feeling here is that the client supports two known encodings and other ones, if we find an object in the cache using no more than these encodings, it can be used, otherwise the object must be fetched from the server, and the response should only be cached if it uses known encodings. And if the server uses "jdcqiab" as one of the encodings I don't care if it results in the object not being cached. 
It would not be that hard to discard responses that have an unknown encoding but for now the content-encoding of the response is only parsed when the response has a vary header (it could actually have been limited to a response that varies on accept-encoding specifically). And considering that we probably don't want to parse the content-encoding all the time, responses with an unknown encoding would only be considered uncacheable when they have a vary, and would be cached when they don't. I'm totally fine with it if you are too, it would only require a few lines of explanation in the doc. Just my two cents, Willy Rémi