Re: [2.1.1] http-request replace-uri does not work
On Tue, Dec 17, 2019 at 07:06:37AM +0100, Willy Tarreau wrote: > Actually a slightly cleaner one that should even be backportable to 2.0. (...) replace-path now merged and backported to 2.1. Willy
Re: [2.1.1] http-request replace-uri does not work
On Tue, Dec 17, 2019 at 11:34:03AM +0100, Artur wrote: > For my current setup I tested set-path as suggested by Willy and it > works fine with the following setup : > > acl is_dev_qd hdr(host) -i dev.q.d dev.q.d > acl is_qd hdr(host) -i q.d qs.d www.q.d www.qs.d > acl is_ppds path_beg -i /PPDSlide/ > http-request set-path /d3%[path] if is_ppds is_dev_question_direct > http-request set-path /p3%[path] if is_ppds is_question_direct > > While using replace-uri, my mistake was to blindly follow the official > documentation examples. You didn't make a mistake, it's just that the ecosystem evolves and that what appeared as natural to everyone 10 years ago (i.e. URIs mostly start with '/' unless you're in front of a proxy) is not true anymore in 2019. Cheers, Willy
Re: [2.1.1] http-request replace-uri does not work
Hello, Le 17/12/2019 à 08:41, Julien Pivotto a écrit : > On 17 Dec 06:58, Willy Tarreau wrote: >> On Tue, Dec 17, 2019 at 06:08:56AM +0100, Willy Tarreau wrote: >>> But now I'm starting to suspect that most of the problem comes from the >>> fact that people who used to rely on regex in the past will not as easily >>> perform their rewrites using set-path as they would using a replace rule >>> which is very similar to the old set. So probably we'd need to introduce >>> a "replace-path" action and suggest it in the warning emitted for reqrep. >>> >>> I think it is important that we properly address such needs and am >>> willing to backport anything like this to 2.1 to ease the transition if >>> that's the best solution. >> What about this ? It does exactly what's needed for me. It's self-contained >> enough that we could get it backported to 2.1 and maybe even to 2.0 (though >> it would require some adaptations to legacy mode there). >> >> Willy > I am in favour of replace-path. Yes, it will be nice to have it. For my current setup I tested set-path as suggested by Willy and it works fine with the following setup : acl is_dev_qd hdr(host) -i dev.q.d dev.q.d acl is_qd hdr(host) -i q.d qs.d www.q.d www.qs.d acl is_ppds path_beg -i /PPDSlide/ http-request set-path /d3%[path] if is_ppds is_dev_question_direct http-request set-path /p3%[path] if is_ppds is_question_direct While using replace-uri, my mistake was to blindly follow the official documentation examples. None of these uses absolute uri form. -- Best regards, Artur
Re: [2.1.1] http-request replace-uri does not work
On Tue, Dec 17, 2019 at 08:41:55AM +0100, Julien Pivotto wrote: > On 17 Dec 06:58, Willy Tarreau wrote: > > On Tue, Dec 17, 2019 at 06:08:56AM +0100, Willy Tarreau wrote: > > > But now I'm starting to suspect that most of the problem comes from the > > > fact that people who used to rely on regex in the past will not as easily > > > perform their rewrites using set-path as they would using a replace rule > > > which is very similar to the old set. So probably we'd need to introduce > > > a "replace-path" action and suggest it in the warning emitted for reqrep. > > > > > > I think it is important that we properly address such needs and am > > > willing to backport anything like this to 2.1 to ease the transition if > > > that's the best solution. > > > > What about this ? It does exactly what's needed for me. It's self-contained > > enough that we could get it backported to 2.1 and maybe even to 2.0 (though > > it would require some adaptations to legacy mode there). > > > > Willy > > I am in favour of replace-path. > Should we have a replace-domain or so for the first part of the abs url? Technically speaking it's called the authority. And since we take great care to keep it synchronized with the Host header field, actually one can simply decide to adjust the Host header to adjust this part. I'm just realizing that we should mention this in the doc as well. Willy
Re: [2.1.1] http-request replace-uri does not work
On 17 Dec 06:58, Willy Tarreau wrote: > On Tue, Dec 17, 2019 at 06:08:56AM +0100, Willy Tarreau wrote: > > But now I'm starting to suspect that most of the problem comes from the > > fact that people who used to rely on regex in the past will not as easily > > perform their rewrites using set-path as they would using a replace rule > > which is very similar to the old set. So probably we'd need to introduce > > a "replace-path" action and suggest it in the warning emitted for reqrep. > > > > I think it is important that we properly address such needs and am > > willing to backport anything like this to 2.1 to ease the transition if > > that's the best solution. > > What about this ? It does exactly what's needed for me. It's self-contained > enough that we could get it backported to 2.1 and maybe even to 2.0 (though > it would require some adaptations to legacy mode there). > > Willy I am in favour of replace-path. Should we have a replace-domain or so for the first part of the abs url? -- (o-Julien Pivotto //\Open-Source Consultant V_/_ Inuits - https://www.inuits.eu signature.asc Description: PGP signature
Re: [2.1.1] http-request replace-uri does not work
On Tue, Dec 17, 2019 at 06:58:11AM +0100, Willy Tarreau wrote: > On Tue, Dec 17, 2019 at 06:08:56AM +0100, Willy Tarreau wrote: > > But now I'm starting to suspect that most of the problem comes from the > > fact that people who used to rely on regex in the past will not as easily > > perform their rewrites using set-path as they would using a replace rule > > which is very similar to the old set. So probably we'd need to introduce > > a "replace-path" action and suggest it in the warning emitted for reqrep. > > > > I think it is important that we properly address such needs and am > > willing to backport anything like this to 2.1 to ease the transition if > > that's the best solution. > > What about this ? It does exactly what's needed for me. It's self-contained > enough that we could get it backported to 2.1 and maybe even to 2.0 (though > it would require some adaptations to legacy mode there). Actually a slightly cleaner one that should even be backportable to 2.0. We could trivially extend it to also support the query string though I'm not convinced at all that it would make sense to have a regex-based "replace-query". Willy >From 126c472884112caa9a70ac6d2fea71062f36c5cb Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Tue, 17 Dec 2019 06:52:51 +0100 Subject: MINOR: http: add a new "replace-path" action This action is very similar to "replace-uri" except that it only acts on the path component. This is assumed to better match users' expectations when they used to rely on "replace-uri" in HTTP/1 because mostly origin forms were used in H1 while mostly absolute URI form is used in H2, and their rules very often start with a '/', and as such do not match. It could help users to get this backported to 2.0 and 2.1. --- doc/configuration.txt | 27 ++- src/cfgparse-listen.c | 2 +- src/http_act.c| 20 +++- 3 files changed, 42 insertions(+), 7 deletions(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index fdcdb04fae..3ba340c38d 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -4463,6 +4463,30 @@ http-request replace-header # outputs: User-Agent: foo +http-request replace-path + [ { if | unless } ] + + This works like "replace-header" except that it works on the request's path + component instead of a header. The path component starts at the first '/' + after an optional scheme+authority. It does contain the query string if any + is present. The replacement does not modify the scheme nor authority. + + It is worth noting that regular expressions may be more expensive to evaluate + than certain ACLs, so rare replacements may benefit from a condition to avoid + performing the evaluation at all if it does not match. + + Example: +# prefix /foo : turn /bar?q=1 into /foo/bar?q=1 : +http-request replace-path (.*) /foo\1 + +# suffix /foo : turn /bar?q=1 into /bar/foo?q=1 : +http-request replace-path ([^?]*)(\?(.*))? \1/foo\2 + +# strip /foo : turn /foo/bar?q=1 into /bar?q=1 +http-request replace-path /foo/(.*) /\1 +# or more efficient if only some requests match : +http-request replace-path /foo/(.*) /\1 if { url_beg /foo/ } + http-request replace-uri [ { if | unless } ] @@ -4484,7 +4508,8 @@ http-request replace-uri with HTTP/2, clients are encouraged to send absolute URIs only, which look like the ones HTTP/1 clients use to talk to proxies. Such partial replace-uri rules may then fail in HTTP/2 when they work in HTTP/1. Either the rules need - to be adapted to optionally match a scheme and authority. + to be adapted to optionally match a scheme and authority, or replace-path + should be used. Example: # rewrite all "http" absolute requests to "https": diff --git a/src/cfgparse-listen.c b/src/cfgparse-listen.c index 507e071734..9975e46876 100644 --- a/src/cfgparse-listen.c +++ b/src/cfgparse-listen.c @@ -3694,7 +3694,7 @@ stats_error_parsing: } else if (!strcmp(args[0], "cliexp") || !strcmp(args[0], "reqrep")) { /* replace request header from a regex */ ha_alert("parsing [%s:%d] : The '%s' directive is not supported anymore since HAProxy 2.1. " -"Use 'http-request replace-uri' and 'http-request replace-header' instead.\n", +"Use 'http-request replace-path', 'http-request replace-uri' or 'http-request replace-header' instead.\n", file, linenum, args[0]); err_code |= ERR_ALERT | ERR_FATAL; goto out; diff --git a/src/http_act.c b/src/http_act.c index d6015d3624..c8d9220feb 100644 --- a/src/http_act.c +++ b/src/http_act.c @@ -133,6 +133,8 @@ static enum act_parse_ret parse_set_req_line(const char **args, int *orig_arg, s * .arg.act.p[]. It builds a string in the trash from the format string * previously filled by function parse_replace_uri()
Re: [2.1.1] http-request replace-uri does not work
On Tue, Dec 17, 2019 at 06:08:56AM +0100, Willy Tarreau wrote: > But now I'm starting to suspect that most of the problem comes from the > fact that people who used to rely on regex in the past will not as easily > perform their rewrites using set-path as they would using a replace rule > which is very similar to the old set. So probably we'd need to introduce > a "replace-path" action and suggest it in the warning emitted for reqrep. > > I think it is important that we properly address such needs and am > willing to backport anything like this to 2.1 to ease the transition if > that's the best solution. What about this ? It does exactly what's needed for me. It's self-contained enough that we could get it backported to 2.1 and maybe even to 2.0 (though it would require some adaptations to legacy mode there). Willy >From 290a8473a9a8d65685a6a3c4922a581eaa7c3377 Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Tue, 17 Dec 2019 06:52:51 +0100 Subject: MINOR: http: add a new "replace-path" action This action is very similar to "replace-uri" except that it only acts on the path component. This is assumed to better match users' expectations when they used to rely on "replace-uri" in HTTP/1 because mostly origin forms were used in H1 while mostly absolute URI form is used in H2, and their rules very often start with a '/', and as such do not match. It could help users to get this backported to 2.0 and 2.1. --- doc/configuration.txt | 27 ++- src/cfgparse-listen.c | 2 +- src/http_act.c| 21 - 3 files changed, 43 insertions(+), 7 deletions(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index fdcdb04fae..3ba340c38d 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -4463,6 +4463,30 @@ http-request replace-header # outputs: User-Agent: foo +http-request replace-path + [ { if | unless } ] + + This works like "replace-header" except that it works on the request's path + component instead of a header. The path component starts at the first '/' + after an optional scheme+authority. It does contain the query string if any + is present. The replacement does not modify the scheme nor authority. + + It is worth noting that regular expressions may be more expensive to evaluate + than certain ACLs, so rare replacements may benefit from a condition to avoid + performing the evaluation at all if it does not match. + + Example: +# prefix /foo : turn /bar?q=1 into /foo/bar?q=1 : +http-request replace-path (.*) /foo\1 + +# suffix /foo : turn /bar?q=1 into /bar/foo?q=1 : +http-request replace-path ([^?]*)(\?(.*))? \1/foo\2 + +# strip /foo : turn /foo/bar?q=1 into /bar?q=1 +http-request replace-path /foo/(.*) /\1 +# or more efficient if only some requests match : +http-request replace-path /foo/(.*) /\1 if { url_beg /foo/ } + http-request replace-uri [ { if | unless } ] @@ -4484,7 +4508,8 @@ http-request replace-uri with HTTP/2, clients are encouraged to send absolute URIs only, which look like the ones HTTP/1 clients use to talk to proxies. Such partial replace-uri rules may then fail in HTTP/2 when they work in HTTP/1. Either the rules need - to be adapted to optionally match a scheme and authority. + to be adapted to optionally match a scheme and authority, or replace-path + should be used. Example: # rewrite all "http" absolute requests to "https": diff --git a/src/cfgparse-listen.c b/src/cfgparse-listen.c index 507e071734..9975e46876 100644 --- a/src/cfgparse-listen.c +++ b/src/cfgparse-listen.c @@ -3694,7 +3694,7 @@ stats_error_parsing: } else if (!strcmp(args[0], "cliexp") || !strcmp(args[0], "reqrep")) { /* replace request header from a regex */ ha_alert("parsing [%s:%d] : The '%s' directive is not supported anymore since HAProxy 2.1. " -"Use 'http-request replace-uri' and 'http-request replace-header' instead.\n", +"Use 'http-request replace-path', 'http-request replace-uri' or 'http-request replace-header' instead.\n", file, linenum, args[0]); err_code |= ERR_ALERT | ERR_FATAL; goto out; diff --git a/src/http_act.c b/src/http_act.c index d6015d3624..61200ba368 100644 --- a/src/http_act.c +++ b/src/http_act.c @@ -142,6 +142,7 @@ static enum act_return http_action_replace_uri(struct act_rule *rule, struct pro enum act_return ret = ACT_RET_ERR; struct buffer *replace, *output; struct ist uri; + int action; int len; replace = alloc_trash_chunk(); @@ -149,6 +150,13 @@ static enum act_return http_action_replace_uri(struct act_rule *rule, struct pro if (!replace || !output) goto leave; uri = htx_sl_req_uri(http_get_stline(htxbuf(>req.buf))); + action = 3; // replace uri
Re: [2.1.1] http-request replace-uri does not work
Hi guys, On Mon, Dec 16, 2019 at 11:01:19PM +0100, Cyril Bonté wrote: > Hi Willy, > > Le 16/12/2019 à 22:06, Artur a écrit : > > > > [...] > > > > URLs like https://q.d/PPDSlide/testfile are correctly rewritten to > > > > https://q.d/p3/PPDSlide/testfile and forwarded to the backend. > > > > > > > > Once I switched to 2.1.1, haproxy no longer rewrites the URI and the > > > > URIs remains unchanged while forwarded to the backend. I had to > > > > downgrade to have the usual behaviour. > > > > > > > > Is it a bug or something changed in normal haproxy behaviour with 2.1 > > > > release ? > > > > > > I can confirm the issue. > > > > > > It seems to happen with h2 requests only, since commit #30ee1efe67. > > > haproxy normalizes the URI but replace-uri doesn't take into account > > > this information. The fix should be easy for replace-uri (If someone > > > wants to work on it, I won't have time this week). > > > > > > http://git.haproxy.org/?p=haproxy-2.1.git;a=commit;h=30ee1efe67 > > I'm not 100% sure it's the right approach to fix the issue. Can you check if > this patch may fix the issue in all conditions ? > > From what I've observed, it seems we can rely on the HTX_SL_F_NORMALIZED_URI > flag to detect if the URI was normalized by haproxy and in that case, we > should start at the path instead of the URI. In fact it's not a bug and is the intended behavior. The rule in Artur's config only matches origin URIs, not absolute ones, which most H2 clients use. In HTTP/1, there are two ways to send a request URI: origin form:GET /path/to/resource HTTP/1.1 absolute form: GET http://server.domain.tld/path/to/resource HTTP/1.1 Clients mostly use the origin form with servers, and mostly use the absolute form with proxies, and even then, the mapping is 99.99% and not 100% in either case. It's worth noting that Artur's rule already does not work with the absolute form, it only handles the origin form. In H2 you have the same distinction, except that clients were initially encouraged to use the absolute form only. The first non-HTX implementation used to implement only a subset of H2 and to map H2 to H1 using the Host header field, transforming incoming absolute URIs to origin URIs. This broke gRPC and end-to-end H2 because the scheme was lost. This also prevented such requests from being passed to proxies. So starting with 2.1 we now respect the format used by the client. It turns out that there seems to exist a number of configs in field which are already broken with H1 but which work by pure luck as they only handle the origin form. Arguably, many of these have evolved over the ages because initially they were designed to use the old reqrep matching which was very painful to deal with, and writing extra config just to handle an almost always absent scheme was not worth it. Nowadays we have multiple actions available depending on what we want to handle: - set-path to set only the path component - set-uri to set the whole URI - set-query to set the query string only - replace-uri to replace a part of the URI using parts from it In theory in order to perform some path rewrites, one would only need set-path. In Artur's case, it should be as simple as : http-request set-path /p3%[path] But now I'm starting to suspect that most of the problem comes from the fact that people who used to rely on regex in the past will not as easily perform their rewrites using set-path as they would using a replace rule which is very similar to the old set. So probably we'd need to introduce a "replace-path" action and suggest it in the warning emitted for reqrep. I think it is important that we properly address such needs and am willing to backport anything like this to 2.1 to ease the transition if that's the best solution. Please advise, Willy
Re: [2.1.1] http-request replace-uri does not work
Hi Willy, Le 16/12/2019 à 22:06, Artur a écrit : [...] URLs like https://q.d/PPDSlide/testfile are correctly rewritten to https://q.d/p3/PPDSlide/testfile and forwarded to the backend. Once I switched to 2.1.1, haproxy no longer rewrites the URI and the URIs remains unchanged while forwarded to the backend. I had to downgrade to have the usual behaviour. Is it a bug or something changed in normal haproxy behaviour with 2.1 release ? I can confirm the issue. It seems to happen with h2 requests only, since commit #30ee1efe67. haproxy normalizes the URI but replace-uri doesn't take into account this information. The fix should be easy for replace-uri (If someone wants to work on it, I won't have time this week). http://git.haproxy.org/?p=haproxy-2.1.git;a=commit;h=30ee1efe67 I'm not 100% sure it's the right approach to fix the issue. Can you check if this patch may fix the issue in all conditions ? From what I've observed, it seems we can rely on the HTX_SL_F_NORMALIZED_URI flag to detect if the URI was normalized by haproxy and in that case, we should start at the path instead of the URI. -- Cyril Bonté diff --git a/src/http_act.c b/src/http_act.c index d6015d362..797d75275 100644 --- a/src/http_act.c +++ b/src/http_act.c @@ -141,6 +141,7 @@ static enum act_return http_action_replace_uri(struct act_rule *rule, struct pro { enum act_return ret = ACT_RET_ERR; struct buffer *replace, *output; + struct htx_sl *sl; struct ist uri; int len; @@ -148,7 +149,12 @@ static enum act_return http_action_replace_uri(struct act_rule *rule, struct pro output = alloc_trash_chunk(); if (!replace || !output) goto leave; - uri = htx_sl_req_uri(http_get_stline(htxbuf(>req.buf))); + + sl = http_get_stline(htxbuf(>req.buf)); + uri = htx_sl_req_uri(sl); + if (sl->flags & HTX_SL_F_NORMALIZED_URI) + uri = http_get_path(uri); + if (!regex_exec_match2(rule->arg.act.p[1], uri.ptr, uri.len, MAX_MATCH, pmatch, 0)) goto leave;
Re: [2.1.1] http-request replace-uri does not work
Hello Cyril, Thanks a lot for the confirmation. Le 16/12/2019 à 20:20, Cyril Bonté a écrit : > Hi Artur, > > Le 16/12/2019 à 19:06, Artur a écrit : >> Hello, >> >> This is an extract of my frontend configuration working perfectly on >> 2.0.11. >> >> frontend wwws >> bind 0.0.0.0:443 ssl crt /etc/haproxy/ssl/server.pem alpn >> h2,http/1.1 >> mode http >> acl is_dev_qd hdr(host) -i dev.q.d dev.qs.d >> acl is_qd hdr(host) -i q.d qs.d www.q.d www.qs.d >> http-request replace-uri ^/PPDSlide/(.*) /d3/PPDSlide/\1 if >> is_dev_qd >> http-request replace-uri ^/PPDSlide/(.*) /p3/PPDSlide/\1 if >> is_qd >> >> >> URLs like https://q.d/PPDSlide/testfile are correctly rewritten to >> https://q.d/p3/PPDSlide/testfile and forwarded to the backend. >> >> Once I switched to 2.1.1, haproxy no longer rewrites the URI and the >> URIs remains unchanged while forwarded to the backend. I had to >> downgrade to have the usual behaviour. >> >> Is it a bug or something changed in normal haproxy behaviour with 2.1 >> release ? > > I can confirm the issue. > > It seems to happen with h2 requests only, since commit #30ee1efe67. > haproxy normalizes the URI but replace-uri doesn't take into account > this information. The fix should be easy for replace-uri (If someone > wants to work on it, I won't have time this week). > > http://git.haproxy.org/?p=haproxy-2.1.git;a=commit;h=30ee1efe67 > > -- Best regards, Artur
Re: [2.1.1] http-request replace-uri does not work
Hi Artur, Le 16/12/2019 à 19:06, Artur a écrit : Hello, This is an extract of my frontend configuration working perfectly on 2.0.11. frontend wwws bind 0.0.0.0:443 ssl crt /etc/haproxy/ssl/server.pem alpn h2,http/1.1 mode http acl is_dev_qd hdr(host) -i dev.q.d dev.qs.d acl is_qd hdr(host) -i q.d qs.d www.q.d www.qs.d http-request replace-uri ^/PPDSlide/(.*) /d3/PPDSlide/\1 if is_dev_qd http-request replace-uri ^/PPDSlide/(.*) /p3/PPDSlide/\1 if is_qd URLs like https://q.d/PPDSlide/testfile are correctly rewritten to https://q.d/p3/PPDSlide/testfile and forwarded to the backend. Once I switched to 2.1.1, haproxy no longer rewrites the URI and the URIs remains unchanged while forwarded to the backend. I had to downgrade to have the usual behaviour. Is it a bug or something changed in normal haproxy behaviour with 2.1 release ? I can confirm the issue. It seems to happen with h2 requests only, since commit #30ee1efe67. haproxy normalizes the URI but replace-uri doesn't take into account this information. The fix should be easy for replace-uri (If someone wants to work on it, I won't have time this week). http://git.haproxy.org/?p=haproxy-2.1.git;a=commit;h=30ee1efe67 -- Cyril Bonté
[2.1.1] http-request replace-uri does not work
Hello, This is an extract of my frontend configuration working perfectly on 2.0.11. frontend wwws bind 0.0.0.0:443 ssl crt /etc/haproxy/ssl/server.pem alpn h2,http/1.1 mode http acl is_dev_qd hdr(host) -i dev.q.d dev.qs.d acl is_qd hdr(host) -i q.d qs.d www.q.d www.qs.d http-request replace-uri ^/PPDSlide/(.*) /d3/PPDSlide/\1 if is_dev_qd http-request replace-uri ^/PPDSlide/(.*) /p3/PPDSlide/\1 if is_qd URLs like https://q.d/PPDSlide/testfile are correctly rewritten to https://q.d/p3/PPDSlide/testfile and forwarded to the backend. Once I switched to 2.1.1, haproxy no longer rewrites the URI and the URIs remains unchanged while forwarded to the backend. I had to downgrade to have the usual behaviour. Is it a bug or something changed in normal haproxy behaviour with 2.1 release ? -- Best regards, Artur