Hi, we were discussing this at VUG3: comparisons on the Host: header should be case insensitive. On reflection, I think that normalizing the Host: header in Varnish itself is the better approach and should avoid common errors.
Nils

-- 
** * * UPLEX - Nils Goroll Systemoptimierung

Schwanenwik 24
22087 Hamburg

tel +49 40 28805731
mob +49 170 2723133
fax +49 40 42949753

http://uplex.de/
From 4a15cdcefcfb36a95c413e760116a16d0b11ea04 Mon Sep 17 00:00:00 2001
From: Nils Goroll <[email protected]>
Date: Thu, 17 Feb 2011 13:50:59 +0100
Subject: [PATCH] Normalize the Host: header according to the recommendation in rfc3986 to facilitate comparisons (and make varnish actually work correctly for many of the examples in the documentation).

---
 bin/varnishd/cache_center.c      |   23 +++++++++++++++++++++++
 bin/varnishtest/tests/v00008.vtc |   17 +++++++++++++++++
 doc/sphinx/faq/general.rst       |   27 ++++++++++++++++++++++-----
 doc/sphinx/reference/vcl.rst     |   12 +++++++++++-
 4 files changed, 73 insertions(+), 6 deletions(-)

diff --git a/bin/varnishd/cache_center.c b/bin/varnishd/cache_center.c
index 4a8b6ac..69543f8 100644
--- a/bin/varnishd/cache_center.c
+++ b/bin/varnishd/cache_center.c
@@ -67,6 +67,7 @@ SVNID("$Id$")
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#include <ctype.h>

 #ifndef HAVE_SRANDOMDEV
 #include "compat/srandomdev.h"
@@ -1281,6 +1282,28 @@ cnt_start(struct sess *sp)
 		http_Unset(sp->http, H_Expect);
 	}

+	/*
+	 * Normalize Host header
+	 *
+	 * http://www.ietf.org/rfc/rfc3986.txt Section 3.2.2. Host:
+	 *
+	 * producers and normalizers should use lowercase for registered names
+	 * and hexadecimal addresses for the sake of uniformity, while only
+	 * using uppercase letters for percent-encodings.
+	 */
+	if (http_GetHdr(sp->http, H_Host, &p)) {
+		while (*p != '\0') {
+			if ((p[0] == '%') &&
+			    (p[1] != '\0') && (p[2] != '\0')) {
+				p[1] = toupper(p[1]);
+				p[2] = toupper(p[2]);
+				p += 3;
+			} else {
+				*p++ = tolower(*p);
+			}
+		}
+	}
+
 	sp->step = STP_RECV;
 	return (0);
 }
diff --git a/bin/varnishtest/tests/v00008.vtc b/bin/varnishtest/tests/v00008.vtc
index 82bfe72..e2abb67 100644
--- a/bin/varnishtest/tests/v00008.vtc
+++ b/bin/varnishtest/tests/v00008.vtc
@@ -12,6 +12,19 @@ server s1 {
 	expect req.url == "/bar"
 	expect req.http.host == "127.0.0.1"
 	txresp -body "foo1"
+
+	# Host gets converted to lower case
+	rxreq
+	expect req.url == "/lower"
+	expect req.http.host == "snafu"
+	txresp -body "foo1"
+
+	# .. but percent-encoding to upper case. The last
+	# percent-encoding is illegal and won't get touched
+	rxreq
+	expect req.url == "/pct"
+	expect req.http.host == "%6Enafu%a"
+	txresp -body "foo1"
 } -start

 varnish v1 -vcl+backend { } -start
@@ -21,6 +34,10 @@ client c1 {
 	rxresp
 	txreq -url "/bar"
 	rxresp
+	txreq -url "/lower" -hdr "Host: SnAfU"
+	rxresp
+	txreq -url "/pct" -hdr "Host: %6eNAFU%a"
+	rxresp
 } -run

 server s2 {
diff --git a/doc/sphinx/faq/general.rst b/doc/sphinx/faq/general.rst
index 26b86fa..12ce935 100644
--- a/doc/sphinx/faq/general.rst
+++ b/doc/sphinx/faq/general.rst
@@ -275,21 +275,38 @@ individual regular expressions, so we decided that it would probably
 confuse people if we made the default case-insentive. (We promise not
 to change our minds about this again.)

+See the `PCRE man pages <http://www.pcre.org/pcre.txt>`_ for more information.
+
+
+**Are regular expressions case sensitive or not? Can I change it?**
+
 To make a PCRE regex case insensitive, put ``(?i)`` at the start::

-        if (req.http.host ~ "?iexample.com$") {
+        if (beresp.http.location ~ "(?i)http://example.com/") {
                 ...
         }

 See the `PCRE man pages <http://www.pcre.org/pcre.txt>`_ for more information.

-**Are regular expressions case sensitive or not? Can I change it?**

-In 2.1 and newer, regular expressions are case sensitive by default. In earlier versions, they were case insensitive.
+**Are case insensitive regular expressions needed for comparing the Host: header?**

-To change this for a single regex in 2.1, use ``(?i)`` at the start.
+No.

-See the `PCRE man pages <http://www.pcre.org/pcre.txt>`_ for more information.
+From version 3.0 and forward, Varnish normalizes the ``Host:`` header,
+so easy comparisons using case-sensitive regexes or the equality
+operator are possible again::
+
+        if (req.http.host ~ "example.com$") {
+                ...
+        }
+
+
+or even::
+
+        if (req.http.host == "www.example.com") {
+                ...
+        }

 **Why does the ``Via:`` header say 1.1 in Varnish 2.1.x?**

diff --git a/doc/sphinx/reference/vcl.rst b/doc/sphinx/reference/vcl.rst
index d986cd5..03b004a 100644
--- a/doc/sphinx/reference/vcl.rst
+++ b/doc/sphinx/reference/vcl.rst
@@ -300,10 +300,11 @@ To send flags to the PCRE engine, such as to turn on *case
 insensitivity* add the flag within parens following a question mark,
 like this:::

-        if (req.http.host ~ "(?i)example.com$") {
+        if (beresp.http.location ~ "(?i)http://example.com/") {
                 ...
         }

+See the `PCRE man pages <http://www.pcre.org/pcre.txt>`_ for more information.

 Functions
 ---------
@@ -743,6 +744,15 @@ HTTP headers can be removed entirely using the remove keyword:::
                 remove beresp.http.Set-Cookie;
         }

+resp.http.host
+        Varnish normalizes the Host header according to the recommendations
+        in http://www.ietf.org/rfc/rfc3986.txt Section 3.2.2:
+
+        It gets converted to lower case, except for pct-encodings, which get
+        converted to upper case.
+
+
+
 Grace and saint mode
 --------------------

-- 
1.5.6.5
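
For anyone who wants to try the normalization rule outside of varnishd, here is a minimal standalone sketch of the loop in the cache_center.c hunk. It is not the code from the patch: normalize_host is a hypothetical name, and it works on a plain C string instead of the header returned by http_GetHdr(). It differs from the patch in two small ways. The lowercase branch is split into two statements, because `*p++ = tolower(*p)` modifies and reads `p` without an intervening sequence point, which is undefined in C. The <ctype.h> arguments are also cast to unsigned char, since passing a negative char value to toupper()/tolower() is undefined as well.

```c
#include <ctype.h>

/*
 * Hypothetical standalone helper (not part of Varnish): lowercase a
 * host string in place, except that the two characters following a
 * '%' are uppercased. Like the patch, it does not check that those
 * two characters are hex digits, and an incomplete trailing "%x" is
 * treated as ordinary text and lowercased.
 */
static void
normalize_host(char *p)
{
	while (*p != '\0') {
		if (p[0] == '%' && p[1] != '\0' && p[2] != '\0') {
			/* Complete "%xy" triplet: uppercase both characters. */
			p[1] = (char)toupper((unsigned char)p[1]);
			p[2] = (char)toupper((unsigned char)p[2]);
			p += 3;
		} else {
			/* Assign first, then advance, to avoid the unsequenced access. */
			*p = (char)tolower((unsigned char)*p);
			p++;
		}
	}
}
```

Called on writable buffers holding "SnAfU" and "%6eNAFU%a", this should leave "snafu" and "%6Enafu%a", which are the values the v00008.vtc additions expect.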
