Hi,

we were discussing this at VUG3: comparisons on the Host: header should be case
insensitive. Reflecting on this, I think that normalizing the Host: header in
Varnish would actually be the better idea and should avoid common errors.
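
To illustrate the rule, here is a minimal standalone sketch of the
normalization. The names normalize_host() and normalize_host_selftest() are
made up for this example and are not Varnish APIs; the attached patch does the
equivalent inline in cnt_start(). The inputs mirror the ones used in the test
case of the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Varnish): normalize a host string in
 * place along the lines of RFC 3986 section 3.2.2. Every character is
 * lowercased, except the two characters following a '%' (when both are
 * present), which are uppercased.
 */
static void
normalize_host(char *p)
{
	while (*p != '\0') {
		if (p[0] == '%' && p[1] != '\0' && p[2] != '\0') {
			/* ctype functions need an unsigned char value */
			p[1] = (char)toupper((unsigned char)p[1]);
			p[2] = (char)toupper((unsigned char)p[2]);
			p += 3;
		} else {
			*p = (char)tolower((unsigned char)*p);
			p++;
		}
	}
}

/* Hypothetical self-check using the same inputs as the patch's test case. */
static void
normalize_host_selftest(void)
{
	char plain[] = "SnAfU";
	char pct[] = "%6eNAFU%a";

	normalize_host(plain);
	normalize_host(pct);
	assert(strcmp(plain, "snafu") == 0);
	/* The truncated "%a" at the end is only lowercased. */
	assert(strcmp(pct, "%6Enafu%a") == 0);
}
```

With the header in this form, a plain string comparison against a lowercase
name is sufficient in VCL.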

Nils

-- 

** * * UPLEX - Nils Goroll Systemoptimierung

Schwanenwik 24
22087 Hamburg

tel +49 40 28805731
mob +49 170 2723133
fax +49 40 42949753

http://uplex.de/
From 4a15cdcefcfb36a95c413e760116a16d0b11ea04 Mon Sep 17 00:00:00 2001
From: Nils Goroll <[email protected]>
Date: Thu, 17 Feb 2011 13:50:59 +0100
Subject: [PATCH] Normalize the Host: header according to the recommendation in
 rfc3986 to facilitate comparisons (and make varnish actually work
 correctly for many of the examples in the documentation).

---
 bin/varnishd/cache_center.c      |   23 +++++++++++++++++++++++
 bin/varnishtest/tests/v00008.vtc |   17 +++++++++++++++++
 doc/sphinx/faq/general.rst       |   27 ++++++++++++++++++++++-----
 doc/sphinx/reference/vcl.rst     |   12 +++++++++++-
 4 files changed, 73 insertions(+), 6 deletions(-)

diff --git a/bin/varnishd/cache_center.c b/bin/varnishd/cache_center.c
index 4a8b6ac..69543f8 100644
--- a/bin/varnishd/cache_center.c
+++ b/bin/varnishd/cache_center.c
@@ -67,6 +67,7 @@ SVNID("$Id$")
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#include <ctype.h>
 
 #ifndef HAVE_SRANDOMDEV
 #include "compat/srandomdev.h"
@@ -1281,6 +1282,28 @@ cnt_start(struct sess *sp)
                http_Unset(sp->http, H_Expect);
        }
 
+       /*
+        * Normalize Host header
+        *
+        * http://www.ietf.org/rfc/rfc3986.txt Section 3.2.2. Host:
+        *
+        * producers and normalizers should use lowercase for registered names
+        * and hexadecimal addresses for the sake of uniformity, while only
+        * using uppercase letters for percent-encodings.
+        */
+       if (http_GetHdr(sp->http, H_Host, &p)) {
+               while (*p != '\0') {
+                       if (p[0] == '%' && p[1] != '\0' && p[2] != '\0') {
+                               p[1] = toupper((unsigned char)p[1]);
+                               p[2] = toupper((unsigned char)p[2]);
+                               p += 3;
+                       } else {
+                               *p = tolower((unsigned char)*p);
+                               p++;
+                       }
+               }
+       }
+
        sp->step = STP_RECV;
        return (0);
 }
diff --git a/bin/varnishtest/tests/v00008.vtc b/bin/varnishtest/tests/v00008.vtc
index 82bfe72..e2abb67 100644
--- a/bin/varnishtest/tests/v00008.vtc
+++ b/bin/varnishtest/tests/v00008.vtc
@@ -12,6 +12,19 @@ server s1 {
        expect req.url == "/bar"
        expect req.http.host == "127.0.0.1"
        txresp -body "foo1"
+
+       # Host gets converted to lower case
+       rxreq
+       expect req.url == "/lower"
+       expect req.http.host == "snafu"
+       txresp -body "foo1"
+
+       # .. but percent-encoding to upper case. The last
+       #    percent-encoding is illegal and won't get touched
+       rxreq
+       expect req.url == "/pct"
+       expect req.http.host == "%6Enafu%a"
+       txresp -body "foo1"
 } -start
 
 varnish v1 -vcl+backend { } -start
@@ -21,6 +34,10 @@ client c1 {
        rxresp
        txreq -url "/bar"
        rxresp
+       txreq -url "/lower" -hdr "Host: SnAfU"
+       rxresp
+       txreq -url "/pct" -hdr "Host: %6eNAFU%a"
+       rxresp
 } -run
 
 server s2 {
diff --git a/doc/sphinx/faq/general.rst b/doc/sphinx/faq/general.rst
index 26b86fa..12ce935 100644
--- a/doc/sphinx/faq/general.rst
+++ b/doc/sphinx/faq/general.rst
@@ -275,21 +275,38 @@ individual regular expressions, so we decided that it would
 probably confuse people if we made the default case-insentive.
 (We promise not to change our minds about this again.)
 
+See the `PCRE man pages <http://www.pcre.org/pcre.txt>`_ for more information.
+
+
+**Are regular expressions case sensitive or not? Can I change it?**
+
 To make a PCRE regex case insensitive, put ``(?i)`` at the start::
 
-       if (req.http.host ~ "?iexample.com$") {
+       if (beresp.http.location ~ "(?i)http://example.com/") {
                ...
        }
 
 See the `PCRE man pages <http://www.pcre.org/pcre.txt>`_ for more information.
 
-**Are regular expressions case sensitive or not? Can I change it?**
 
-In 2.1 and newer, regular expressions are case sensitive by default.  In earlier versions, they were case insensitive.
+**Are case insensitive regular expressions needed for comparing the Host: header?**
 
-To change this for a single regex in 2.1, use ``(?i)`` at the start.
+No.
 
-See the `PCRE man pages <http://www.pcre.org/pcre.txt>`_ for more information.
+From version 3.0 onward, Varnish normalizes the ``Host:`` header,
+so simple comparisons using case-sensitive regexes or the equality
+operator are possible again::
+
+       if (req.http.host ~ "example.com$") {
+               ...
+       }
+
+
+or even::
+
+       if (req.http.host == "www.example.com") {
+               ...
+       }
 
 
 **Why does the ``Via:`` header say 1.1 in Varnish 2.1.x?**
diff --git a/doc/sphinx/reference/vcl.rst b/doc/sphinx/reference/vcl.rst
index d986cd5..03b004a 100644
--- a/doc/sphinx/reference/vcl.rst
+++ b/doc/sphinx/reference/vcl.rst
@@ -300,10 +300,11 @@ To send flags to the PCRE engine, such as to turn on *case
 insensitivity* add the flag within parens following a question mark,
 like this:::
 
-  if (req.http.host ~ "(?i)example.com$") {
+  if (beresp.http.location ~ "(?i)http://example.com/") {
           ...
   }
 
+See the `PCRE man pages <http://www.pcre.org/pcre.txt>`_ for more information.
 
 Functions
 ---------
@@ -743,6 +744,15 @@ HTTP headers can be removed entirely using the remove keyword:::
     remove beresp.http.Set-Cookie;
   }
 
+req.http.host
+  Varnish normalizes the Host header according to the recommendations
+  in http://www.ietf.org/rfc/rfc3986.txt Section 3.2.2:
+
+  It gets converted to lower case, except for pct-encodings, which get
+  converted to upper case.
+
+
+
 Grace and saint mode
 --------------------
 
-- 
1.5.6.5

_______________________________________________
varnish-dev mailing list
[email protected]
http://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
