Hi Beni,

A few things to digest here.

What was leading me up this path was a bit of elementary (and probably naïve) 
white-listing with respect to the contents of the Host header and the URI/L 
supplied by the user. Tools like Fiddler make request manipulation trivial so 
filtering out 'obvious' manipulation attempts would be a good idea. With this 
in mind my thinking (if it can be considered as such) was that:

(1) user request is for http://www.example.com/whatever
(2) Host header is www.example.com
(3) All is good! Pass request on to server.

Alternatively:

(1) user request is for http://www.example.com/whatever
(2) Host header is www.whatever.com
(3) All is NOT good! Flick request somewhere harmless.
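
A rough, untested sketch of how the two cases above might be written as
HAProxy ACLs. It only white-lists one fixed host name rather than comparing
the two values with each other. The backend names "www" and "harmless" are
placeholders, and it assumes that "url" sees the request line exactly as the
client sent it, so it only begins with http:// for proxy-style requests:

   acl host_ok  hdr(host) -i www.example.com
   acl url_abs  url_beg   -i http://
   acl url_ok   url_beg   -i http://www.example.com/

   # Host header is not on the white-list: flick the request somewhere harmless
   use_backend harmless if !host_ok
   # absolute URI that names some other host: same treatment
   use_backend harmless if url_abs !url_ok
   # everything else is passed on to the real servers
   default_backend www

Note that hdr(host) is an exact string match here, so a Host header carrying
a port (www.example.com:80) would also end up in "harmless".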

I'm not sure whether your solution supports this, and if your interpretation is 
correct maybe HAProxy doesn't support it either.

I'll do some more experimenting and I hope I don't lock myself out ;-)

Cheers
Andrew

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Benedikt 
Fraunhofer
Sent: Wednesday, 28 April 2010 7:42 PM
To: Andrew Commons
Cc: [email protected]
Subject: Re: Matching URLs at layer 7

Hi Andrew,

2010/4/28 Andrew Commons <[email protected]>:

> url_beg <string>
>  Returns true when the URL begins with one of the strings. This can be used to
>  check whether a URL begins with a slash or with a protocol scheme.
>
> So I'm assuming that "protocol scheme" means http:// or ftp:// or whatever....

I would assume that, too..
but :) reading the other matching options it looks like those only
affect the "anchoring" of the matching. Like

> url_ip <ip_address>
>  Applies to the IP address specified in the absolute URI in an HTTP request.
>  It can be used to prevent access to certain resources such as local network.
>  It is useful with option "http_proxy".

yep. but watch this "http_proxy"


> url_port <integer>
>  "http_proxy". Note that if the port is not specified in the request, port 80
>  is assumed.

same here.. This enables plain proxy mode where requests are issued
(from the client) like

 GET http://www.example.com/importantFile.txt HTTP/1.0
.

> This seems to be reinforced (I think!) by:
>
> url_dom <string>
>  Returns true when one of the strings is found isolated or delimited with dots
>  in the URL. This is used to perform domain name matching without the risk of
>  wrong match due to colliding prefixes. See also "url_sub".

I personally don't think so.. I guess this is just another version of
"anchoring", here
"\.$STRING\."

> If I'm suffering from a bit of 'brain fade' here just set me on the right 
> road :-) If the url_ criteria have different interpretations in terms of what 
> the 'url' is then let's find out what these are!

I currently can't give it a try as I finally managed to lock myself out, but

http://haproxy.1wt.eu/download/1.4/doc/configuration.txt

has an example that looks exactly like what you need:
-------------------
To select a different backend for requests to static contents on the "www" site
and to every request on the "img", "video", "download" and "ftp" hosts :

   acl url_static  path_beg         /static /images /img /css
   acl url_static  path_end         .gif .png .jpg .css .js
   acl host_www    hdr_beg(host) -i www
   acl host_static hdr_beg(host) -i img. video. download. ftp.

   # now use backend "static" for all static-only hosts, and for static urls
   # of host "www". Use backend "www" for the rest.
   use_backend static if host_static or host_www url_static
   use_backend www    if host_www

-------------------

and as "begin" really means anchoring the match with "^", as in a regex,
this would mean that there's no host in the url: otherwise the meaning
of "begin" would have to be redefined, which should not be done :)

So you should be fine with

       acl xxx_host     hdr(Host)      -i xxx.example.com
       acl xxx_url      url_beg /
       #there's already a predefined acl doing this.
       use_backend xxx         if xxx_host xxx_url

if I recall your example correctly.. But you should really put
something more specific behind the url_beg for it to be of any use :)
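
For instance, something along these lines (untested; the /app/ prefix and
the acl name xxx_app are made up purely for illustration):

       acl xxx_host     hdr(Host)      -i xxx.example.com
       acl xxx_app      url_beg        /app/
       use_backend xxx         if xxx_host xxx_app

With a normal (non-proxy) request the url starts at the path, so this would
only match requests for xxx.example.com whose path begins with /app/.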

Just my 2 cents

 Beni.

