regex matches on url escape chars

2010-02-26 Thread Colin Nicholson
Hi,

I have a web service that checks URLs sent to it by our software and
looks up a categorisation for each URL (the product is a web filter).

The request is of the form

/checkurl/http%3A%2F%2Fwww%2Efacebook%2Ecom%3A80%2Fx%2F2613988206%2Ffalse%2Fp%5F1327924941%3D2

which is checking
http://www.facebook.com:80/x/2613988206/false/p_1327924941=2.
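
For readers who want to confirm the mapping between the two forms, here is a
minimal Python sketch. It is not part of the web service; the function name
checked_url and the PREFIX constant are made up for illustration. It strips
the /checkurl/ prefix and percent-decodes the remainder with the standard
library.

```python
from urllib.parse import unquote

# Hypothetical constant: the path prefix used by the request shown above.
PREFIX = "/checkurl/"

request_path = (
    "/checkurl/http%3A%2F%2Fwww%2Efacebook%2Ecom%3A80%2Fx%2F2613988206"
    "%2Ffalse%2Fp%5F1327924941%3D2"
)


def checked_url(path: str) -> str:
    """Return the percent-decoded URL carried after the /checkurl/ prefix."""
    if not path.startswith(PREFIX):
        raise ValueError("not a /checkurl/ request")
    return unquote(path[len(PREFIX):])


print(checked_url(request_path))
```

Note that the client escapes characters such as "." and "_" that do not need
escaping in a URL path, which is why a pattern written for the plain host
name never sees a literal dot.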


I would like to rewrite the request, in this case, to be just
/checkurl/facebook%2Ecom, but I can't get the % escape codes to match
anything.

If the escape codes weren't in use, the following code works fine:

if (req.url ~ "facebook\.com") {
   set req.url = "/checkurl/facebook.com";
}

but

if (req.url ~ "facebook\%2Ecom") {
   set req.url = "/checkurl/facebook%2Ecom";
}

does not work. I've tried escaping the %2E with a backslash, but it still
matches nothing. Unfortunately, it isn't practical to change our software
to stop using the escape codes.


Does anyone have any ideas?


Thanks,
Colin


___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc


Re: regex matches on url escape chars

2010-02-26 Thread Colin Nicholson
Hi,

Found the answer: the % sign in the VCL string was being treated as the
start of a hex escape sequence, so writing the % sign itself as its hex
code (%25) did the trick.


The working snippet is now:

if (req.url ~ "facebook%252Ecom") {
   set req.url = "/checkurl/facebook.com";
}
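
The behaviour described above can be modelled outside Varnish. The Python
sketch below is only an illustration under stated assumptions: decode_literal
is a hypothetical stand-in for whatever the VCL compiler does to %XX
sequences in a string literal, and Python's re module stands in for the regex
engine Varnish uses. It shows why the unescaped pattern fails and the %25
form succeeds.

```python
import re


def decode_literal(literal: str) -> str:
    """Assumed model: replace each %XX in a string literal with that byte."""
    return re.sub(
        r"%([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        literal,
    )


def vcl_match(url: str, literal: str) -> bool:
    """Decode the literal first, then use the result as the regex."""
    return re.search(decode_literal(literal), url) is not None


# Shortened form of the request path from the original post.
url = "/checkurl/http%3A%2F%2Fwww%2Efacebook%2Ecom%3A80%2Fx"

# "%2E" is decoded to ".", so the regex becomes facebook.com; the "." can
# consume the "%", but "com" then has to match "2Ec" and the match fails.
print(decode_literal("facebook%2Ecom"), vcl_match(url, "facebook%2Ecom"))

# "%25" is decoded to "%", leaving the regex facebook%2Ecom, which matches
# the escaped text in the request as-is.
print(decode_literal("facebook%252Ecom"), vcl_match(url, "facebook%252Ecom"))
```

If the same decoding applies to every string literal (not confirmed in this
thread), a rewrite target that must keep a literal %2E would presumably need
to be written with %252E as well.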




Colin

