On 09/08/2010 07:30 AM, p...@blu-studio.com wrote:
Using GNU Regular Expressions, I need to examine a URL like those below and check the size key and value. I need to capture and block all URLs where 'size does not equal 10'. In other words, "size=12" is not acceptable.
...
Around size, the other key and value pairs may be present, absent, or in a different order, and the domain and directory path combination may be different too.

Any good regexps for this?

The following regex pattern works, though you might need to tweak its syntax for your particular parser; I'm not sure this is strictly "GNU Regular Expression" syntax. For example, \y for a word boundary may not be correct for your tool (gawk accepts \y, while GNU grep and sed, like PCRE, spell it \b), but as long as your parser supports word boundaries you can substitute its own spelling. I'd like to hear from you whether it works or not.

/http:\/\/[^?]+\?.*\ysize=10\y/

that is:

/
    opening delimiter
http:\/\/
    literal "http://"
[^?]+
    one or more of anything other than a literal "?"
\?
    a literal "?"
.*
    zero or more of any character
\y
    a word boundary
size=10
    literal "size=10"
\y
    a word boundary
/
    closing delimiter

If a URL matches this regex, it's an http URI whose query string has the variable "size" equal to "10". Since you want to block everything else, block any URL that does not match. If you also need it to match https URIs, use this:
/https?:\/\/[^?]+\?.*\ysize=10\y/
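
To turn that into the "block everything else" check from the original question, a rough sketch in Python might look like the following. The function name should_block and the example URLs are hypothetical, and \b again stands in for \y; adapt the boundary escape to whatever your parser accepts.

```python
import re

# The https? variant from above, with \b standing in for \y (an assumption).
SIZE_10 = re.compile(r"https?://[^?]+\?.*\bsize=10\b")


def should_block(url: str) -> bool:
    """Return True when the URL does not carry size=10 in its query string."""
    return SIZE_10.search(url) is None


# Hypothetical URLs, for illustration only.
for url in (
    "https://example.com/a/b?size=10",
    "http://example.org/list?page=2&size=10&sort=asc",
    "http://example.com/a/b?size=12",
    "http://example.com/a/b?page=2",
):
    print(url, "->", "block" if should_block(url) else "allow")
```

Two caveats with this sketch: a URL with no size key at all is blocked too, and because "-" and "." are not word characters, a word boundary alone still lets "font-size=10" or "size=10.5" through. If that matters, anchoring on the "?" or "&" before the key and the "&" or end of string after the value would be stricter.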


--
Allen Shaw
TwoMiceAndAStrawberry.com

"Excellence in Web software development and design"

al...@twomiceandastrawberry.com
Phone: (903)361-7429
Fax:   (253)276-8711
http://www.TwoMiceAndAStrawberry.com

_______________________________________________
New York PHP Users Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk

http://www.nyphp.org/Show-Participation
