On Wed, Mar 17, 2004 at 08:18:18AM -0600, Bob Apthorpe wrote:
> Hi,
>
> On Wed, 17 Mar 2004 11:19:44 +0000 Mat Harris <[EMAIL PROTECTED]> wrote:
>
> > On Wed, Mar 17, 2004 at 04:04:58 -0600, David B Funk wrote:
> > > Would somebody please mass-check the following rule set
> > > and let me know if there's any collateral damage?
> > > I whiped them up to deal with a new flavor of spam that I'm
> > > seeing more of these days.
> > >
> > >
> > > rawbody L_FAKE_HREF /\w\whref=http:/i
> > > describe L_FAKE_HREF Faked href to hide spammer URLs
> > > score L_FAKE_HREF 1.0
> > >
> >
> > i am probably just seeing things and being stupid, but what is
> > invalid about the above href?
>
> \w matches [a-zA-Z0-9_] so /\w\whref=http:/i matches 'href=http:'
> preceded by two characters that are neither punctuation or whitespace.
> Meaning 'zzhref=http:' matches, but '<a href=http:' doesn't.
>
Sorry, I did miss it. For some reason, I thought \w meant \s.
I see what this is all about now :)
cheers
> See `perldoc perlre` for details.
>
> Hrm. Does it hurt to change
>
> /\w\whref=http:/i
>
> to
>
> /\w\whref="?https?:/i
>
> or even
>
> /\w\whref="?[a-z]{4,8}:/i
>
> ?
>
> -- Bob
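
For anyone wanting to poke at the three patterns quoted above without
running a full mass-check, here is a rough sketch. It is only an
approximation: it uses Python's re module as a stand-in for the Perl
regex engine that SpamAssassin uses, with re.ASCII so that \w stays
limited to [a-zA-Z0-9_] as described above. The pattern names and the
sample lines are hypothetical, made up for illustration, and are not
taken from real mail.

```python
import re

# Python stand-ins for the three patterns in the thread (assumption: the
# Perl and Python engines agree on these simple constructs). re.ASCII
# keeps \w to [a-zA-Z0-9_]; re.IGNORECASE mirrors the /i modifier.
FLAGS = re.IGNORECASE | re.ASCII
PATTERNS = {
    "original": re.compile(r'\w\whref=http:', FLAGS),
    "quoted_https": re.compile(r'\w\whref="?https?:', FLAGS),
    "any_scheme": re.compile(r'\w\whref="?[a-z]{4,8}:', FLAGS),
}

# Hypothetical sample lines, invented for this sketch.
SAMPLES = [
    '<a zzhref=http://example.com/>',
    '<a href=http://example.com/>',
    '<a zzhref="https://example.com/">',
    '<a zzhref=mailto:someone@example.com>',
]


def matching_patterns(text):
    """Return the names of the patterns that match somewhere in text."""
    return [name for name, rx in PATTERNS.items() if rx.search(text)]


for sample in SAMPLES:
    print(sample, '->', matching_patterns(sample))
```

On these samples, none of the three variants fires on the ordinary
'<a href=http:' form, since the space before 'href' is not a \w
character. The wider variants additionally catch the quoted, https and
mailto forms, so they are the ones to watch for collateral damage in a
real mass-check.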
--
Cats land on their feet.
Toast lands jellyside down.
A cat glued to some jelly toast will hover in quantum indecision
perl -e'$_=q#: 13_2: 12/o{>: 8_4) (_4: 6/2^-2; 3;-2^\2: 5/7\_/\7: \
12m m::#;y#:#\n#;s#(\D)(\d+)#$1x$2#ge;print'
Yes, of course it's the right cabl [le0: NO CARRIER]
