I know body checks are to be avoided, but sometimes stuff just gets
through everything else.

One thing I have seen a lot of is URLs using percent-encoded (escaped)
characters.  For example, %20 is a space.  Only these URLs use TONS of them.

I am going to test the below body check to see what it would zap.

/http:\/\/.*(%[1234567890abcdef]{1,3}){3,}/ WARN

Basically, it is looking for a URL containing a % followed by one to three
hex characters, repeated three or more times in a row.
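
For anyone who wants to try the pattern before putting it in a live
body_checks file, here is a rough Python sketch. It is only an
approximation: Python's re module is not the regex engine Postfix uses,
the re.IGNORECASE flag assumes the table is matched case-insensitively
(the Postfix default, as far as I know), and would_warn and the sample
URLs are made up for illustration.

```python
import re

# The pattern from the body check, without the Postfix /.../ delimiters.
# re.IGNORECASE is an assumption standing in for case-insensitive
# table matching on the Postfix side.
PATTERN = re.compile(r"http://.*(%[1234567890abcdef]{1,3}){3,}", re.IGNORECASE)


def would_warn(line: str) -> bool:
    """Hypothetical helper: True if this body line would hit the WARN rule."""
    return PATTERN.search(line) is not None


samples = [
    "http://example.com/%77%77%77%2e%62%61%64",  # back-to-back escapes
    "http://example.com/x%20y%20z%20w",          # escapes separated by letters
    "http://example.com/plain/path.htm",         # no escapes at all
]

for s in samples:
    print("WARN" if would_warn(s) else "pass", s)
```

Only the first sample is flagged. The second has three escapes, but they
are separated by other characters, and the pattern only counts escapes
that sit back to back.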

My guess is that will skip over poorly structured but valid URLs, like
links to "example.com/~user/My Latest Webpage.htm" and still catch a lot
of crap we do not need or want.
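
A quick way to sanity-check that guess, again as a Python sketch rather
than a real Postfix test: encode the example path with urllib.parse.quote
and compare it against a fully escaped host name. The escape_everything
helper and the escaped host are invented for illustration, and
re.IGNORECASE is an assumption about how the table is matched.

```python
import re
from urllib.parse import quote

# Same pattern as the body check, without the Postfix /.../ delimiters.
PATTERN = re.compile(r"http://.*(%[1234567890abcdef]{1,3}){3,}", re.IGNORECASE)


def escape_everything(text: str) -> str:
    """Hypothetical helper: percent-encode every byte to mimic a heavily escaped link."""
    return "".join(f"%{b:02x}" for b in text.encode("utf-8"))


# The "poorly structured but valid" case: only the spaces get escaped.
legit = "http://example.com" + quote("/~user/My Latest Webpage.htm")

# A made-up link where every character of the host is escaped.
heavy = "http://" + escape_everything("www.example.com")

for url in (legit, heavy):
    print("WARN" if PATTERN.search(url) else "pass", url)
```

In this sketch the page with spaces in its name passes, since its two
escapes never appear three in a row, while the fully escaped link is
flagged.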

--Eric

