You may also want to allow optional whitespace in there to avoid a trivial bypass. The spammer could also add a typeface or other attributes to the <font> tag, which would slip past your simple rule. HTML tag and attribute names are not case-sensitive, so the match should be case-insensitive. And avoid * on complex subpatterns when matching arbitrarily long text; it can lead to runaway backtracking and scan timeouts.
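
To make those four points concrete, here is a rough sketch in Python rather than in SpamAssassin rule syntax. The original rule is not quoted in this thread, so the pattern below is an assumption: it guesses that the rule looks for a <font> tag with size 0. The name ZERO_SIZE_FONT and the 200-character bound are made up for illustration.

```python
import re

# Hypothetical pattern; the real rule under discussion is not shown.
# - \s* tolerates optional whitespace around "<", "=" and the value
# - [^>]{0,200} lets other attributes (face, color, ...) appear before or
#   after size, with a bounded length instead of an open-ended "*"
# - re.IGNORECASE covers <FONT SIZE=0> as well as <font size=0>
# - the lookahead stops "size=05" from counting as size 0
ZERO_SIZE_FONT = re.compile(
    r"<\s*font\b[^>]{0,200}?\bsize\s*=\s*[\"']?0[\"']?(?=[\s>])[^>]{0,200}>",
    re.IGNORECASE,
)


def has_zero_size_font(html: str) -> bool:
    """Return True if the text contains a <font> tag whose size is 0."""
    return ZERO_SIZE_FONT.search(html) is not None
```

The bounded {0,200} quantifiers are the main design choice: they cap how far the regex engine can wander inside a single tag, which is the point of the warning about * on long texts. The same idea should carry over to a Perl regex in a rawbody rule, but that translation is untested here.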

Thanks. This spammer is prolific, but seems to be very stupid and pattern-based, hardly ever varying what he puts in some parts of the message. I've been seeing this pattern without change for about 3 months now. I almost never have to tweak a rule for his stuff to account for a possible variation.

It would be interesting (at least to me) to run a set of test rules against the SA corpus to try to determine the optimal cutoff length of 0-point text for a good S/O. I have absolutely no idea what a "reasonable" size is for 0-point text in an email. Personally I'd be inclined to say that any 0-point text isn't reasonable, but mass marketers seem to believe otherwise.
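
For what it's worth, the comparison could be sketched like this once the lengths have been measured. This is only an illustration: it assumes each message in a hand-labelled corpus has already been reduced to a (length of 0-point text, is_spam) pair, and it computes the raw fraction of hits that are spam. SpamAssassin's own tooling may weight S/O differently, for example by corpus size. The function name s_o_by_cutoff is made up.

```python
def s_o_by_cutoff(samples, cutoffs):
    """Raw spam/overall ratio for each candidate length cutoff.

    samples: iterable of (zero_point_text_length, is_spam) pairs
             (assumed to come from a hand-labelled corpus).
    cutoffs: candidate minimum lengths; a message "hits" a cutoff when
             its 0-point text is at least that long.
    Returns {cutoff: ratio}, with None where no message hits.
    """
    samples = list(samples)
    results = {}
    for cutoff in cutoffs:
        spam_hits = sum(1 for length, is_spam in samples
                        if length >= cutoff and is_spam)
        ham_hits = sum(1 for length, is_spam in samples
                       if length >= cutoff and not is_spam)
        total = spam_hits + ham_hits
        results[cutoff] = spam_hits / total if total else None
    return results
```

Running it over a range of cutoffs would show where the ratio stops improving, which is about as close to a "reasonable" size as the data can get you.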
