Just wondering: would it be handy to have a new "body" rule type, the same as
"body" but matched against the whole text as a single string, with all
newlines converted to " "? In other words, this text:
I noticed we're now seeing a lot of folks using such an old perl 5.6.1
that maybe we should update our SpamAssassin requirement to use 5.8.0
as a bare minimum.
I know that DBI requires 5.8.0 and states that while it may or may not
build with a lesser version, you can't complain about it if you're
using something that is over 5 yeas old!
Heck I've still got a RH 6.2 sever online and have perl 5.8.x built
and installed on it. 8*)
would be converted to this:
"I noticed we're now seeing a lot of folks using such an old perl 5.6.1
that maybe we should update our SpamAssassin requirement to use 5.8.0 as a bare
minimum. I know that DBI requires 5.8.0 and states that while it may or may not
build with a lesser version, you can't complain about it if you're using
something that is over 5 yeas old! Heck I've still got a RH 6.2 sever online
and have perl 5.8.x built and installed on it. 8*)"
i.e., no newlines, with all whitespace converted to " ". This would be ideal
for matching with phrase rules. (To avoid exponential-runtime .*
problems, it'd chop the text after the first 8000 characters or so.)
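
For illustration, here is a rough sketch of that conversion in Python. It is
not SpamAssassin code: the function name flatten_body and the MAX_CHARS
constant are made up for this example, and it assumes that a run of
whitespace (such as the blank line between paragraphs) collapses to a single
" ", as in the converted text above.

```python
import re

# Assumed cutoff, taken from the "first 8000 characters or so" idea above.
MAX_CHARS = 8000


def flatten_body(text: str, max_chars: int = MAX_CHARS) -> str:
    """Return text as one line: whitespace runs become " ", then truncate.

    Hypothetical helper for illustration only.
    """
    # \s+ matches newlines, tabs and spaces, so paragraph breaks vanish too.
    flat = re.sub(r"\s+", " ", text).strip()
    # Cap the length so a rule using .* cannot scan an unbounded string.
    return flat[:max_chars]


if __name__ == "__main__":
    sample = (
        "that maybe we should update our requirement to use 5.8.0\n"
        "as a bare minimum.\n"
        "\n"
        "I know that DBI requires 5.8.0"
    )
    flat = flatten_body(sample)
    # A phrase that spans the paragraph boundary now matches in one string.
    print(re.search(r"bare minimum\. I know", flat) is not None)  # True
```

Truncating after collapsing whitespace means the limit applies to the text
the rule actually sees; cutting the raw text first would be an equally
reasonable choice.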
This is based on what I've been doing with the "seek-phrases" script;
it looks like it may let us catch some spam patterns we would otherwise
miss, where spammers exploit our inability to match a "body" rule across
a paragraph boundary. (Are they still doing that?)
--j.