Just wondering: would it be handy to have a new "body" rule type, the same as
"body" but matched as a single string, with all newlines converted to " "?
In other words, this text:

    I noticed we're now seeing a lot of folks using such an old perl 5.6.1
    that maybe we should update our SpamAssassin requirement to use 5.8.0
    as a bare minimum.
    
    I know that DBI requires 5.8.0 and states that while it may or may not
    build with a lesser version, you can't complain about it if you're
    using something that is over 5 yeas old!

    Heck I've still got a RH 6.2 sever online and have perl 5.8.x built
    and installed on it. 8*)


would be converted to this:

    "I noticed we're now seeing a lot of folks using such an old perl 5.6.1 
that maybe we should update our SpamAssassin requirement to use 5.8.0 as a bare 
minimum. I know that DBI requires 5.8.0 and states that while it may or may not 
build with a lesser version, you can't complain about it if you're using 
something that is over 5 yeas old! Heck I've still got a RH 6.2 sever online 
and have perl 5.8.x built and installed on it. 8*)"

i.e., no newlines, and every run of whitespace converted to a single " ".
This would be well suited to matching with phrase rules.  (To avoid
exponential-runtime .* problems, it'd chop the text after the first 8000
characters or so.)
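
For illustration only, here is a rough sketch of that conversion in Python
(not SpamAssassin's own code).  The name flatten_body is made up for this
example, and the 8000-character cutoff is just the ballpark figure mentioned
above, not a tuned value.

```python
import re

# Assumed cutoff, taken from "the first 8000 characters or so" above.
MAX_CHARS = 8000


def flatten_body(text, max_chars=MAX_CHARS):
    """Return text as one string: every run of whitespace (newlines
    included) becomes a single space, then the result is truncated."""
    flat = re.sub(r"\s+", " ", text).strip()
    return flat[:max_chars]


sample = (
    "as a bare minimum.\n"
    "\n"
    "I know that DBI requires 5.8.0 and states that while it may or may not\n"
    "build with a lesser version\n"
)
flat = flatten_body(sample)
print(flat)
```

Truncating after the whitespace has been collapsed means the cap applies to
the string the rules actually see, which is the point of the limit.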

This is based on what I've been doing with the "seek-phrases" script;
it appears it may allow us to catch some spam patterns we might otherwise
miss, from spammers exploiting our inability to use a "body" rule across
a paragraph boundary.  (Are they still doing that?)
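
To make the paragraph-boundary point concrete, here is a small Python sketch
(again, not SpamAssassin code).  It assumes, purely for illustration, that
paragraphs are separated by blank lines and that an ordinary "body" rule is
tried against each paragraph on its own; the phrase pattern is made up.

```python
import re

text = (
    "something that is over 5 yeas old!\n"
    "\n"
    "Heck I've still got a RH 6.2 sever online\n"
)

# Hypothetical phrase rule that straddles the blank line.
phrase = re.compile(r"old! Heck I've still")

# Assumed model of per-paragraph matching: split on blank lines and
# test each paragraph separately.
paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", text) if p.strip()]
per_paragraph_hit = any(phrase.search(p) for p in paragraphs)

# Proposed single-string matching: collapse all whitespace first.
single_string = " ".join(text.split())
single_string_hit = phrase.search(single_string) is not None

print(per_paragraph_hit, single_string_hit)
```

Under those assumptions the phrase is missed paragraph by paragraph but
found in the single string.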

--j.
