Babu.N writes:
> http://wiki.apache.org/spamassassin/OutOfMemoryProblems
> 
> This link suggests that one should skip sending large emails to 
> SpamAssassin (for better performance). It states that "Tests show 
> that larger messages are overwhelmingly likely to be non-spam, given 
> the economics of spamming". If spammers use botnets to pump spam, is 
> this statement still valid ?

Yes, pretty much; sending large messages still costs botnet senders,
since it greatly reduces the rate at which they can emit spam.  (They
care a lot about that.)

The exception is Japanese-language spam targeted at recipients in Japan,
which tends to be pretty bulky -- I would guess due to the great consumer
broadband situation over there.
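
To put rough numbers on the rate argument above, here is a
back-of-the-envelope sketch. The 1 Mbit/s uplink and the 5 KB / 500 KB
message sizes are assumptions chosen for illustration, not measurements,
and SMTP overhead is ignored:

```python
def messages_per_second(uplink_bits_per_sec, message_bytes):
    """Upper bound on send rate for one bot, ignoring protocol overhead."""
    return uplink_bits_per_sec / 8 / message_bytes

UPLINK = 1_000_000  # assumed: 1 Mbit/s of upstream bandwidth per bot

small = messages_per_second(UPLINK, 5_000)    # assumed 5 KB spam
large = messages_per_second(UPLINK, 500_000)  # assumed 500 KB spam

print(small)          # 25.0 messages/sec
print(large)          # 0.25 messages/sec
print(small / large)  # 100.0, i.e. 100x fewer messages per bot
```

Whatever the real figures are, the ratio is what matters: padding a
message by a factor of N cuts the number of messages a bot can send by
roughly the same factor.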

> In case of botnet spamming, spammers may send large emails (as it is 
> the network of the botnet which is used, but not the spammer), with 
> top most portion of the email containing spam message & rest of the 
> email having some bulk to sizeup the email.
> 
> Is it not better if SA takes any-size email & attempts scanning on 
> only the top-most portion (say initial 500KB) of the email content 
> (as it may not make sense for spammers to keep their advertisement in 
> later portions of the email) ?

As Mark says, it would make sense to have a way for SpamAssassin to
deal more sensibly with large mails.

However, it's worth noting that your idea fails in the face of HTML
messages -- it's trivial for a spammer to generate an HTML message
along these lines:

    From: spammer
    Subject: hi
    Content-Type: text/html

    <div style="display: none">
    [2MB of innocent-looking text]
    </div>
    [spam payload here]


Which then renders as:

    From: spammer
    Subject: hi

    [spam payload here]


i.e. the 2MB of innocent-looking text is silently hidden, serving only to
mislead naive filters. There are plenty more ways to do this using
JavaScript and CSS, and probably MIME multipart/alternative tricks too.
It gets complex very fast...
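
A minimal sketch of the problem, using only Python's standard library.
This is not how SpamAssassin works; the `VisibleText` class, the
`visible_text` helper, the payload string and the padding are all made
up for illustration, and only the inline `display: none` case is
handled:

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting.
VOID = {"br", "hr", "img", "input", "meta", "link"}

class VisibleText(HTMLParser):
    """Collect text that is not inside an element styled display:none."""

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0   # > 0 while inside a hidden element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        if self.hidden_depth or "display:none" in style:
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth:
            self.chunks.append(data)

def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())

SCAN_LIMIT = 500 * 1024  # the "initial 500KB" suggested above

padding = "innocent-looking text. " * 100_000  # roughly 2MB of filler
message = '<div style="display: none">' + padding + "</div>" + "BUY PILLS NOW"

head = message[:SCAN_LIMIT]        # what a truncating filter would see
print("BUY PILLS NOW" in head)     # False: the payload is past the limit
print(visible_text(message))       # BUY PILLS NOW: what the reader sees
```

The truncating scan sees nothing but filler, while the rendered message
is nothing but payload. Catching even this one variant already needs an
HTML parser over the whole message, and it does not cover external
stylesheets, scripts, or MIME tricks.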

--j.
