On Tue, Oct 23, 2012 at 11:02:37PM +0200, Axb wrote:
> On 10/23/2012 10:48 PM, John Hardin wrote:
> >On Tue, 23 Oct 2012, Kevin A. McGrail wrote:
> >
> >>My thoughts were to ignore any binary attachments.
> >
> >I don't think that's justified. I'm beginning to see a resurgence of
> >image spams that the OCR plugin would probably catch. Plus I fairly
> >regularly see 419 spams with the body of the pitch in a PDF or MS Word
> >document attachment.
>
> SA never scanned binary attachements and the chunk method wouldn't
> change that, just apply rules to content for which it was not
> designed for.

Just as a reminder for everyone:

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6582

The problem here is SA stumbling onto large masses of data that it believes to be "text", and thus running all body rules etc. on it. If we fix that or limit its impact, there's no reason to have any kind of silly "skip or chop large message" kludges.

You'd only want to skip large messages completely if you are very low on resources and can't spare the CPU or a few megs of memory to keep the parsed message blobs in memory. OK, there probably isn't much spam in the 10MB range anymore, so you might skip that.
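
To make the "limit the impact" idea concrete, here is a rough Python sketch. It is not SpamAssassin code and not a proposed patch: the function name `text_for_body_rules` and the `MAX_TEXT_BYTES` cap are made up for illustration, and it uses Python's standard `email` module rather than SA's own parser. The point is only that each part claiming to be text gets capped on its own, instead of the whole message being skipped or chopped.

```python
from email import message_from_bytes

# Hypothetical per-part cap, chosen for illustration; not an SA setting.
MAX_TEXT_BYTES = 50 * 1024


def text_for_body_rules(raw: bytes, limit: int = MAX_TEXT_BYTES) -> list[str]:
    """Return each text/* part of a message, capped at `limit` bytes."""
    msg = message_from_bytes(raw)
    chunks = []
    for part in msg.walk():
        # Only parts declared as text are candidates for body rules.
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)  # bytes, or None for containers
        if payload is None:
            continue
        capped = payload[:limit]  # cap before decoding, so the cost stays bounded
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(capped.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label: fall back to a decoding that cannot fail.
            chunks.append(capped.decode("latin-1"))
    return chunks
```

With something along these lines, a 10MB "text" part costs no more to scan than a 50KB one, while the rest of the message is still parsed and scored as usual.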
