[EMAIL PROTECTED] (Justin Mason) writes:

> The first fix is truncation of the text before passing to TextCat.
> Michael, I think you were looking at this?  the results are impressive,
> if the text is truncated to 32k bytes:
>
> before: SZ = 49336 RSS = 45584 by end of "spamassassin -t" run
> after:       44252       41396
>
> 5 megs dropped straight away.

We should look at whether the XS version fixes this.  Definitely no
problem truncating once (a) the sample is large enough or (b) the
verdict is already high-confidence.

> however: 100 URLs is pretty low.  it's worth noting these are the *first*
> 100 URLs found in the message, but still -- there may be a way a spammer
> could overload this and get past SpamAssassin by loading up 100 URLs
> before their payload URL.

We need better logic to ignore fluff URLs.

Daniel

-- 
Daniel Quinlan                     ApacheCon! 13-17 November (3 SpamAssassin
http://www.pathname.com/~quinlan/  http://www.apachecon.com/  sessions & more)
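
For illustration only, here is a minimal sketch of the truncation step discussed above: cap the text at 32k bytes before handing it to a language guesser. It is written in Python and is not SpamAssassin's actual code. The names truncate_for_langid and MAX_LANGID_BYTES are hypothetical, and the sketch assumes the message text is available as a decoded string.

```python
# Hypothetical sketch: cap the text passed to a language guesser.
# Not SpamAssassin code; names and defaults are assumptions.

MAX_LANGID_BYTES = 32 * 1024  # the 32k figure quoted in the thread


def truncate_for_langid(text: str, max_bytes: int = MAX_LANGID_BYTES) -> str:
    """Return a prefix of text that is at most max_bytes long as UTF-8."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    # A cut at a raw byte offset can split a multi-byte character;
    # errors="ignore" drops the dangling partial sequence at the end.
    return raw[:max_bytes].decode("utf-8", errors="ignore")
```

Cutting on bytes rather than characters matches how the limit is stated in the thread. The decode step only guards against ending on half a character.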
