On Tue, 2012-12-04 at 07:02 -0500, David F. Skoll wrote:
> On Tue, 04 Dec 2012 11:12:54 +0100
> "Andrzej A. Filip" <andrzej.fi...@gmail.com> wrote:
>
> > Have you tried/considered scoring based on "headers only"?
>

Does anybody have statistics on the type and number of components in messages that exceed the scan size limit? What about information on how the various components contribute to the score?

It occurs to me that, if we knew these stats, it could be fairly simple for spamc to selectively remove the parts that don't contribute to the score and keep only a fragment of those that do. For example, all you need from an image are its MIME headers (because we have useful rules that compare the content type with the file name) and, possibly, the image's header bytes (for rules that compare the file name with its content). Spamc would then send the shortened message to spamd for scanning, receive the SA headers back, and insert them in the original message (which it must retain) before passing it on.

Is there an ANSI C MIME encode/decode library that could be bolted onto spamc? I can't find one, though there are a number of OSS C++ libraries, so spamc might need rewriting to use them.

Martin
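
To make the shortening idea above concrete, here is a rough sketch in Python rather than the C that spamc would actually need. It only illustrates the two steps: cut each image part down to its MIME headers plus a few leading bytes, then copy the scan headers from the reply into the retained original. The function names, the 64-byte cutoff and the choice to copy every "X-Spam-" header are my assumptions for illustration; none of this is existing spamc behaviour.

```python
import base64
import email

# Assumption: 64 leading bytes are enough for rules that check file "magic".
HEAD_BYTES = 64


def shorten_for_scan(raw: bytes, head_bytes: int = HEAD_BYTES) -> bytes:
    """Return a copy of the message with every image body cut to its first bytes.

    The MIME headers of each part (content type, file name) are left untouched.
    """
    msg = email.message_from_bytes(raw)
    for part in msg.walk():
        if part.get_content_maintype() != "image":
            continue
        data = part.get_payload(decode=True) or b""
        # Re-encode the retained fragment as base64 and label it accordingly.
        part.set_payload(base64.encodebytes(data[:head_bytes]).decode("ascii"))
        del part["Content-Transfer-Encoding"]
        part["Content-Transfer-Encoding"] = "base64"
    return msg.as_bytes()


def merge_scan_headers(original: bytes, scanned: bytes) -> bytes:
    """Copy the X-Spam-* headers of the scanned reply into the original message."""
    src = email.message_from_bytes(scanned)
    dst = email.message_from_bytes(original)
    # Drop any stale scan headers first so the reply's values win.
    for name in [n for n in dst.keys() if n.lower().startswith("x-spam-")]:
        del dst[name]
    for name, value in src.items():
        if name.lower().startswith("x-spam-"):
            dst[name] = value
    return dst.as_bytes()
```

Note that this re-serialises the original message through the parser, which a real implementation would probably want to avoid by inserting the new header lines into the untouched original bytes instead.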