https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6088

           Summary: Adding one optional arg to M::S::parse allows caller to
                    pass additional info to SA
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Libraries
        AssignedTo: [email protected]
        ReportedBy: [email protected]


Here is a trivial change to Mail::SpamAssassin::parse and to
Mail::SpamAssassin::Message with possibly far-reaching opportunities.
(It is something I have wanted to do for some time, but didn't have a
good excuse until now :)  The comment to sub parse now adds:


-=item parse($message, $parse_now)
+=item parse($message, $parse_now [, $suppl_attrib])

 The optional last argument I<$suppl_attrib> provides a way for a caller
 to pass additional information about a message to SpamAssassin. It is
 either undef, or a ref to a hash where each key/value pair provides some
 supplementary attribute of the message, typically information that cannot
 be deduced from the message itself, or is hard to deduce reliably, or would
 represent unnecessary work for SpamAssassin to obtain. The argument will
 be stored to a Mail::SpamAssassin::Message object as 'suppl_attrib', thus
 made available to the rest of the code as well as to plugins. The exact list
 of attributes will evolve over time; any unknown attributes should be
 ignored. Possible examples are: SMTP envelope information, a flag indicating
 that a message as supplied by a caller was truncated due to size limit, an
 already verified list of DKIM signature objects, or perhaps a list of rule
 hits predetermined by a caller, which makes another possible way for a
 caller to provide meta information (instead of having to insert made-up
 header fields in order to pass information), or maybe just plain rule hits.
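
As a rough illustration of the caller side, here is a minimal sketch of how
a caller might truncate a large message and build the hash to pass as
$suppl_attrib. The helper name prepare_for_sa, the 512 KiB limit, and the
'truncated' key are assumptions made for this example; the patch does not
fix any attribute names.

```perl
use strict;
use warnings;

# Hypothetical size limit; pick whatever the calling program uses.
my $MAX_BYTES = 512 * 1024;

# Truncate an oversized message and record that fact in a hash that
# could be passed as the optional third argument to parse().
sub prepare_for_sa {
    my ($raw, $max_bytes) = @_;
    my %suppl_attrib;
    if (length($raw) > $max_bytes) {
        $raw = substr($raw, 0, $max_bytes);
        $suppl_attrib{truncated} = 1;    # hypothetical key name
    }
    return ($raw, \%suppl_attrib);
}

my ($text, $attr) = prepare_for_sa("x" x 600_000, $MAX_BYTES);

# With the patched parse() the caller would then do something like:
#   my $msg = $spamassassin->parse($text, 0, $attr);
printf "%d bytes passed on, truncated=%d\n",
    length($text), $attr->{truncated} ? 1 : 0;
```

A caller with nothing to report can pass undef or omit the argument, which
is what keeps existing callers working.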


The change itself is hardly controversial (it is small and retains
compatibility), but what it makes possible in the future might be, which
is why I'm opening this PR.

But first to the background story, which makes up my excuse for adding
this feature.

I noticed an increased number of spam messages slipping through because
of their size being above the maximum message size (my limit is 420 kB).
Often this is some promotional/sales material with images or PDF attached.
Other users on the mailing list I follow also started noticing and
were asking for a solution.

Knowing that a CRM114 classifier can effectively deal with spam by just
examining a portion of a mail message - by default much smaller than
the SpamAssassin (spamc) limit, it seemed the best solution is for the
caller of SpamAssassin (amavisd in my case, but could be spamc or other)
to truncate large messages and just pass the first 0.5 MB to SpamAssassin.

Having done so, it seemed like a perfect solution - SpamAssassin does
not mind some missing or truncated attachments, and does its job
very well even for truncated messages.

But alas, two days later, when examining false positives on some large
messages, I noticed that these received score points because DKIM or DK
signatures failed to validate: the mail body had been modified (truncated),
even though the signatures were actually valid.

This called for some way for a caller to tell SpamAssassin when a message
was truncated, so that DKIM/DK checking rules could be disabled for such
messages. That's how this change was born.
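
On the receiving side, a check along these lines could decide whether to
skip body-dependent signature rules. This is only a sketch: the function
name and the 'truncated' key are hypothetical, and $msg is a plain hash
standing in for a Mail::SpamAssassin::Message object. Only the
'suppl_attrib' slot name comes from the patch description.

```perl
use strict;
use warnings;

# Returns 1 when signature checks that depend on the full body should
# be skipped, 0 otherwise.
sub skip_signature_checks {
    my ($msg) = @_;
    my $attr = $msg->{suppl_attrib};
    # undef or a non-hash value means the caller supplied nothing useful
    return 0 unless ref($attr) eq 'HASH';
    # 'truncated' is a hypothetical key; unknown keys are simply ignored
    return $attr->{truncated} ? 1 : 0;
}
```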

With the mechanism for passing supplementary attributes in place, it
was a solution looking for a problem. amavisd already does DKIM/DK
verification (by calling the Mail::DKIM module, the same as SpamAssassin)
because it needs the information for its whitelisting and reputation
mechanisms, and SpamAssassin then does the Mail::DKIM verification again.
So it was a natural step to just pass the Mail::DKIM signature objects
to SpamAssassin via this new mechanism, saving the SA DKIM plugin
the trouble of doing the verification again, while also avoiding the
issue with truncated large messages altogether. A relatively small
modification to the DKIM plugin made it possible (but later evolved into
a more extensive change in unrelated areas, see Bug 6087).
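
The idea of reusing signatures already verified by the caller can be
sketched as follows. This is not the actual DKIM plugin change; the function
name, the 'dkim_signatures' key, and the verification callback are all
assumptions for illustration, and $msg is again a plain hash standing in for
the message object.

```perl
use strict;
use warnings;

# Return the list of signature objects to evaluate: the caller-supplied
# ones if present, otherwise whatever the fallback verification yields.
sub signatures_for {
    my ($msg, $verify_cb) = @_;
    my $attr = $msg->{suppl_attrib};
    if (ref($attr) eq 'HASH' && ref($attr->{dkim_signatures}) eq 'ARRAY') {
        # hypothetical key: signatures already verified by the caller
        return @{ $attr->{dkim_signatures} };
    }
    # nothing supplied, so verify locally (callback is a stand-in)
    return $verify_cb->($msg);
}
```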

So this is where I currently stand. The problem of large messages slipping
through has been solved, signatures broken by message truncation are no
longer an issue, and duplicated work in verifying signatures has been
eliminated. On the spamc/spamd side this is still mostly a solution looking
for a problem, but I'm confident it can be put to good use.


