On Mon, Mar 23, 2009 at 18:47, Mark Martinec <[email protected]> wrote:
>> I think there may still be a meta bug in the bugzilla... worth
>> checking it for ideas.
>
> All I could find was:
>   https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4917
> but it is empty and closed.
Found it. Follow the "depends on" links from
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4560

> Some ideas can be found as enhancement requests in the bugzilla.
>
> Here are some others that come to mind:
>
> - 'a bugathlon': there are many bugs open, and some of these are
>   rather small things to fix. Some may even be just forgotten and
>   already fixed. It would be nice to go systematically through the
>   list, doing some triage, and fix the more straightforward ones.
>
> - the M::SA::Message::Metadata::Received::parse_received_line
>   looks like one big ad-hoc mess of exceptions. I'd dreamed that
>   making a general (but permissive) parser of the syntax as
>   prescribed in RFC 2821 could cover 2/3 of the cases, then
>   dealing with the remaining exceptions.
>
> - there is a basic IPv6 support in SA, but it seems like there are
>   several corner cases where IPv6 addresses are not recognized or
>   supported. Likely (just guessing) in RBL lookups, in Received header
>   field parsing, some DNS lookups in plugins, querying for AAAA in
>   addition to A, and in .ip6.arpa for reverse queries, maybe in
>   spamc/spamd. It would be nice to go systematically across features,
>   checking or fixing their IPv6 support.
>
> - my personal pet peeve: cleanly separating checking of a message
>   from score generation and from reporting. This would make it possible
>   (when using SA at a MTA level) to run a multi-recipient message
>   through checks once, then produce a per-recipient score and/or
>   per-recipient report individually for each recipient without having
>   to re-run the rules. Most rules are already compatible with this:
>   checking could just collect the set of rule names that fire, and
>   assigning and summing up scores could be done as a separate step.
>   Missing details are excluding rules which have zero score for all
>   recipients of a message, short-circuiting, per-recipient bayes.
>   Some stats indicate that a message has 1.5 recipients on the average,
>   which means saving 50% of time almost for free when running in the
>   MTA integration mode, while still preserving many per-recipient features.
>
> - dealing with arbitrary size mail messages: the rules and plugins
>   which need it could have access to a complete message kept on a file
>   (like checking DKIM signatures, processing of large attached pictures
>   or documents, ...), while the rest can continue to work with an
>   in-memory copy, but truncated to a manageable size if necessary.
>   The spamc could for example pass a file name to spamd (when both
>   are running on the same host), instead of having to feed mail contents
>   through a pipe/socket.
>
> Mark
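To make the check/score split a bit more concrete, here is a rough sketch of how I read the idea. It is in Python rather than Perl purely for illustration and is not SA code: the rule names, scores, tests and the Recipient class are all made up. The rules run once per message and only yield the set of names that fired; scoring is a separate per-recipient step, and rules that score zero for every recipient are skipped up front.

```python
from dataclasses import dataclass, field

# Hypothetical rule names, default scores and tests (not real SA rules).
DEFAULT_SCORES = {"HYP_ALL_CAPS": 2.0, "HYP_FREE_MONEY": 3.5, "HYP_HAS_URL": 0.5}
TESTS = {
    "HYP_ALL_CAPS": lambda msg: msg.isupper(),
    "HYP_FREE_MONEY": lambda msg: "free money" in msg.lower(),
    "HYP_HAS_URL": lambda msg: "http://" in msg.lower(),
}


@dataclass
class Recipient:
    """Per-recipient preferences (assumed shape, for illustration only)."""
    address: str
    score_overrides: dict = field(default_factory=dict)
    required_score: float = 5.0


def effective_score(rule, rcpt):
    """Score of a rule for one recipient: override first, then default."""
    return rcpt.score_overrides.get(rule, DEFAULT_SCORES.get(rule, 0.0))


def rules_to_run(recipients):
    """Drop rules whose score is zero for every recipient of the message."""
    return [
        rule for rule in TESTS
        if any(effective_score(rule, rc) != 0.0 for rc in recipients)
    ]


def check_message(message, rules):
    """Checking step: run each rule once, collect the names that fire."""
    return {rule for rule in rules if TESTS[rule](message)}


def score_for(hits, rcpt):
    """Scoring step: sum per-recipient scores over the shared set of hits."""
    total = sum(effective_score(rule, rcpt) for rule in hits)
    return total, total >= rcpt.required_score


def scan(message, recipients):
    """One pass over the rules, then one cheap scoring pass per recipient."""
    hits = check_message(message, rules_to_run(recipients))
    return {rc.address: score_for(hits, rc) for rc in recipients}
```

Short-circuiting and per-recipient bayes are deliberately left out, since those are exactly the missing details Mark lists. On the numbers: at 1.5 recipients per message, scanning per message rather than per recipient cuts the rule runs from 1.5 to 1, which is about a third fewer; put the other way round, the per-recipient approach does 50% more work.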

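On the IPv6 item, the reverse-query and RBL cases come down to how the query name is built: dotted octets reversed for IPv4, 32 reversed hex nibbles for IPv6. A small sketch, again in Python for illustration only and not taken from SA; the function names and the zone "dnsbl.example" are made up, and the nibble format for DNSBL lookups is the convention as I understand it from RFC 5782.

```python
import ipaddress


def reverse_labels(addr):
    """Return the reversed DNS labels for an IPv4 or IPv6 address string."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        # IPv4: the four decimal octets, in reverse order.
        return list(reversed(str(ip).split(".")))
    # IPv6: expand to all 32 hex nibbles, drop the colons, reverse them.
    return list(reversed(ip.exploded.replace(":", "")))


def ptr_name(addr):
    """Name to query for a PTR record (in-addr.arpa or ip6.arpa)."""
    ip = ipaddress.ip_address(addr)
    suffix = "in-addr.arpa" if ip.version == 4 else "ip6.arpa"
    return ".".join(reverse_labels(addr) + [suffix])


def dnsbl_query_name(addr, zone):
    """Name to query in a DNS blocklist zone (zone name is hypothetical)."""
    return ".".join(reverse_labels(addr) + [zone])
```

For example, dnsbl_query_name("192.0.2.1", "dnsbl.example") gives "1.2.0.192.dnsbl.example". Code that assumes four dotted octets at this step is one place where I would expect IPv6 addresses to fall through unnoticed.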