> I think there may still be a meta bug in the bugzilla... worth
> checking it for ideas.

All I could find was:
  https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4917
but it is empty and closed. Some ideas can be found as enhancement
requests in the bugzilla. Here are some others that come to mind:

- 'a bugathlon': there are many bugs open, and some of these are rather
  small things to fix. Some may even be just forgotten and already
  fixed. It would be nice to go systematically through the list, doing
  some triage, and fix the more straightforward ones.

- the M::SA::Message::Metadata::Received::parse_received_line looks
  like one big ad-hoc mess of exceptions. I've dreamed that writing a
  general (but permissive) parser of the syntax as prescribed in
  RFC 2821 could cover 2/3 of the cases, leaving only the remaining
  exceptions to be dealt with individually.

- there is basic IPv6 support in SA, but it seems there are several
  corner cases where IPv6 addresses are not recognized or supported.
  Likely (just guessing) in RBL lookups, in Received header field
  parsing, in some DNS lookups in plugins, in querying for AAAA in
  addition to A, in .ip6.arpa for reverse queries, and maybe in
  spamc/spamd. It would be nice to go systematically across features,
  checking or fixing their IPv6 support.

- my personal pet peeve: cleanly separating checking of a message from
  score generation and from reporting. This would make it possible
  (when using SA at the MTA level) to run a multi-recipient message
  through the checks once, then produce a per-recipient score and/or
  report for each recipient without having to re-run the rules. Most
  rules are already compatible with this: checking could just collect
  the set of rule names that fire, and assigning and summing up scores
  could be done as a separate step. The details still to be worked out
  are excluding rules which have a zero score for all recipients of a
  message, short-circuiting, and per-recipient bayes.
  Some stats indicate that a message has 1.5 recipients on average,
  which means that checking once per message instead of once per
  recipient would save about a third of the checking time (1 run
  instead of 1.5) almost for free when running in the MTA integration
  mode, while still preserving many per-recipient features.

- dealing with arbitrary-size mail messages: the rules and plugins
  which need it could have access to the complete message kept in a
  file (e.g. checking DKIM signatures, processing of large attached
  pictures or documents, ...), while the rest can continue to work
  with an in-memory copy, truncated to a manageable size if necessary.
  The spamc could for example pass a file name to spamd (when both are
  running on the same host), instead of having to feed the mail
  contents through a pipe/socket.

Mark
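
To make the Received parsing idea above a bit more concrete, here is a
minimal sketch of a permissive, keyword-driven parser loosely following
the RFC 2821 Stamp layout (from / by / via / with / id / for, then ";"
and the date). It is written in Python purely for illustration (SA
itself is Perl), the function name parse_received is hypothetical, and
it makes simplifying assumptions: no nested comments, and no ";" inside
a comment.

```python
import re

# Clause keywords of an RFC 2821 Stamp, in the order the grammar lists them.
KEYWORDS = ("from", "by", "via", "with", "id", "for")

# A token is either a (non-nested) parenthesized comment or a run of
# characters without whitespace or parentheses.
TOKEN = re.compile(r"\((?:[^()\\]|\\.)*\)|[^\s()]+")


def parse_received(value: str) -> dict:
    """Split a Received header field body into its clauses.

    Permissive on purpose: clauses may be missing or out of order,
    and anything that is not understood is simply left out.
    """
    stamp, sep, date = value.rpartition(";")
    if not sep:                      # no ";" at all: treat everything as stamp
        stamp, date = value, ""

    fields = {}
    current = None
    for tok in TOKEN.findall(stamp):
        word = tok.lower()
        if word in KEYWORDS and word not in fields:
            current = word           # start of a new clause
            fields[current] = []
        elif current is not None:
            fields[current].append(tok)   # value or comment of the clause

    result = {key: " ".join(parts) for key, parts in fields.items()}
    result["date"] = date.strip()
    return result


example = ("from mail.example.org (mail.example.org [192.0.2.1]) "
           "by mx.example.net (Postfix) with ESMTP id 4F3A21C0 "
           "for <user@example.net>; Tue, 1 Jan 2008 12:00:00 +0100")
parsed = parse_received(example)
```

The lines that such a generic pass cannot handle would then fall
through to the existing special cases.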
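
On the IPv6 point above, the reverse-query names are a good example of
what needs checking: an IPv6 address is reversed nibble by nibble (32
hex digits) under ip6.arpa, and DNS-based lists use the same
nibble-reversed form in front of the list zone. A small Python sketch,
for illustration only; the helper names and the zone dnsbl.example are
made up and are not SA code.

```python
import ipaddress


def reversed_labels(ip) -> str:
    """Return the address as reversed DNS labels (octets or nibbles)."""
    if ip.version == 4:
        # IPv4: reverse the four decimal octets
        return ".".join(reversed(str(ip).split(".")))
    # IPv6: expand to 32 hex nibbles, then reverse them one by one
    nibbles = ip.exploded.replace(":", "")
    return ".".join(reversed(nibbles))


def ptr_name(addr: str) -> str:
    """Name to query for a PTR record (reverse DNS)."""
    ip = ipaddress.ip_address(addr)
    suffix = "in-addr.arpa" if ip.version == 4 else "ip6.arpa"
    return f"{reversed_labels(ip)}.{suffix}"


def dnsbl_name(addr: str, zone: str) -> str:
    """Name to query in a DNS-based list; 'zone' is a placeholder here."""
    ip = ipaddress.ip_address(addr)
    return f"{reversed_labels(ip)}.{zone}"


v4_query = dnsbl_name("192.0.2.1", "dnsbl.example")
v6_query = dnsbl_name("2001:db8::1", "dnsbl.example")
v6_ptr = ptr_name("2001:db8::1")
```

Any code path that assumes four dotted octets when building such names
is a candidate for the IPv6 review.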
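
The check/score separation above can be sketched roughly as follows.
This is Python for illustration only, not SA code: the two rules, the
score tables and all function names are invented, and short-circuiting
and per-recipient bayes are left out.

```python
from typing import Dict, Set, Tuple


def check_message(message: str) -> Set[str]:
    """Expensive step, run once per message: collect names of rules that fire.

    The two rules below are toy stand-ins, not real SA rules.
    """
    hits = set()
    if "viagra" in message.lower():
        hits.add("TOY_DRUG_WORD")
    if message.isupper():
        hits.add("TOY_ALL_CAPS")
    return hits


def score_for(hits: Set[str], scores: Dict[str, float]) -> float:
    """Cheap step, run once per recipient: sum that recipient's scores."""
    return sum(scores.get(rule, 0.0) for rule in hits)


def scan(message: str,
         recipients: Dict[str, Tuple[Dict[str, float], float]]
         ) -> Dict[str, Tuple[float, bool]]:
    """Check once, then score per recipient.

    'recipients' maps an address to (score table, spam threshold).
    """
    hits = check_message(message)            # rules are evaluated only here
    verdicts = {}
    for rcpt, (scores, threshold) in recipients.items():
        total = score_for(hits, scores)
        verdicts[rcpt] = (total, total >= threshold)
    return verdicts


verdicts = scan(
    "BUY VIAGRA NOW",
    {
        "alice@example.net": ({"TOY_DRUG_WORD": 3.0, "TOY_ALL_CAPS": 2.5}, 5.0),
        "bob@example.net": ({"TOY_DRUG_WORD": 1.0, "TOY_ALL_CAPS": 0.0}, 5.0),
    },
)
```

Here the same set of rule hits yields a spam verdict for one recipient
and a ham verdict for the other, without evaluating any rule twice.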
