Hey -- just to turn the tables for a bit ;), I've recently been considering a problem and a possible solution, and could do with SpamAssassin users' advice.

These days, I've been forced to use SBL/XBL as an upfront anti-spam check, rejecting spam at RCPT TO: time during the SMTP transaction. (Previously I'd been running it from SpamAssassin in the usual manner.)

That's great, and it works well, rejecting a *lot* of spam and saving a lot of CPU time by not running SpamAssassin. ;)

However: it's important for SpamAssassin developers and mass-checkers to get a "representative" feed of spam -- with all kinds of spam included -- so that the rules are measured against something close to reality.

This, unfortunately, implies that discarding mails that hit SBL/XBL is a bad thing, since those mails won't get into the mass-checked corpora -- and what will be mass-checked from that point on is just the 25% of spam that evades those rules.

Bug 5096 suggests that we replace some of the mass-check corpora with pure-spamtrap feeds to fix this. Bit of a heavy fix :(

There's another way, though. If it were possible to change the SMTP transaction flowchart to include this:

  - is IP listed in SBL/XBL?
  - if not listed, deliver as normal;
  - else if listed, continue SMTP transaction as if normal delivery is underway, but deliver to a spamtrap mbox file or maildir.

Then we could avoid the delivery overhead of that spam -- the only "delivery" is an append to a file or write to a maildir -- while still recording the spam in question.

It appears Postfix would allow this using http://www.policyd-weight.org/ -- look up DNSBLs, give it a "weight", if weight is too high, add a header. Postfix rules can then intercept messages with that header and divert to the spamtrap mbox.

Has anyone done this? Got code you'd like to share?

(In the meantime, I'm just going back to removing the BL, using SpamAssassin instead, and using the Shortcircuit plugin to reduce CPU load if RCVD_IN_SBL or RCVD_IN_XBL fires.)

--j.
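
For anyone wanting to experiment with the flow described above, here is a rough Python sketch of the decision itself: build the DNSBL query name, check whether it resolves, and append listed mail to a spamtrap mbox instead of delivering it. It is not a Postfix or policyd-weight configuration. The function names, the injectable `resolve` argument, and the `sbl-xbl.spamhaus.org` zone are assumptions made for illustration; substitute whichever zone you actually query.

```python
import mailbox
import socket


def dnsbl_query_name(ip, zone="sbl-xbl.spamhaus.org"):
    """Build the DNSBL lookup name: reversed IPv4 octets, then the zone.

    The default zone is an assumption for this sketch.
    """
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("this sketch handles IPv4 addresses only")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone="sbl-xbl.spamhaus.org", resolve=socket.gethostbyname):
    """Return True if the DNSBL answers with an A record for this IP.

    Any lookup failure (including a DNS outage) counts as "not listed",
    so mail is delivered normally when the blocklist is unreachable.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


def choose_destination(ip, listed=is_listed):
    """Pick 'spamtrap' for listed client IPs, 'normal' otherwise."""
    return "spamtrap" if listed(ip) else "normal"


def append_to_spamtrap(raw_message, path):
    """Append one raw message to an mbox file, creating it if needed."""
    box = mailbox.mbox(path)
    box.lock()
    try:
        box.add(raw_message)
    finally:
        # close() flushes pending changes and releases the lock.
        box.close()
```

In a real setup the SMTP server would still accept the message as usual; only the final delivery target changes, which is what keeps the listed spam available for mass-checks.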