Are statistics that important? My setup uses only two RBLs directly in sendmail to reject connections: dynablock and opm. This stops the zombies and home-made spam delivery.
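For anyone unfamiliar with what the connection-time RBL check described above amounts to, here is a rough Python sketch of a generic DNSBL lookup. It is not my sendmail configuration: the zone names are hypothetical placeholders, and the function names (dnsbl_query_name, is_listed) are made up for illustration. The only thing it relies on is the usual DNSBL convention of reversing the IPv4 octets and prepending them to the list's zone; an address counts as listed if that name resolves.

```python
import ipaddress
import socket

# Hypothetical zone names, used only for illustration; substitute the
# real zones of whatever lists you actually query.
ZONES = ["dynablock.example.org", "opm.example.org"]


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on invalid input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the query name resolves, i.e. the address is listed.

    Any resolution failure (including a temporary DNS error) is treated
    as "not listed" in this sketch, so a DNS outage fails open.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


def should_reject(ip: str, resolve=socket.gethostbyname) -> bool:
    """Reject the connection if any configured zone lists the address."""
    return any(is_listed(ip, zone, resolve) for zone in ZONES)
```

The resolve parameter is only there so the logic can be tried without touching the network; in real use the default resolver is what does the lookup.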
Then I use greylisting to block other "fake" SMTP servers. Then I use SpamAssassin (through MailScanner) with several other RBLs in the usual SA scoring. Greylisting is disabled on a few spamtraps so that SA gets some "pure" spam to feed its Bayes database.

From time to time, when SA is quiet (mostly on weekends or at night), I disable greylisting and the dynablock/opm RBLs in sendmail. I've not seen big changes in the FN/FP rates when opening the valves like that.

I'm not sure it is that important to let spam get in. SA scoring is based on a huge corpus and gives good results. Bayes auto-adapts to the spam and ham that get through whatever "pre-filters" one may set (like RBLs or greylisting). If there are big changes in these pre-filters, Bayes will need some new spam/ham messages to adapt again, but it will adapt.

For me, it doesn't seem that bad not to process all kinds of spam on a given SA setup... but I may be wrong!

Best,
Christian