On 09/28/2009 01:44 AM, Warren Togami wrote:

Use RCVD_IN_PSBL_2WEEKS to assign a score. RCVD_IN_PSBL_DEEP would be the
equivalent of RCVD_IN_PSBL_2WEEKS. The stricter RCVD_IN_PSBL would be a
subrule that matches only with last-external, eliminating most of the
already minuscule chance of false positives.
Thus the full score of RCVD_IN_PSBL_2WEEKS would be split into two parts.

Before
RCVD_IN_PSBL_2WEEKS score 2
This rule does deep parsing which is often good, but sometimes bad.

After
RCVD_IN_PSBL score 2
This rule matches only last-external, making it safer from FP's.
RCVD_IN_PSBL_DEEP score -1
This rule can be scored separately, subtracting a tiny amount if the
PSBL hit was found in deep parsing. Both rules would trigger, one adds,
the second subtracts. The subtracting rule would never fire on its own.

OK, the above "subtract" probably needs some explanation.

This came from a feeling of discomfort with deep parsing of PSBL. PSBL *is* working well in masscheck with deep parsing, with very few FP's. The trouble is that these FP's are cases like sending an e-mail from a wireless broadband card via a legitimate mail server, where the card's IP is legitimately blacklisted. Even though this alone is not likely to cause their mail to be classified as spam with the default threshold of 5, there is nothing the user can do about the previous user of that IP having sent spam.

For this reason I think we should have used psbl-lastexternal. psbl-lastexternal is extra certain to be correct and deserves a high score. [1] Deep parsing, however, has been shown to be mostly correct and probably deserves a smaller score in cases where psbl-lastexternal didn't hit. Can spamassassin do separate sub-rule matches of lastexternal and deep parsing without querying twice?
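
To make the intended arithmetic concrete, here is a rough Python sketch of one possible reading of the split score. It is not SpamAssassin code or configuration. The function name psbl_score, the list-of-booleans input, and the values 2.0 and 1.0 (the latter being 2 minus the 1 subtracted in the "After" example) are assumptions made for illustration only.

```python
# Illustrative only: models one reading of the proposed split score.
# The values are borrowed from the "After" example and are NOT actual
# SpamAssassin defaults.
SCORE_LAST_EXTERNAL = 2.0  # hit on the last-external relay: full score
SCORE_DEEP_ONLY = 1.0      # hit only on a deeper relay: 2 - 1


def psbl_score(relay_listed):
    """Return the PSBL contribution to a message's score.

    relay_listed is a hypothetical list of booleans, one per relay,
    where index 0 is the last-external relay and the remaining
    entries are the deeper Received hops.
    """
    if not relay_listed:
        return 0.0
    if relay_listed[0]:
        # The last-external relay is listed: most certain case.
        return SCORE_LAST_EXTERNAL
    if any(relay_listed[1:]):
        # Only a deeper relay is listed: smaller score.
        return SCORE_DEEP_ONLY
    return 0.0


if __name__ == "__main__":
    print(psbl_score([True, False]))   # last-external hit
    print(psbl_score([False, True]))   # deep-only hit
    print(psbl_score([False, False]))  # no hit
```

Under this reading, the wireless broadband case above (a listed client IP behind a legitimate mail server) would receive only the smaller deep-only score.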

[1] We are still hitting some Yahoo FP's because filtering out Yahoo from the blacklist was broken until a few days ago. These should disappear entirely by the two week timeout.

Warren Togami
[email protected]
