https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6156

--- Comment #63 from Warren Togami <[email protected]> 2009-09-28 10:17:47 PDT ---
I believe we made a mistake here with deep parsing instead of lastexternal.

While masschecks have shown us that deep parsing catches maybe an additional
20%, it does introduce some very rare FP's.  Deep parsing catches IP addresses
that posted to a Yahoo webmail interface or sent mail via a legitimate MTA.
These are FP's for reasons like sending mail from a mobile phone or mobile
broadband on an IP address that was previously used by a spammer.  There is
nothing these users can do about it.

While RCVD_IN_PSBL with deep parsing alone is not likely to flag the mail as
spam, that IP address could easily FP on a different DNSBL as well, and the
combined score could push the mail over the spam threshold.

I believe we should do the following.

1) RCVD_IN_PSBL becomes lastexternal.  It deserves a higher score because it
eliminates the above type of FP.

2) Add a separate subrule that hits with PSBL deep parsing && !RCVD_IN_PSBL.
This can add a smaller score, which limits the damage from the very rare
FP's.  Can deep parsing and lastexternal be done simultaneously without two
queries?

3) Release 3.3.0 with #1 by default.  We could add #2 too, or wait until
sa-update later.  It will be difficult to score part #2 with the GA given how
rare the FP's are.  I am not aware of any in my own 8 users' corpus at the
moment.
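
To make the split between 1) and 2) concrete, here is a rough Python sketch of
the intended logic.  It is not SpamAssassin code and says nothing about how
the DNSBL plugin actually schedules its queries.  The function name psbl_hits,
the subrule name PSBL_DEEP_ONLY, and the example addresses (taken from the
documentation ranges) are all made up for illustration.  It assumes the
untrusted relays are listed with the last external relay first.

```python
from typing import AbstractSet, Dict, Sequence


def psbl_hits(relay_ips: Sequence[str], listed: AbstractSet[str]) -> Dict[str, bool]:
    """Illustrate the proposed rule split (hypothetical names throughout).

    relay_ips: untrusted relay IPs, index 0 being the last external relay,
               i.e. the host that handed the mail to our own MX.
    listed:    IPs assumed to be listed in PSBL.
    """
    if not relay_ips:
        return {"RCVD_IN_PSBL": False, "PSBL_DEEP_ONLY": False}

    # 1) lastexternal: only the relay that connected to us is considered.
    lastexternal = relay_ips[0] in listed

    # Deep parsing: any relay in the Received chain may match.
    deep = any(ip in listed for ip in relay_ips)

    # 2) The subrule fires only when deep parsing hits but lastexternal does not.
    return {
        "RCVD_IN_PSBL": lastexternal,
        "PSBL_DEEP_ONLY": deep and not lastexternal,
    }


# A webmail user on a previously abused IP, relayed through a clean MTA:
print(psbl_hits(["198.51.100.7", "203.0.113.9"], {"203.0.113.9"}))
```

In that last case only the lower-scored subrule would fire, which is the type
of FP described above.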
