Robert Menschel wrote:
<snipped a little for brevity>
OK, based on what little discussion there's been so far, here's a draft proposal for people to think about.
Summary: A group of volunteers will maintain a collected/distributed whitelist, using SpamAssassin's whitelist_from_rcvd capabilities, similar to (but in the opposite direction as) William Stearns' collected/distributed blacklist at http://www.stearns.org/sa-blacklist/sa-blacklist.current.cf
Don't forget about the new whitelist_from_spf capability that should be in the next major release.
I think we'd like to get away from whitelist_from_rcvd if possible for domains that appear to have sensible SPF records (records that actually list their hosts and don't use ?all, etc.). There's no point in keeping up with the addition/removal of a domain's hosts if we don't have to (and with lots of domains added, host changes will come up far more often than they do for the current whitelist_from_rcvd domains). It's also somewhat unfair to restrict a domain to its current mail hosts due to concerns about 'their SpamAssassin score' if we don't have to.
For info on whitelist_from_spf (and def_whitelist_from_spf) implementation see bug 3487.
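As a rough illustration of the difference (example.com is a placeholder, and the SPF line assumes the directive takes the same address globs as whitelist_from, which may not match what finally ships from bug 3487):

```
# Current approach: the entry is tied to the rDNS domain of the sending relay,
# so it has to be kept in step with the domain's mail hosts.
def_whitelist_from_rcvd  *@example.com  example.com

# SPF approach (assumed syntax): the domain's own SPF record decides which
# hosts may send, so the entry never needs to change when hosts do.
def_whitelist_from_spf   *@example.com
```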
Assumption: This activity will focus only on public newsletters, services, etc., which normally do not contain any private information. Therefore there will not be any privacy or confidentiality concerns for the great majority of emails from these sources.
What about emails from banks, etc.? I'd think they'd be good candidates for something you want to whitelist based on their received headers or SPF.
Distribution: The rules file which results from this activity will be maintained within the SARE system, as file 70_sare_whitelist.cf -- it can be downloaded manually or via RDJ.
Rules: Since these rules are gathered by the community at large, rather than using the whitelist_from_rcvd rule which normally scores -100, we will use the def_whitelist_from_rcvd rule, which scores only -15. Any site that wishes can copy any of these rules to a whitelist_from_rcvd rule to gain the full -100 points.
While it doesn't really matter how it's done, I'd suggest that a user just set the score for def_whitelist_from_rcvd to -100, or whatever they want, if -15 isn't enough (though I don't think I've seen a case where it isn't).
def_whitelist_from_spf currently doesn't assign the full -15 unless the 'From:' header matches the envelope sender (see bug 3487), and would have to be scored appropriately if desired.
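A minimal sketch of what that override might look like in a site's local.cf, assuming the stock rule name USER_IN_DEF_WHITELIST is what def_whitelist_from_rcvd entries trigger (example.com is a placeholder, not a real submission):

```
# Entry as it might appear in 70_sare_whitelist.cf
def_whitelist_from_rcvd  *@example.com  example.com

# Local override: raise the default -15 to the full whitelist weight
score USER_IN_DEF_WHITELIST -100
```

Note that this changes the score for every def_whitelist_from_rcvd entry, including the ones that ship with SA, not just the SARE ones.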
[Question to the devs: would you agree this is a valid use for the def_whitelist_from_rcvd rule?]
I don't see a problem with it, although we may want to create an alias for it, such as sare_whitelist_from_(rcvd|spf), to prevent confusion between entries included in the distribution and those available from SARE.
Any submission which already matches a def_whitelist_from_rcvd rule within the SA distribution will be identified and ignored (after response back to the submitter). (We are not going to try to develop pre-3.0.0 files.)
There aren't too many of them so it shouldn't be a problem. I'd suggest listing those domains (along with new domains as added) on a web page at your site, along with the info on how to submit new domains.
Can anyone think of any guidelines to be added or changed?
You might want to set up a web form for submissions. You could use it (well, the script behind it & a database) to automatically filter out duplicate submissions -- but still tally submissions for a domain.
Then again I don't know how many submissions you are expecting, so that may be overkill.
Daryl