On 11/17/2015 5:20 PM, Chris Siebenmann wrote:
>I have also decided to stop using "deny" at the data ACL and instead
>either redirect to a webmaster alias or the bit bucket. I have reached
>the conclusion that denying spam or malware at data time doesn't
>accomplish anything useful. Using deny with a code at rcpt time has
>the feature of saving both internet bandwidth and server time, but
>after data, that damage is already done.
  One theoretical reason I can see to do deny-at-DATA for a small
personal server is that it potentially sends a signal to large origin
sources like GMail that something questionable is up with email from a
particular account. I can imagine GMail looking for signals on outgoing
email like an increased number of rejections.

(For a large population, deny-at-DATA has the advantage that a sender
who is the victim of a false positive at least knows about it, and in
turn this may make adding more aggressive filtering more palatable to
your users.)

        - cks

First, I need to correct a statement I made several times in this thread. The spamhaus zen filter in my server's exim.conf is applied at MAIL time, not RCPT.

Thanks for bringing up false positives, one of my favorite spam topics. I have very strong opinions against aggressive spam filtering that cheerfully tosses tons of ham to avoid delivering a pound of spam. (For the same reason, I totally refuse to use stuff like captcha or its equivalent approach on my web page contact form.)

Which is why I was perfectly willing to live with some snowshoe spam rather than use any aggressive technique that risks loss of legitimate messages.

Spamassassin does occasionally generate fairly high scores on perfectly legitimate emails that could certainly produce false positives. Which is why I am evolving toward a multilevel approach with respect to spam scoring (with Accept at every level):

1) Very low spam score: Deliver to recipient mailbox without even a spam score.
2) Low spam score: Deliver to recipient mailbox with the usual spam headers.
3) Likely spam: Re-route to a spam-dedicated mailbox for human analysis.
4) Certain spam: Accept and throw away.

Level 4 is reserved for email that fails specific test(s). For instance, the custom SA rule I described in the initial message in this thread.

Level 3 is for any "spam" that has the slightest possibility of being ham. My thinking is that the result of analysis will be custom SA rules that direct future similar emails to either level 1 or level 4.

The recipient is charged with deciding how to treat level 1 or 2 false negatives.
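
As a rough illustration of the four levels above, here is a minimal Python sketch of the routing decision. It is not Exim or SpamAssassin configuration; the threshold values, the `route` function, and the `failed_specific_test` flag are all hypothetical names chosen for the example, and real cutoffs would have to be tuned against actual mail.

```python
from enum import Enum


class Action(Enum):
    DELIVER_CLEAN = 1    # level 1: deliver, no spam headers
    DELIVER_TAGGED = 2   # level 2: deliver with the usual spam headers
    QUARANTINE = 3       # level 3: re-route to a spam mailbox for human review
    DISCARD = 4          # level 4: accept and throw away


# Hypothetical thresholds, for illustration only.
TAG_THRESHOLD = 1.0
QUARANTINE_THRESHOLD = 5.0


def route(score: float, failed_specific_test: bool = False) -> Action:
    """Pick a level for an already-accepted message.

    Level 4 is reached only through a specific test (for example a
    custom rule), never through a high score alone, so a legitimate
    message with an inflated score ends up in level 3 at worst.
    """
    if failed_specific_test:
        return Action.DISCARD
    if score >= QUARANTINE_THRESHOLD:
        return Action.QUARANTINE
    if score >= TAG_THRESHOLD:
        return Action.DELIVER_TAGGED
    return Action.DELIVER_CLEAN
```

The point of the design is that every branch accepts the message, and a high score by itself never causes a message to be thrown away.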

As to implementation, I will use this terrific exim-specific guide:
  https://github.com/Exim/exim/wiki/ExiscanExamples

SA false positive story: I receive a daily email from a newspaper which contains a dozen or so top headlines of the day. Each headline comes with a one-sentence summary, a small graphic, and a link to the story on the web version of the newspaper. I had to use an SA whitelist rule because SA was routinely giving these emails spam scores of 8 or more. (I use the SA 'whitelist_from_rcvd' rule.)

I am not much persuaded by your Gmail example. I think Gmail and other large sources have better ways of finding abusers of their service.

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
