Andras Korn wrote:
> I don't agree. These days, there are RBLs that will automatically list and
> delist IPs in the space of a few hours, well within the lifetime of a single
> email message.

If a server is being added and removed from RBLs in the space of a few 
hours, its behavior must be just on the border between "legitimate" and 
"spammer".  In that case, I would think the administrator would want to 
know about it by receiving a few complaints from users whose messages 
were being bounced.

You started this thread with a complaint that temporary rejections were 
needlessly consuming your server resources by causing the remote server 
to retry deliveries multiple times.  I guarantee that making the RBL 
filter return temporary rejection codes would waste considerably more 
resources for everyone, as RBLs are much more common and more widely 
used than RHSBLs.

> rblsmtpd also uses temporary rejects, fwiw.

Well, most of the major email providers (AOL, Yahoo!, Gmail, Hotmail, 
etc.) use permanent rejections for RBL matches.

> Temporary rejects also give the administrator a chance to whitelist an IP
> they do want to receive mail from (such as when it turns out that your new
> business partner's ISP just got blacklisted by an RBL).

The administrator would have to be carefully watching the outbound queue 
to notice a message was being held, then investigate the logs to find 
out why.  I can't envision this happening unless the server is new and 
the administrator is testing to make sure everything is working.

> Currently, what happens is this (IIRC):
> 
> 1. client 1.2.3.4 connects.
> 2. spamdyke checks rdns, RBLs, blacklists and whitelists, rejects message if
>    necessary.
> 3. client issues HELO/EHLO.
> 4. spamdyke checks DNS, rejects message if necessary.
> 5. spamdyke forwards HELO to qmail.
> 6. client issues MAIL FROM.
> 7. spamdyke checks DNS, RHSBLs, blacklists and whitelists, rejects message
>    if necessary.
> 8. spamdyke forwards MAIL FROM to qmail.
> 9. client issues RCPT TO.
> 10. spamdyke consults localdomains, blacklists, whitelists, relay access and
>     whatnot; rejects recipient if necessary.
> 11. spamdyke forwards recipient to qmail.
> 12. repeat 9-11 until client issues DATA.
> 13. spamdyke forwards DATA to qmail.
> 14. actual message is transferred.
> 
> What I suggest is to skip all DNS based tests until just before step 13. If
> qmail accepted none of the recipients (including the case where it didn't
> even get to see them because they were filtered by spamdyke), there is
> nothing to do and we saved some slow DNS queries.
> 
> If some recipients were accepted, spamdyke does the DNS lookups and if they
> indicate that the message should be rejected, it sends an appropriate 45x or
> 55x response to the DATA command of the client. Instead of DATA, it sends
> QUIT to qmail.

I understand now.
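
To check my reading of the proposal, here is a minimal sketch of the 
decision made when the client sends DATA.  This is not spamdyke code: 
the function name, the run_dns_tests callback and the reply texts are 
all made up for illustration.

```python
from typing import Callable, List


def reply_to_data(accepted_recipients: List[str],
                  run_dns_tests: Callable[[], str]) -> str:
    """Pick the reply to the client's DATA command under the deferred scheme.

    run_dns_tests is a hypothetical callback standing in for the rDNS, RBL
    and RHSBL lookups; it returns "ok", "temp" or "perm".
    """
    if not accepted_recipients:
        # Nothing to deliver, so the slow DNS queries are skipped entirely.
        return "503 no valid recipients"
    verdict = run_dns_tests()
    if verdict == "perm":
        # Permanent rejection; qmail would be sent QUIT instead of DATA.
        return "554 rejected by DNS-based filter"
    if verdict == "temp":
        # Temporary rejection; qmail would be sent QUIT instead of DATA.
        return "451 rejected by DNS-based filter, try again later"
    # Tests passed: only now would DATA be forwarded to qmail.
    return "354 go ahead"
```

The point of the sketch is that the DNS callback is never invoked when 
the recipient list is empty, which is where the proposed savings would 
come from.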

What you're describing would make spamdyke more efficient only for users 
who have modified/replaced their qmail-smtpd to support blacklists or 
other filters.  Most qmail servers run a stock version of qmail-smtpd, 
which will only reject recipients for relaying.  Since spamdyke already 
blocks relaying itself, qmail never gets to issue rejections for those 
cases.

On a stock qmail installation, this change would make spamdyke _less_ 
efficient, since it would keep qmail running for all connections, at 
least until the DATA command is given.  The current code, by contrast, 
closes qmail as soon as possible to free up resources.  "As soon as possible" 
depends on the configured filters -- the possibility of SMTP AUTH and 
the use of sender whitelists require qmail to continue running until 
"MAIL FROM" is seen.  The use of recipient whitelists requires qmail to 
continue running until "RCPT TO" is seen.  But if spamdyke is configured 
to do graylisting, some RBLs, some rDNS tests and SMTP AUTH (a typical 
setup), qmail will be closed as soon as the "MAIL FROM" command is given.
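
The rule in the paragraph above can be written down roughly like this. 
It is a simplified sketch, not the actual spamdyke logic: the function 
and parameter names are hypothetical, and the "CONNECT" fallback for the 
case where none of these features is enabled is my assumption.

```python
def earliest_close_stage(smtp_auth: bool,
                         sender_whitelist: bool,
                         recipient_whitelist: bool) -> str:
    """Return the earliest SMTP stage at which qmail could be closed."""
    if recipient_whitelist:
        # A whitelisted recipient can only be recognized at RCPT TO.
        return "RCPT TO"
    if smtp_auth or sender_whitelist:
        # Authentication or a whitelisted sender may still appear
        # before or at MAIL FROM, so qmail has to stay up until then.
        return "MAIL FROM"
    # Assumption: with none of the above, nothing forces qmail to stay.
    return "CONNECT"
```

For the typical setup mentioned above (SMTP AUTH enabled, no recipient 
whitelist), this returns "MAIL FROM".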

I suspect we're debating fractional efficiencies here anyway -- I've 
never benchmarked either scenario.  I've also found that the most 
efficient scenario is often counterintuitive (meaning my initial 
hypotheses are often wrong).  For example, I never thought that reading 
spamdyke's configuration from a file would be faster than reading the 
command line but my testing showed that it is (apparently my file parser 
is more efficient than glibc's getopt() function).

If you can think of a way to test the efficiency/cost of the two 
approaches, I would be very interested to see the results.
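
As a starting point, something like the following could tally the two 
quantities we are trading against each other: DNS test runs, and 
connections where qmail has to stay alive until DATA.  It only counts 
events and measures nothing.  The Session fields are hypothetical and 
would have to be filled in from real logs, and it assumes every client 
that is not rejected goes on to send DATA.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Session:
    dns_rejects: bool         # would the DNS-based tests reject this client?
    recipients_accepted: int  # recipients that survive the non-DNS checks


def tally(sessions: Iterable[Session], deferred: bool) -> Tuple[int, int]:
    """Return (DNS test runs, sessions where qmail runs until DATA)."""
    dns_runs = 0
    qmail_until_data = 0
    for s in sessions:
        if deferred:
            # Proposed scheme: qmail stays up until DATA for everyone,
            # but DNS tests run only if some recipient was accepted.
            qmail_until_data += 1
            if s.recipients_accepted > 0:
                dns_runs += 1
        else:
            # Current scheme: DNS tests run for every connection and
            # qmail is closed early when they reject the client.
            dns_runs += 1
            if not s.dns_rejects:
                qmail_until_data += 1
    return dns_runs, qmail_until_data
```

Turning the counts into an actual cost would still require measuring 
how expensive a DNS test run is compared to keeping a qmail process 
alive, which is the part I have never benchmarked.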

> Ps. not that it matters, but by buffering the list of recipients there is
> still a way out in the situation you described (which doesn't however arise
> in the scheme I am suggesting): just kill the qmail-smtpd child and spawn
> another, but don't give it that particular recipient address. I'm only
> including this footnote because this approach could conceivably be useful in
> other situations.

I've considered doing this and there are some compelling arguments for 
it.  It would mean a semi-major overhaul of spamdyke, however, since 
right now nothing is buffered.  If I took it further and buffered the 
message data, spamdyke would be able to limit message size, strip 
attachments, run virus scanners or do other interesting things.  I may 
tackle it in a future version but probably not any time soon.
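
For the recipient-buffering part, the replay to a freshly spawned 
qmail-smtpd child might look something like this.  It is only a sketch 
of the idea from the footnote: nothing like it exists in spamdyke today, 
and the function and its parameters are hypothetical.

```python
from typing import List


def replay_commands(helo_name: str, sender: str,
                    recipients: List[str], excluded: str) -> List[str]:
    """Build the SMTP commands to replay to a new child process,
    leaving out the one recipient that should not be passed along."""
    commands = [f"HELO {helo_name}", f"MAIL FROM:<{sender}>"]
    commands += [f"RCPT TO:<{r}>" for r in recipients if r != excluded]
    return commands
```

This only works if the HELO name, sender and recipients have been kept 
in memory, which is exactly the buffering that is missing right now.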

-- Sam Clippinger
_______________________________________________
spamdyke-users mailing list
[email protected]
http://www.spamdyke.org/mailman/listinfo/spamdyke-users
