Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-15 Thread Sam Clippinger
Probably not, at least not in the next version.  spamdyke's DNS system 
sends its queries simultaneously and accepts the first positive response 
it receives.  The rest are discarded.  In order to log them all, 
spamdyke would have to wait for all responses to come back, which would 
slow down the DNS system quite a bit (it would make spamdyke only as 
fast as the slowest RBL instead of the fastest).
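
The trade-off can be illustrated with a small simulation. This is only a sketch, not spamdyke's code: the RBL names, latencies and the `query` helper are made up, and `asyncio.sleep` stands in for a real DNS lookup. `first_positive` returns as soon as any list answers "listed", while `all_positive` has to wait for the slowest list.

```python
import asyncio

# Hypothetical RBLs: name -> (simulated latency in seconds, listed?)
SIMULATED_RBLS = {
    "fast.rbl.example": (0.01, True),
    "medium.rbl.example": (0.03, False),
    "slow.rbl.example": (0.08, True),
}

async def query(rbl):
    """Pretend to query one RBL; returns (name, listed)."""
    delay, listed = SIMULATED_RBLS[rbl]
    await asyncio.sleep(delay)
    return rbl, listed

async def first_positive(rbls):
    """Return the first RBL that reports a listing and cancel the rest."""
    pending = {asyncio.ensure_future(query(r)) for r in rbls}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                name, listed = task.result()
                if listed:
                    return name
        return None
    finally:
        for task in pending:
            task.cancel()  # remaining answers are discarded

async def all_positive(rbls):
    """Wait for every answer, so total time is that of the slowest RBL."""
    results = await asyncio.gather(*(query(r) for r in rbls))
    return [name for name, listed in results if listed]

if __name__ == "__main__":
    names = list(SIMULATED_RBLS)
    print(asyncio.run(first_positive(names)))
    print(asyncio.run(all_positive(names)))
```

With these made-up latencies the first function finishes after roughly 0.01 seconds and the second after roughly 0.08 seconds, which is the cost of logging every match.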

If you're just wanting to evaluate the RBLs you're using, I think you 
could probably do that more effectively from a daily script that 
analyzes the mail log, requeries RBLs and generates statistics.
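
A rough sketch of such a script follows. It assumes RBL rejections are logged as DENIED_RBL_MATCH lines carrying an `origin_ip:` field, in the same shape as the DENIED_RHSBL_MATCH line quoted elsewhere in this thread; the zone names are placeholders and the function names are invented for illustration. It relies on the usual DNSBL convention that a listed address resolves under the reversed-octet name and an unlisted one does not.

```python
import re
import socket
from collections import Counter

# Assumed log shape: "DENIED_RBL_MATCH ... origin_ip: 1.2.3.4 ..."
ORIGIN_IP = re.compile(r"DENIED_RBL_MATCH.*?origin_ip: (\d+\.\d+\.\d+\.\d+)")

def dnsbl_name(ip, zone):
    """Build the DNSBL query name: reversed octets followed by the zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone

def is_listed(ip, zone, resolve=socket.gethostbyname):
    """True if the query name resolves, False if the lookup fails."""
    try:
        resolve(dnsbl_name(ip, zone))
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False

def rbl_stats(log_lines, zones, resolve=socket.gethostbyname):
    """Requery every rejected IP against every zone and count the hits."""
    hits = Counter()
    for line in log_lines:
        match = ORIGIN_IP.search(line)
        if match is None:
            continue
        for zone in zones:
            if is_listed(match.group(1), zone, resolve):
                hits[zone] += 1
    return hits
```

Run once a day over the previous day's log, this gives per-RBL hit counts without slowing down live SMTP sessions. The `resolve` parameter exists so the lookup can be replaced in tests.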

-- Sam Clippinger

Eric Shubert wrote:
 I like having specific RBLs logged. I just installed spamdyke on a few
 qmail-toasters yesterday (replacing rblsmtpd), and was going to ask about
 this. Michael beat me to it! ;)
 
 If simultaneous queries are being done, can all RBLs that match be logged?
 Perhaps a comma-separated list within parentheses. This would make it
 possible to gather stats on the effectiveness of the RBLs being used.
 
___
spamdyke-users mailing list
spamdyke-users@spamdyke.org
http://www.spamdyke.org/mailman/listinfo/spamdyke-users


Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-14 Thread Sam Clippinger
Andras Korn wrote:
 On Sun, Apr 13, 2008 at 02:55:16PM -0500, Sam Clippinger wrote:
 Most qmail servers run a stock version of qmail-smtpd, which will only
 reject recipients for relaying.
 
 They shouldn't, as stock qmail is liable to cause backscatter. No
 self-respecting admin should run qmail as distributed by DJB today. Mail to
 bogus recipients must not be accepted and then bounced later. Some mechanism
 must be in place to ensure that mail to bogus recipients is never accepted
 at all.

I agree.  But regardless of what /should/ be the case, the fact is that 
most qmail servers run the stock version of qmail-smtpd.  I can't 
justify making a change that will make spamdyke less efficient for the 
majority and only slightly more efficient for the minority.

 I think you should always wait for RCPT TO, even if it's not necessary for
 whitelist decisions, because then you can log whose mail you're rejecting.
 rblsmtpd's inability to do this is one of its major shortcomings.

spamdyke does always wait for RCPT, so that it can log the recipient. 
But it does not keep qmail running the entire time, if there is no 
chance the message will be accepted.  If a recipient whitelist is not in 
use, there's no reason to wait.

 I suspect we're debating fractional efficiencies here anyway -- I've 
 never benchmarked either scenario.
 
 Well, fwiw, I just ran a quick test: querying the handful of RBLs I have
 configured in parallel takes about 10 seconds (as long as it takes for the
 slowest of them to reply). Rejecting a mail based on the local list of valid
 recipients takes a good deal less than a second.

Of course local file accesses are faster than DNS queries.  That's not 
what I meant.

To find real numbers, you would have to consider how many connections 
are accepted, how many are rejected and for what reasons.  Then look at 
the popularity of different spamdyke features and specifically the 
popularity of different DNS RBLs.  Use all that to find out what 
percentage of rejected connections could avoid the DNS queries due to 
local tests.  Lastly, find a way to evaluate the real cost (wall time, 
server load and network load) of spamdyke's DNS queries versus the 
additional load generated by passing the extra SMTP traffic to qmail.

That last step is the part I don't know how to measure.

If all of those numbers were available, my instinct says the advantage 
of your proposed change would be very small at best.

-- Sam Clippinger


Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-14 Thread Sam Clippinger
Yes, this is certainly possible.  Right now spamdyke identifies the RBL 
in its message to the remote server but not in the logs.  Good idea!

What would be a good way to log this information (preferably without 
breaking existing scripts)?  I'm thinking as I type here, but spamdyke 
already follows the rejection reason with parentheses (when the log 
level is high enough) to indicate which file/line matched for file-based 
filters... perhaps the same could be done for RBLs/RHSBLs.  Something 
like this:
DENIED_RBL_MATCH(rbl.example.com)
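
To show why this format should not break existing scripts, here is a small parsing sketch. It assumes the proposed form above (a DENIED_ reason optionally followed by a parenthesized detail); the function name and regular expression are illustrative and not part of spamdyke.

```python
import re

# A reason code, optionally followed by "(detail)".
REASON = re.compile(r"\b(DENIED_[A-Z_]+)(?:\(([^)]*)\))?")

def parse_reason(line):
    """Return (reason, detail) for a rejection line, or None otherwise.

    detail is None when the log line has no parenthesized part, so lines
    written at a lower log level still parse.
    """
    match = REASON.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A script that only looks at the reason code keeps working, and one that wants the RBL name reads the second element.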

As for reordering the RBLs to put the often-matched ones first, the next 
version of spamdyke will make that less necessary.  By default, it will 
query all RBLs simultaneously, regardless of their order.  (That 
behavior can be prevented with a new flag -- ordering would be important 
in that case.)

-- Sam Clippinger

Michael Colvin wrote:
 
 To find real numbers, you would have to consider how many 
 connections are accepted, how many are rejected and for what 
 reasons.  Then look at the popularity of different spamdyke 
 features and specifically the popularity of different DNS 
 RBLs.  Use all that to find out what percentage of rejected 
 connections could avoid the DNS queries due to local tests.  
 
 Along those lines, is it possible, or can it be possible, to have spamdyke's
 logs indicate which DNS RBL caused a message to be rejected?  I'm assuming
 that once a reason for rejection is found, i.e., the IP is listed in a
 particular RBL, further tests against other RBLs in the list are not
 performed?  Knowing, statistically, which ones have a higher rejection rate,
 and queuing those first in the list of RBLs might save some time.
 
 Of course, multiple RBLs could reject the same message, and the one first in
 line would have the higher percentage, but this would give us a way to move
 them around and check the results...
 
 Just a thought from a newbie to spamdyke. 
 
 BTW, I LOVE Spamdyke!  What a difference it has made in my system's ability
 to filter spam and save resources!  It's a godsend!
 
 Mike
 
 
 


Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-14 Thread Andras Korn
On Mon, Apr 14, 2008 at 09:40:51AM -0500, Sam Clippinger wrote:

 Andras Korn wrote:
  On Sun, Apr 13, 2008 at 02:55:16PM -0500, Sam Clippinger wrote:
  Most qmail servers run a stock version of qmail-smtpd, which will only
  reject recipients for relaying.
  
  They shouldn't, as stock qmail is liable to cause backscatter. No
  self-respecting admin should run qmail as distributed by DJB today. Mail to
  bogus recipients must not be accepted and then bounced later. Some mechanism
  must be in place to ensure that mail to bogus recipients is never accepted
  at all.
 
 I agree.  But regardless of what /should/ be the case, the fact is that 
 most qmail servers run the stock version of qmail-smtpd.  I can't 
 justify making a change that will make spamdyke less efficient for the 
 majority and only slightly more efficient for the minority.

You yourself said that the decreased efficiency, if any, would be marginal.

This behaviour could even be configurable: delay-dns-blacklist-checks=1|0
or similar, defaulting to off if you're worried about efficiency.

 To find real numbers, you would have to consider how many connections 
 are accepted, how many are rejected and for what reasons.  Then look at 
 the popularity of different spamdyke features and specifically the 
 popularity of different DNS RBLs.  Use all that to find out what 
 percentage of rejected connections could avoid the DNS queries due to 
 local tests.

I can come up with local figures, but knowing globally what features of
spamdyke are used how often is probably impossible.

 Lastly, find a way to evaluate the real cost (wall time, server load and
 network load) of spamdyke's DNS queries versus the additional load
 generated by passing the extra SMTP traffic to qmail.

I can't imagine this latter additional load as being nontrivial, but it
could be measured to some extent using strace -c.

 If all of those numbers were available, my instinct says the advantage 
 of your proposed change would be very small at best.

This is still ignoring the unnecessary load on the RBL DNS servers (most of
us are using them for free, yet someone must pay for their maintenance and
bandwidth, so let's not be wasteful).

Also, I still think that in the case of email service, saving wall time
(i.e. reducing latency) is more beneficial than saving CPU time (probably a
minuscule amount of CPU time, at that). I find a net gain of 9 seconds per
message with a single bogus recipient hard to ignore.

Andras

-- 
 Andras Korn korn at chardonnay.math.bme.hu
 http://chardonnay.math.bme.hu/~korn/ QOTD:
History will record it. I know it because I'll write it myself.


Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-14 Thread Michael Colvin
Great!  Of course, this feature could also be used to determine if a
specific RBL is causing too many false positives, too...

Running all the checks simultaneously certainly will negate the need to
order them in any specific order and should make the overall process that
much faster, especially if you're using multiple RBLs.

What kind of effect will that have on server load?  Many RBL lookups
sequentially versus many RBL lookups simultaneously?  Seems like the process
might be faster, but will take more resources on the server?  Which would
likely mean it's basically a push, with the net result being faster handling
of the session?

Thanks again!
 

Mike




Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-14 Thread Sam Clippinger
The aggressive DNS queries will definitely increase the momentary load 
on the DNS servers, because they will get a burst of simultaneous 
queries each time a remote server connects.  However, the overall load 
won't go up because the long-term rate of DNS queries is determined by 
the rate of SMTP connections.  I believe most sites will only notice an 
increase in spamdyke's speed with no new load on their DNS servers.

The new DNS behavior is configurable though, including an option to 
return to the behavior used by the standard system resolver.  That way, 
if the DNS servers are having trouble, spamdyke can be made less demanding.

-- Sam Clippinger

Michael Colvin wrote:
 Great!  Of course, this feature could also be used to determine if a
 specific RBL is causing too many false positives, too...
 
 Running all the checks simultaneously certainly will negate the need to
 order them in any specific order and should make the overall process that
 much faster, especially if you're using multiple RBLs.
 
 What kind of effect will that have on server load?  Many RBL lookups
 sequentially versus many RBL lookups simultaneously?  Seems like the process
 might be faster, but will take more resources on the server?  Which would
 likely mean it's basically a push, with the net result being faster handling
 of the session?
 
 Thanks again!
  
 
 Mike
 
 


Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-14 Thread Eric Shubert
I like having specific RBLs logged. I just installed spamdyke on a few
qmail-toasters yesterday (replacing rblsmtpd), and was going to ask about
this. Michael beat me to it! ;)

If simultaneous queries are being done, can all RBLs that match be logged?
Perhaps a comma-separated list within parentheses. This would make it
possible to gather stats on the effectiveness of the RBLs being used.

Sam Clippinger wrote:
 Yes, this is certainly possible.  Right now spamdyke identifies the RBL 
 in its message to the remote server but not in the logs.  Good idea!
 
 What would be a good way to log this information (preferably without 
 breaking existing scripts)?  I'm thinking as I type here, but spamdyke 
 already follows the rejection reason with parentheses (when the log 
 level is high enough) to indicate which file/line matched for file-based 
 filters... perhaps the same could be done for RBLs/RHSBLs.  Something 
 like this:
   DENIED_RBL_MATCH(rbl.example.com)
 
 As for reordering the RBLs to put the often-matched ones first, the next 
 version of spamdyke will make that less necessary.  By default, it will 
 query all RBLs simultaneously, regardless of their order.  (That 
 behavior can be prevented with a new flag -- ordering would be important 
 in that case.)
 
 -- Sam Clippinger
 
-- 
-Eric 'shubes'


Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-13 Thread Andras Korn
On Sun, Apr 13, 2008 at 02:55:16PM -0500, Sam Clippinger wrote:

 Andras Korn wrote:
  I don't agree. These days, there are RBLs that will automatically list and
  delist IPs in the space of a few hours, well within the lifetime of a single
  email message.
 
 If a server is being added and removed from RBLs in the space of a few 
 hours, its behavior must be just on the border between legitimate and 
 spammer.  In that case, I would think the administrator would want to 
 know about it by receiving a few complaints from users whose messages 
 were being bounced.

Again, let's agree to disagree. :)

 You started this thread with a complaint that temporary rejections were 
 needlessly consuming your server resources by causing the remote server 
 to retry deliveries multiple times.  I guarantee that making the RBL 
 filter return temporary rejection codes would waste considerably more 
 resources for everyone, as RBLs are much more common and more widely 
 used than RHSBLs.

In a way, that is true. However, if you allowed qmail to permanently reject
some of the spam (because it's addressed to a bogus recipient), the
temporary rejections wouldn't make that much of a difference because there
wouldn't be so many of them.

The added load caused by RBL-based temporary rejections I'm willing to
accept.

  rblsmtpd also uses temporary rejects, fwiw.
 
 Well, most of the major email providers (AOL, Yahoo!, GMail, Hotmail, 
 etc) use permanent rejections for RBL matches.

They probably use their own RBLs though, don't they? Also, they have
hundreds of thousands of users, so they aren't going to care about any
particular message not getting through.

At smaller sites, it's possible to keep a virtual eye on the qmail log (say,
using a script) that can alert you when some new type of mail is being
blocked. It's nice to be able to interfere.

  Temporary rejects also give the administrator a chance to whitelist an IP
  they do want to receive mail from (such as when it turns out that your new
  business partner's ISP just got blacklisted by an RBL).
 
 The administrator would have to be carefully watching the outbound queue 
 to notice a message was being held, then investigate the logs to find 
 out why.  I can't envision this happening unless the server is new and 
 the administrator is testing to make sure everything is working.

I didn't mean the administrator of the server sending the message, but the
admin of the server rejecting it. I've often manually whitelisted IPs
blocked by one RBL or the other. I have found this to be practically the
only way to use RBLs run by 3rd parties to block mail (instead of just
increasing their spamminess score in SpamAssassin).

 I understand now.
 
 What you're describing would make spamdyke more efficient only for users 
 who have modified/replaced their qmail-smtpd to support blacklists or 
 other filters.

Yes.

 Most qmail servers run a stock version of qmail-smtpd, which will only
 reject recipients for relaying.

They shouldn't, as stock qmail is liable to cause backscatter. No
self-respecting admin should run qmail as distributed by DJB today. Mail to
bogus recipients must not be accepted and then bounced later. Some mechanism
must be in place to ensure that mail to bogus recipients is never accepted
at all.

 On a stock qmail installation, this change would make spamdyke _less_ 
 efficient, since it would keep qmail running for all connections, at 
 least until the DATA command is given.

Yes, if you only consider the single server spamdyke and qmail are running
on. But issuing needless DNS queries also puts superfluous load on the local
caching DNS resolver and the DNS servers of the RBLs/RHSBLs. Wouldn't it be
a courtesy to them to not query their servers if a local decision can be
made to reject a message?

Also, local decisions have lower latency. It may be possible to reject a
message based on local tests in less wall time than by waiting for the RBLs;
thus, a higher connection rate could potentially be served, because many
connections would end sooner.

 However, the current code closes qmail as soon as possible to free up
 resources.

As soon as possible in terms of the SMTP conversation, certainly; but not
as soon as possible in real time, I'm pretty sure.

 As soon as possible depends on the configured filters -- the possibility
 of SMTP AUTH and the use of sender whitelists require qmail to continue
 running until MAIL FROM is seen.  The use of recipient whitelists
 require qmail to continue running until RCPT TO is seen.  But if
 spamdyke is configured to do graylisting, some RBLs, some rDNS tests and
 SMTP AUTH (a typical setup), qmail will be closed as soon as the MAIL
 FROM command is given.

I think you should always wait for RCPT TO, even if it's not necessary for
whitelist decisions, because then you can log whose mail you're rejecting.
rblsmtpd's inability to do this is one of its major shortcomings.

 I suspect we're debating fractional efficiencies 

Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-12 Thread Sam Clippinger
The RHSBL filter checks rDNS names and sender addresses, not recipient 
addresses.  It also produces permanent rejection codes, not temporary 
ones.  If you're seeing the same sender rejected repeatedly, it's 
because the remote server is sending repeatedly.

Also, spamdyke should be disconnecting (and killing) qmail as soon as 
the blacklisted sender is given (depending on your configuration -- if 
you're using a recipient whitelist, qmail is disconnected after the RCPT 
command).  After that, all SMTP traffic is answered by spamdyke (with 
rejection codes).  So at least for that short time, spamdyke is saving 
resources.

However, with regard to blacklisted recipients, the reason spamdyke runs 
its filters before passing the RCPT command to qmail is because there 
may be multiple recipients.  Once a recipient has been passed to qmail, 
it cannot be removed.  Passing the RCPT command just to check the status 
code would effectively defeat spamdyke.

For example, imagine an unpatched qmail server.  The remote server names 
a blacklisted recipient, spamdyke passes it to qmail, checks the status 
code, then sends a rejection to the remote server.  Then the remote 
server names a second recipient that is not blacklisted.  spamdyke must 
allow the message to pass through because the second recipient is 
legitimate.  However, because the first recipient was already sent to 
qmail, that recipient will also receive the message.
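
The scenario can be modeled in a few lines. This is a toy illustration only: `ToyBackend` stands in for an unpatched qmail-smtpd that accepts every recipient, the addresses are made up, and neither function reflects spamdyke's actual code.

```python
class ToyBackend:
    """Stand-in for an unpatched qmail-smtpd: accepts every recipient."""

    def __init__(self):
        self.recipients = []

    def rcpt(self, address):
        self.recipients.append(address)  # cannot be removed afterwards
        return 250

def probe_then_filter(backend, blacklist, recipients):
    """Pass RCPT to the backend first, then apply the blacklist."""
    replies = []
    for address in recipients:
        code = backend.rcpt(address)
        if address in blacklist:
            code = 550  # the client sees a rejection, the backend kept it
        replies.append(code)
    return replies

def filter_then_pass(backend, blacklist, recipients):
    """Apply the blacklist first; only clean recipients reach the backend."""
    replies = []
    for address in recipients:
        if address in blacklist:
            replies.append(550)
        else:
            replies.append(backend.rcpt(address))
    return replies
```

Both orderings send the client the same reply codes, but after `probe_then_filter` the backend still holds the blacklisted recipient and will deliver to it once the message body arrives.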

-- Sam Clippinger

Andras Korn wrote:
 Hi,
 
 since I installed spamdyke my logs are inundated with messages like this
 one:
 
 DENIED_RHSBL_MATCH from: [EMAIL PROTECTED] to: [EMAIL PROTECTED] origin_ip: 
 85.179.173.120 origin_rdns: e179173120.adsl.alicedsl.de auth: (unknown)
 
 The recipient address is bogus and my (patched) qmail-smtpd would reject it
 permanently. Apparently, since it matches a RHSBL, spamdyke rejects the
 message temporarily, and the same client keeps trying for a while, always
 costing me some resources.
 
 I think this is wasteful; it would be better to only do the RHSBL lookup
 after the backend qmail-smtpd accepted the recipient address. If the
 backend qmail-smtpd throws a permanent rejection, spamdyke could just pass
 it on to the client.
 
 Andras
 


Re: [spamdyke-users] let qmail decide if it accepts a recipient before doing RHSBL?

2008-04-12 Thread Andras Korn
On Sat, Apr 12, 2008 at 06:10:04PM -0500, Sam Clippinger wrote:

 The RHSBL filter checks rDNS names and sender addresses, not recipient
 addresses.

I know.

 It also produces permanent rejection codes, not temporary ones.

OK, with RHSBL, that is probably justified. However, I hope RBL by default
produces temporary rejects?

 If you're seeing the same sender rejected repeatedly, it's because the
 remote server is sending repeatedly.

Strange that they didn't do it so far, but apparently this is the case.

 Also, spamdyke should be disconnecting (and killing) qmail as soon as 
 the blacklisted sender is given (depending on your configuration -- if 
 you're using a recipient whitelist, qmail is disconnected after the RCPT 
 command).  After that, all SMTP traffic is answered by spamdyke (with 
 rejection codes).  So at least for that short time, spamdyke is saving 
 resources.
 
 However, with regard to blacklisted recipients, the reason spamdyke runs 
 its filters before passing the RCPT command to qmail is because there 
 may be multiple recipients.  Once a recipient has been passed to qmail, 
 it cannot be removed.  Passing the RCPT command just to check the status 
 code would effectively defeat spamdyke.
 
 For example, imagine an unpatched qmail server.  The remote server names 
 a blacklisted recipient, spamdyke passes it to qmail, checks the status 
 code, then sends a rejection to the remote server.  Then the remote 
 server names a second recipient that is not blacklisted.  spamdyke must 
 allow the message to pass through because the second recipient is 
 legitimate.  However, because the first recipient was already sent to 
 qmail, that recipient will also receive the message.

I'm not sure I understand what you're saying.

If a recipient is blacklisted in spamdyke, spamdyke should of course reject
it.

If it is blacklisted by qmail, qmail will reject it and spamdyke needn't
worry about it.

The SMTP conversation can continue, with each recipient specified by the
client being treated as above.

Finally, if any recipients were accepted by the backend qmail, spamdyke can
check RBL and RHSBL, and if there is a match, reject the client temporarily
(for RBL) or permanently (in the case of RHSBL), and send a QUIT to the
backend qmail.

The costly DNS lookups needn't be performed at all if qmail rejects all
recipients.
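
As a sketch of the ordering proposed above (hypothetical, not how spamdyke behaves today): `backend_accepts` stands in for a patched qmail-smtpd that validates recipients, `dns_listed` stands in for the combined RBL/RHSBL lookups, and all names are invented for illustration.

```python
def handle_session(recipients, spamdyke_blacklist, backend_accepts, dns_listed):
    """Run local recipient checks first, DNS checks only if needed.

    Returns (per-recipient replies, final verdict, rounds of DNS lookups).
    """
    replies = {}
    accepted = []
    for address in recipients:
        if address in spamdyke_blacklist:
            replies[address] = "rejected by spamdyke"
        elif backend_accepts(address):
            replies[address] = "accepted by backend"
            accepted.append(address)
        else:
            replies[address] = "rejected by backend"

    if not accepted:
        # Every recipient failed a local test: no DNS traffic at all.
        return replies, "nothing to deliver", 0
    if dns_listed():
        # Blacklist match: reject the client and QUIT the backend.
        return replies, "rejected after DNS check", 1
    return replies, "delivered", 1
```

In this model a session whose recipients are all bogus never triggers a DNS lookup, which is the saving being argued for.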

I see no situation where this scheme would result in mail being passed to
recipients who would otherwise not receive it.

I think all feasible local tests should be carried out before resorting to
remote tests, because those can be (and typically are) much slower.

Andras

-- 
 Andras Korn korn at chardonnay.math.bme.hu
 http://chardonnay.math.bme.hu/~korn/ QOTD:
   When I was your age, we had to walk ten miles to a node.