Wow, thanks for the suggestions!  I'll respond to each one inline below.

KORN Andras wrote:
> Hi,
> 
> I've just tried spamdyke and like it so far.
> 
> I have a few ideas for new features and some comments.
> 
> * I think spamdyke would make an even more seamless replacement for rblsmtpd
> if it supported the RBLSMTPD environment variable in roughly the same way as
> rblsmtpd itself; that is, if it's set but empty, skip RBL checks; if it's
> set to a string, reject the mail temporarily with the given string as the
> error message sent to the client; and if the string begins with a hyphen,
> reject the message permanently with the string sans the hyphen as the error
> message sent to the client. If the variable is unset, just filter normally.

I wasn't aware rblsmtpd included this feature.  I'm a little hesitant to 
duplicate it because of its design -- the existence of the environment 
variable and its effect on rblsmtpd's behavior are very non-intuitive. 
In particular, using the first character to signal a temporary/permanent 
rejection code is too obscure for my taste.

However, if other tools already set this variable, I can make spamdyke 
use it to allow better compatibility.  Since I don't (and won't) use 
this feature myself, whether I implement it is up to everyone here.  If 
people want it, I'll add it.
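
To make the rules quoted above concrete, here is a minimal sketch of how
the RBLSMTPD variable would be interpreted.  It only encodes the four
cases as described in this thread; it is not taken from rblsmtpd or
spamdyke, and the function name and the action labels are made up for
illustration.

```python
from typing import Mapping, Optional, Tuple


def rblsmtpd_decision(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Map the RBLSMTPD variable to an (action, message) pair.

    The action labels are hypothetical names used only in this sketch.
    """
    value = env.get("RBLSMTPD")
    if value is None:
        # Variable unset: run the filters normally.
        return ("filter", None)
    if value == "":
        # Set but empty: skip the RBL checks.
        return ("skip_rbl", None)
    if value.startswith("-"):
        # Leading hyphen: permanent rejection, message without the hyphen.
        return ("reject_permanent", value[1:])
    # Any other string: temporary rejection with that string as the message.
    return ("reject_temporary", value)
```

In real use the argument would be `os.environ`; taking a plain mapping
keeps the sketch easy to test without touching the process environment.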

> It would be similarly desirable to specify custom error messages in
> blacklist files.

I've already added custom rejection messages to the next version of 
spamdyke.  The rejection message for each filter can be overridden in 
the configuration file (or on the command line).

> * Other environment variables could be supported in a similar way; e.g. if
> "WHITELIST" is set, skip all spam tests and allow the mail through; or
> perhaps even selectively enable/disable some tests based on envvars (which
> can be set by tcpsvd or a replacement).

I've implemented a flag in the next version of spamdyke that will 
function the way you describe the "WHITELIST" variable.  It doesn't use 
an environment variable but it is otherwise identical.  It also has a 
setting to block all messages (like an analogous "BLACKLIST" variable).

> I think this is more in line with The Qmail Way than using your own
> list-files and may also save resources (because you don't have to sift
> through half a dozen lists, just consult a handful of environment variables
> that were set by your parent process).

I understand your suggestion although I must admit I don't hold "The 
Qmail Way" in very high regard.  Too much of qmail is obscure, 
undocumented and/or only configurable by applying patches.  That's just 
my opinion though.

I don't like passing values to child processes in environment variables 
because they're not externally visible.  In other words, when an 
environment variable is set, only the child process can read it.  If the 
child process doesn't behave correctly, it's difficult/impossible to 
figure out why (or to reproduce the conditions for troubleshooting).  On 
the other hand, when the configuration is set through command line flags 
or configuration files, it's very easy to see what's happening. 
Configuration, testing and troubleshooting are much easier.

So, as with the RBLSMTPD environment variable, I would be willing to 
implement environment variable-based configuration in spamdyke in order 
to work with existing tools.  But I don't think implementing new 
features this way is a good idea.

When it comes to changing spamdyke's configuration for each connection, 
I think the next version of spamdyke will do what you want.  I've 
implemented a system where the configuration can be changed based on the 
incoming IP address, the incoming rDNS name, the sender address or the 
recipient address (or any combination of those four attributes).

> * As for filtering invalid recipients, I think the approach implemented by
> the SPAMCONTROL patch for qmail is feasible, at least for smallish
> installations: list all valid recipient addresses in a file (with wildcards
> supported), and block everything else.
[snip]
> recipient-whitelist-file isn't the same, because if a line is matched, all
> spam tests are skipped. With badrcptto, this is just an additional test:
> does the recipient exist?

Several other people have suggested storing a list of valid usernames in 
a file and I don't like that idea for several reasons.  First, it 
doesn't work for large sites.  spamdyke is being used on mail servers 
that host tens of thousands of domains.  The files would be too big, too 
difficult to maintain and too slow to search.  Second, how do you create 
and maintain the list of valid addresses?  Doing it by hand is not 
practical.  If there is a way to do it (correctly) from a script, please 
send me that script -- it contains all the logic I need to implement 
"real" recipient validation in spamdyke.  Third, if I add recipient 
validation by checking lists in files, I must continue to support it in 
the future, even if I later add "real" validation.  I'd rather do it 
correctly the first time.

I plan to implement real recipient validation in the version-after-next.
Enough people have asked for this that it's time to bite the bullet
and reverse engineer the qmail and vpopmail authentication systems.

> * GeoIP support would be nice, and reading custom spamdyke.conf files based
> on the geoip lookup (such as spamdyke.cc.conf if it exists, then fall back
> to spamdyke.conf if it doesn't).

I'll make a note of this one but it could be a while before I get to it.
I've been considering adding a feature to spamdyke to run external
scripts for checking values, to allow for situations I can't anticipate.
For example, someone may want to authenticate their users with an LDAP
server.  If spamdyke can call an external script to do that, the
administrator only has to provide the script.  Perhaps geolocation could
work the same way.

> * Likewise, custom configfiles based on the value of an environment
> variable like SPAMDYKECONF=/etc/spamdyke/myclientclass.config?

I think the new configuration system in the next version will do what 
you want here.  Take a look at the next version (when it's ready) and 
suggest this again if it doesn't meet your needs.

> * About using a database instead of the filesystem to store some information:
> I agree with most of what the FAQ has to say about this, but regarding
> efficiency, if the database support were built into a separate daemon
> process, the overhead could be kept to a minimum (the per-connection
> spamdyke process would only need to support a very simple query interface
> that would allow it to talk to the database backend daemon). This modularity
> would also allow arbitrary databases to be supported without modifying
> spamdyke.

True.  I'm already planning on turning spamdyke into a daemon (to 
replace tcpserver) and this could be done at that time.  On the other 
hand, if I add the ability to call external scripts (as described 
above), those scripts could make the necessary database calls.  That 
way, each administrator could decide for themselves if they want 
database support.

> * Could RBL/RHSBL lookups be speeded up by doing them in parallel using e.g.
> libadns? Or are they done in parallel already?

Starting with version 3.1.0, spamdyke performs its own DNS queries (it 
doesn't use the system resolver library for queries) and sends some (not 
all) queries in parallel.  Specifically, it sends A, TXT and CNAME 
queries to each DNS RBL/RHSBL in parallel although it still checks the 
DNS RBLs/RHSBLs sequentially.

In the next version, I've changed that code.  All DNS RBLs/RHSBLs are 
queried simultaneously for all response types (A, TXT and CNAME). 
Several DNS servers can be queried simultaneously.  DNS servers, retries 
and timeouts are configurable.  spamdyke no longer uses any part of the 
system resolver library.

Because some DNS servers (and network hardware) become unstable when 
queried too aggressively, I've also added a flag to make spamdyke 
imitate the system resolver's behavior (one query at a time, one server 
at a time).
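
As a rough illustration of the difference between sequential and
simultaneous RBL checks, here is a small sketch.  It is not spamdyke's
code: spamdyke sends its own DNS packets, while this sketch simply runs
system resolver lookups in a thread pool.  The zone names in the usage
note are placeholders, and `check_rbls`, `system_lookup` and the
`parallel` switch are names invented for this example.  The only part
that is standard is the query name itself: the IPv4 octets reversed,
followed by the RBL zone.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def rbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS RBL query name: reversed IPv4 octets, then the zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone


def system_lookup(name: str) -> Optional[str]:
    """Resolve an A record with the system resolver.

    Returns None when the name does not resolve (not listed, or the
    lookup failed); this sketch does not tell those cases apart.
    """
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


def check_rbls(
    ip: str,
    zones: Iterable[str],
    lookup: Callable[[str], Optional[str]] = system_lookup,
    parallel: bool = True,
) -> Dict[str, Optional[str]]:
    """Query every zone for ip and return {zone: answer or None}."""
    zone_list = list(zones)
    names = [rbl_query_name(ip, zone) for zone in zone_list]
    if not parallel or not zone_list:
        # One query at a time, like a plain resolver loop.
        answers = [lookup(name) for name in names]
    else:
        # All zones at once; total wait is roughly the slowest single lookup.
        with ThreadPoolExecutor(max_workers=len(zone_list)) as pool:
            answers = list(pool.map(lookup, names))
    return dict(zip(zone_list, answers))
```

Calling `check_rbls("208.110.65.146", ["rbl-one.example.org",
"rbl-two.example.org"])` would look up both placeholder zones at the
same time; passing `parallel=False` falls back to one lookup at a time.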

> * check-rhsbl tests two separate things: whether the rdns of the client is
> blacklisted, and whether the envelope from domain is blacklisted. I may want
> the latter without the former (which would also save DNS lookups).

I can add this if it's something people want.  I made spamdyke check 
both because that was how DNS RHSBLs were described everywhere I read 
about them.  Personally, I couldn't envision a scenario where one check 
would be desirable but the other would not.  It would be easy to 
separate though.
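
To show what separating the two checks might look like, here is a
sketch that builds the RHSBL query names for one connection.  It is
only an illustration: the `check_rdns` and `check_sender` switches are
hypothetical and do not correspond to any existing spamdyke option, the
zone in the usage note is a placeholder, and using the full rDNS name
(rather than some shorter form of it) is an assumption of this sketch.

```python
from typing import List, Optional


def rhsbl_query_names(
    rdns_name: Optional[str],
    sender: Optional[str],
    zone: str,
    check_rdns: bool = True,
    check_sender: bool = True,
) -> List[str]:
    """Return the names to look up in an RHSBL zone.

    An RHSBL is queried by prepending a domain name to the zone.  Each
    of the two possible names can be switched off independently.
    """
    names: List[str] = []
    if check_rdns and rdns_name:
        # Name from the client's reverse DNS entry.
        names.append(rdns_name.rstrip(".").lower() + "." + zone)
    if check_sender and sender and "@" in sender:
        # Domain part of the envelope sender address.
        domain = sender.rsplit("@", 1)[1].rstrip(".").lower()
        if domain:
            candidate = domain + "." + zone
            if candidate not in names:
                # Skip the second lookup when both names are identical.
                names.append(candidate)
    return names
```

With `check_rdns=False`, a call such as
`rhsbl_query_names("mail.example.net", "user@example.com",
"rhsbl.example.org", check_rdns=False)` yields only the sender domain
lookup, which is the case that saves a DNS query.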

> * Log messages could be more succinct, like for example:
> Accept: S:208.110.65.146:ns.silence.org H:iconoclast.silence.org F:[EMAIL PROTECTED] T:[EMAIL PROTECTED]

I've tried to design the log messages to be both easily parsable by 
scripts and easily readable by humans.  I think the current format does 
that reasonably well -- space separated fields are simple to break apart 
with perl/awk/sed/cut and the verbose labels make it easy to read.  I'm 
not sure there's much to gain by cutting out a few dozen characters.

Thanks again for all of the suggestions!

-- Sam Clippinger
_______________________________________________
spamdyke-users mailing list
spamdyke-users@spamdyke.org
http://www.spamdyke.org/mailman/listinfo/spamdyke-users
