Wow, thanks for the suggestions! I'll respond to each one inline below...

KORN Andras wrote:
> Hi,
>
> I've just tried spamdyke and like it so far.
>
> I have a few ideas for new features and some comments.
>
> * I think spamdyke would make an even more seamless replacement for rblsmtpd
> if it supported the RBLSMTPD environment variable in roughly the same way as
> rblsmtpd itself; that is, if it's set but empty, skip RBL checks; if it's
> set to a string, reject the mail temporarily with the given string as the
> error message sent to the client; and if the string begins with a hyphen,
> reject the message permanently with the string sans the hyphen as the error
> message sent to the client. If the variable is unset, just filter normally.
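For readers unfamiliar with the variable, the rules in the quoted paragraph can be sketched as a small decision function. This only illustrates the behavior as Andras describes it; it is not code from rblsmtpd or spamdyke, and the function name `rblsmtpd_action` and the action labels it returns are made up for the example.

```python
import os


def rblsmtpd_action(environ=os.environ):
    """Map the RBLSMTPD variable to an action, following the quoted rules.

    Returns an (action, message) tuple. The action labels are hypothetical.
    """
    value = environ.get("RBLSMTPD")
    if value is None:
        # Unset: run the normal filters.
        return ("filter", None)
    if value == "":
        # Set but empty: skip the RBL checks.
        return ("skip_rbl", None)
    if value.startswith("-"):
        # Leading hyphen: permanent rejection; the message is the rest.
        return ("reject_permanent", value[1:])
    # Any other string: temporary rejection with that string as the message.
    return ("reject_temporary", value)
```

Passing a plain dictionary instead of `os.environ` makes each of the four cases easy to try in isolation.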

I wasn't aware rblsmtpd included this feature. I'm a little hesitant to duplicate it because of its design -- the existence of the environment variable and its effect on rblsmtpd's behavior are very non-intuitive. In particular, using the first character to signal a temporary/permanent rejection code is too obscure for my taste.

However, if other tools already set this variable, I can make spamdyke use it to allow better compatibility. Since I don't (and won't) use this feature myself, whether I implement it is up to everyone here. If people want it, I'll add it.

> It would be similarly desirable to specify custom error messages in
> blacklist files.

I've already added custom rejection messages to the next version of spamdyke. The rejection message for each filter can be overridden in the configuration file (or on the command line).

> * Other environment variables could be supported in a similar way; e.g. if
> "WHITELIST" is set, skip all spam tests and allow the mail through; or
> perhaps even selectively enable/disable some tests based on envvars (which
> can be set by tcpsvd or a replacement).

I've implemented a flag in the next version of spamdyke that will function the way you describe the "WHITELIST" variable. It doesn't use an environment variable but it is otherwise identical. It also has a setting to block all messages (like an analogous "BLACKLIST" variable).

> I think this is more in line with The Qmail Way than using your own
> list-files and may also save resources (because you don't have to sift
> through half a dozen lists, just consult a handful of environment variables
> that were set by your parent process).

I understand your suggestion although I must admit I don't hold "The Qmail Way" in very high regard. Too much of qmail is obscure, undocumented and/or only configurable by applying patches. That's just my opinion though.

I don't like passing values to child processes in environment variables because they're not externally visible.
In other words, when an environment variable is set, only the child process can read it. If the child process doesn't behave correctly, it's difficult/impossible to figure out why (or to reproduce the conditions for troubleshooting). On the other hand, when the configuration is set through command line flags or configuration files, it's very easy to see what's happening. Configuration, testing and troubleshooting are much easier.

So, as with the RBLSMTPD environment variable, I would be willing to implement environment variable-based configuration in spamdyke in order to work with existing tools. But I don't think implementing new features this way is a good idea.

When it comes to changing spamdyke's configuration for each connection, I think the next version of spamdyke will do what you want. I've implemented a system where the configuration can be changed based on the incoming IP address, the incoming rDNS name, the sender address or the recipient address (or any combination of those four attributes).

> * As for filtering invalid recipients, I think the approach implemented by
> the SPAMCONTROL patch for qmail is feasible, at least for smallish
> installations: list all valid recipient addresses in a file (with wildcards
> supported), and block everything else.
[snip]
> recipient-whitelist-file isn't the same, because if a line is matched, all
> spam tests are skipped. With badrcptto, this is just an additional test:
> does the recipient exist?

Several other people have suggested storing a list of valid usernames in a file and I don't like that idea for several reasons.

First, it doesn't work for large sites. spamdyke is being used on mail servers that host tens of thousands of domains. The files would be too big, too difficult to maintain and too slow to search.

Second, how do you create and maintain the list of valid addresses? Doing it by hand is not practical.
If there is a way to do it (correctly) from a script, please send me that script -- it contains all the logic I need to implement "real" recipient validation in spamdyke.

Third, if I add recipient validation by checking lists in files, I must continue to support it in the future, even if I later add "real" validation. I'd rather do it correctly the first time.

I plan to implement real recipient validation in the version-after-next. Enough people have asked for this that it's time to bite the bullet and reverse engineer the qmail and vpopmail authentication systems.

> * GeoIP support would be nice, and reading custom spamdyke.conf files based
> on the geoip lookup (such as spamdyke.cc.conf if it exists, then fall back
> to spamdyke.conf if it doesn't).

I'll make a note of this one but it could be a while before I get to it. I've been considering adding a feature to spamdyke to run external scripts for checking values, to allow for situations I can't anticipate. For example, someone may want to authenticate their users with an LDAP server. If spamdyke can call an external script to do that, the administrator only has to provide the script. Perhaps geolocation could work the same way.

> * Likewise, custom configfiles based on the value of an environment
> variable like SPAMDYKECONF=/etc/spamdyke/myclientclass.config?

I think the new configuration system in the next version will do what you want here. Take a look at the next version (when it's ready) and suggest this again if it doesn't meet your needs.

> * About using a database instead of the filesystem to store some information:
> I agree with most of what the FAQ has to say about this, but regarding
> efficiency, if the database support were built into a separate daemon
> process, the overhead could be kept to a minimum (the per-connection
> spamdyke process would only need to support a very simple query interface
> that would allow it to talk to the database backend daemon).
> This modularity would also allow arbitrary databases to be supported without
> modifying spamdyke.

True. I'm already planning on turning spamdyke into a daemon (to replace tcpserver) and this could be done at that time. On the other hand, if I add the ability to call external scripts (as described above), those scripts could make the necessary database calls. That way, each administrator could decide for themselves if they want database support.

> * Could RBL/RHSBL lookups be speeded up by doing them in parallel using e.g.
> libadns? Or are they done in parallel already?

Starting with version 3.1.0, spamdyke performs its own DNS queries (it doesn't use the system resolver library for queries) and sends some (not all) queries in parallel. Specifically, it sends A, TXT and CNAME queries to each DNS RBL/RHSBL in parallel although it still checks the DNS RBLs/RHSBLs sequentially.

In the next version, I've changed that code. All DNS RBLs/RHSBLs are queried simultaneously for all response types (A, TXT and CNAME). Several DNS servers can be queried simultaneously. DNS servers, retries and timeouts are configurable. spamdyke no longer uses any part of the system resolver library.

Because some DNS servers (and network hardware) become unstable when queried too aggressively, I've also added a flag to make spamdyke imitate the system resolver's behavior (one query at a time, one server at a time).

> * check-rhsbl tests two separate things: whether the rdns of the client is
> blacklisted, and whether the envelope from domain is blacklisted. I may want
> the latter without the former (which would also save DNS lookups).

I can add this if it's something people want. I made spamdyke check both because that was how DNS RHSBLs were described everywhere I read about them. Personally, I couldn't envision a scenario where one check would be desirable but the other would not. It would be easy to separate though.
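To picture what separating the two checks might look like, here is a rough sketch that only builds the DNS names that would be looked up, without sending any queries. It is an illustration, not spamdyke's code: the function name, the two boolean switches and the list zone `rhsbl.example.org` are all hypothetical, and querying the full rDNS name (rather than some parent domain of it) is an assumption.

```python
def rhsbl_query_names(rdns_name, sender_address, list_zone,
                      check_rdns=True, check_sender=True):
    """Build the DNS names to look up in a right-hand-side blacklist.

    Each name is the domain under test prepended to the list's zone.
    The two switches are hypothetical, not actual spamdyke options.
    """
    names = []
    if check_rdns and rdns_name:
        # Check 1: the client's reverse DNS name.
        names.append(f"{rdns_name.rstrip('.')}.{list_zone}")
    if check_sender and "@" in sender_address:
        # Check 2: the domain part of the envelope sender address.
        domain = sender_address.rsplit("@", 1)[1]
        names.append(f"{domain.rstrip('.')}.{list_zone}")
    return names
```

Calling it with `check_rdns=False` yields only the envelope sender lookup, which is the case Andras asks about and would also save one DNS query per list.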

> * Log messages could be more succinct, like for example:
> Accept: S:208.110.65.146:ns.silence.org H:iconoclast.silence.org F:[EMAIL PROTECTED] T:[EMAIL PROTECTED]

I've tried to design the log messages to be both easily parsable by scripts and easily readable by humans. I think the current format does that reasonably well -- space separated fields are simple to break apart with perl/awk/sed/cut and the verbose labels make it easy to read. I'm not sure there's much to gain by cutting out a few dozen characters.

Thanks again for all of the suggestions!

-- Sam Clippinger

_______________________________________________
spamdyke-users mailing list
spamdyke-users@spamdyke.org
http://www.spamdyke.org/mailman/listinfo/spamdyke-users