Re: [spamdyke-users] feature requests :)
On Tue, Apr 08, 2008 at 11:40:48PM -0500, Sam Clippinger wrote:
> Andras Korn wrote:
>> On Tue, Apr 08, 2008 at 12:38:38AM -0500, Sam Clippinger wrote:
>> [...] blacklist connections at the TCP level. tcpserver (or its replacement) should only set appropriate environment variables based on the tests it has carried out and leave blacklisting to the next program in the chain, which speaks the application layer protocol concerned (SMTP in this case). This way, spamdyke needn't duplicate the work of tcpserver and can still obtain envelope information before refusing the mail.
>
> I think we're just going to have to agree to disagree on this. I [...]

OK. :)

> [...] way.) The fact that environment variables are not easily visible to external viewers is a show-stopper for me.

I still think though that as long as their contents can be logged by the child process that is using them, this is a non-issue.

>> That's certainly one possible conclusion, but if you look at it this way, you could just as well implement your own SSL support or C library. IMHO it is perfectly acceptable for some features to only be available on some systems.
>>
>> I think the Unix Way is to re-use as much of what already exists as possible in order to minimize duplication of effort and to better focus development. A single tool should do a well-defined set of related things, and do that exceptionally well, while providing the ability to combine effectively with other tools. I wouldn't want ls(1) to have the features of find(1) just to be self-sufficient.
>
> [...] implement in a reasonable timeframe.) As for reimplementing the C library, I have no problem with writing new code when the existing library doesn't meet my needs.

Sure; the point I was trying to make is that it's often not a good idea to write your own code just for the sake of doing so if there is existing code that would do the job.

Some reasons:

- you may introduce bugs that are not present in the existing implementation (because you're working with something you're not yet experienced in, this is arguably pretty likely);
- fewer eyeballs will see your code than that of the existing standard implementation, which means that security problems will probably be noticed later;
- you don't automatically benefit from the continued development of the standard implementation.

> (Interestingly, DJB reimplemented portions of the C library when he wrote qmail and its supporting tools. For example, most of the memory allocation routines and string functions used by qmail are DJB's code, not the system library's.)

I know, and part of me hates DJB for that. It makes his code a nightmare to modify, and modifications very hard to get right. At least he did it for good reasons, but with blatant disregard to everyone else. It's one of the reasons why qmail and most other djbware isn't more open.

> Your description of The Unix Way could just as easily be described as The Reusable Way or The Library Way or even The Free Software Way. As [...]

True.

> [...] well with it. As I get requests for changes to work with a project/patch I've never heard of, I try to determine if the changes will benefit a large enough audience to be worthwhile. So far, you are the first person to request that spamdyke support environment variables the way rblsmtpd does. If other people also request it, I'll reconsider my position.

I understand. If I can find the time, I may write a patch that does what I want.

>> I don't think of myself as a programmer and I teach Unix system administration at a university. I may just have been lucky so far, but most of my students know about strace and are able to use /proc even before they enrol for my course. I think you're being unfair to system administrators here, or are including untrained computer operators in the term.
>>
>> As I see it, the important thing however is that specialist software shouldn't be designed to meet the needs of laymen; it should be built to best support the trained expert while being as useful as possible to those with less than expert knowledge but a willingness to learn. [snip] I wouldn't put anyone to whom daemon processes are mysterious in charge of a critical service.
>
> I certainly mean no disrespect to you, your students or any other system administrator.

I know, I was just amazed that you think of sysadmins as, let us say, less than well trained for their jobs.

> However, on the mailing lists and forums I read, I see many questions from administrators who are obviously not familiar with the tools you've mentioned.

This is a pretty sad state of affairs, but I don't think it's reason enough to develop system software that meets the needs of laymen at the expense of experts. Spamdyke is not an end-user tool, after all. If someone doesn't have the skills necessary to use it, they should acquire them, and it shouldn't be the software that is made less efficient or dumber to accommodate
Re: [spamdyke-users] feature requests :)
On Tue, Apr 08, 2008 at 12:38:38AM -0500, Sam Clippinger wrote:
> That makes sense, when it is explained in that way. However, I still don't find it intuitive, because it requires an explanation. The flag I've implemented in the next version of spamdyke will look like this:
>
> filter-level=normal
> filter-level=allow-all
> filter-level=require-auth
> filter-level=reject-all
>
> If any of those lines are present in a configuration file, I believe no explanation is required to understand their basic effect. The same is not true of an environment variable; that's why I don't like it.

Well, if FILTER_LEVEL were an environment variable with the above values supported, I see no fundamental difference in the intuitiveness.

>>> However, if other tools already set this variable, I can make spamdyke use it to allow better compatibility. Since I don't (and won't) use this feature myself, whether I implement it is up to everyone here. If people want it, I'll add it.
>>
>> Would you integrate a patch with this functionality?
>
> Only if other code already exists that sets this environment variable (e.g. a tcpserver replacement).

In fact, tcpserver can do it itself, based on tcp.cdb. But so can ipsvd.

>>> I've implemented a flag in the next version of spamdyke that will function the way you describe the WHITELIST variable. It doesn't use an environment variable but it is otherwise identical. It also has a [...]
>>
>> The idea with the environment variable is that it can be set/unset using an arbitrarily flexible or complex mechanism outside spamdyke, based on arbitrary criteria. I don't see how you can duplicate that in any other way.
>
> If the parent daemon (e.g. tcpserver) can alter the environment for its children based on arbitrary criteria, why can't it alter spamdyke's command line instead?

You know that tcpserver can't. ipsvd sort of can, but that solution doesn't scale well (it would boil down to spawning a script that starts spamdyke with a different command line for each connection). Support for environment variables exists and is scalable.

> I'm getting the impression you're describing software that hasn't been written yet anyway, so the environment doesn't have to be the only way to communicate with child processes.

No, I'm totally writing about existing software here.

>> Much of what you're doing in spamdyke is duplicating functionality that could be (and is) provided by a tcpserver replacement. For example, blacklisting IP addresses and rdns domains could be trivially accomplished using tcpsvd and environment variables; no need for these kinds of blacklists in spamdyke. [...]
>
> All very true. tcpserver does indeed provide the TCPREMOTEHOST environment variable, which spamdyke ignores. tcpserver also parses /etc/tcp.smtp.cdb but spamdyke ignores its efforts and reparses /etc/tcp.smtp anyway.

Note that this isn't necessarily the same. tcp.smtp.cdb is updated atomically; the same is not true for tcp.smtp. I don't think re-using the source of a generated binary file in this way is a clean solution, but I won't argue this point because I don't use tcp.smtp at all anyway (because ipsvd also supports a different configuration scheme, see ipsvd-instruct(5)).

> There are several reasons I'm implementing these features in spamdyke and duplicating the effort put into tcpserver (and others). Efficiency is not always my top priority.

I don't think this is about efficiency; it's more about clean separation of duties.

> First, there are some situations where spamdyke must perform duplicate work in order to achieve the correct result. SMTP AUTH is the best example -- authenticated users are allowed to bypass all filters. If blacklisting takes place before spamdyke is invoked, authenticated users will be incorrectly blacklisted. This is one of rblsmtpd's major failings.

I completely agree. As I wrote in my previous message, it is undesirable to blacklist connections at the TCP level. tcpserver (or its replacement) should only set appropriate environment variables based on the tests it has carried out and leave blacklisting to the next program in the chain, which speaks the application layer protocol concerned (SMTP in this case). This way, spamdyke needn't duplicate the work of tcpserver and can still obtain envelope information before refusing the mail.

> Second, most qmail servers use DJB's tcpserver. Many replacements may be available but none are in wide use. For that reason, I must design spamdyke for the lowest common denominator of qmail configurations. If I make spamdyke dependent on an alternative daemon, spamdyke's popularity will immediately drop to (almost) zero.

I don't think I recommended or requested any change that would make spamdyke dependent on any alternative to tcpserver; at least it wasn't my intention.

> [...] try it quickly, see if it works and remove it just as quickly. I am a qmail expert yet I still
Re: [spamdyke-users] feature requests :)
Wow; thanks for the suggestions! I'll respond to each one inline below...

KORN Andras wrote:
> Hi,
>
> I've just tried spamdyke and like it so far. I have a few ideas for new features and some comments.
>
> * I think spamdyke would make an even more seamless replacement for rblsmtpd if it supported the RBLSMTPD environment variable in roughly the same way as rblsmtpd itself; that is, if it's set but empty, skip RBL checks; if it's set to a string, reject the mail temporarily with the given string as the error message sent to the client; and if the string begins with a hyphen, reject the message permanently with the string sans the hyphen as the error message sent to the client. If the variable is unset, just filter normally.

I wasn't aware rblsmtpd included this feature. I'm a little hesitant to duplicate it because of its design -- the existence of the environment variable and its effect on rblsmtpd's behavior are very non-intuitive. In particular, using the first character to signal a temporary/permanent rejection code is too obscure for my taste.

However, if other tools already set this variable, I can make spamdyke use it to allow better compatibility. Since I don't (and won't) use this feature myself, whether I implement it is up to everyone here. If people want it, I'll add it.

> It would be similarly desirable to specify custom error messages in blacklist files.

I've already added custom rejection messages to the next version of spamdyke. The rejection message for each filter can be overridden in the configuration file (or on the command line).

> * Other environment variables could be supported in a similar way; e.g. if WHITELIST is set, skip all spam tests and allow the mail through; or perhaps even selectively enable/disable some tests based on envvars (which can be set by tcpsvd or a replacement).

I've implemented a flag in the next version of spamdyke that will function the way you describe the WHITELIST variable. It doesn't use an environment variable but it is otherwise identical. It also has a setting to block all messages (like an analogous BLACKLIST variable).

> I think this is more in line with The Qmail Way than using your own list-files and may also save resources (because you don't have to sift through half a dozen lists, just consult a handful of environment variables that were set by your parent process).

I understand your suggestion although I must admit I don't hold The Qmail Way in very high regard. Too much of qmail is obscure, undocumented and/or only configurable by applying patches. That's just my opinion though.

I don't like passing values to child processes in environment variables because they're not externally visible. In other words, when an environment variable is set, only the child process can read it. If the child process doesn't behave correctly, it's difficult/impossible to figure out why (or to reproduce the conditions for troubleshooting). On the other hand, when the configuration is set through command line flags or configuration files, it's very easy to see what's happening. Configuration, testing and troubleshooting are much easier.

So, as with the RBLSMTPD environment variable, I would be willing to implement environment variable-based configuration in spamdyke in order to work with existing tools. But I don't think implementing new features this way is a good idea.

When it comes to changing spamdyke's configuration for each connection, I think the next version of spamdyke will do what you want. I've implemented a system where the configuration can be changed based on the incoming IP address, the incoming rDNS name, the sender address or the recipient address (or any combination of those four attributes).
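
For readers unfamiliar with the convention, the RBLSMTPD behaviour described earlier in this message (unset, set but empty, plain string, string with a leading hyphen) can be sketched roughly as follows. This is only an illustration in Python, not code from spamdyke or rblsmtpd; the function name rblsmtpd_action and the action labels it returns are hypothetical and exist only for this example.

```python
import os


def rblsmtpd_action(environ=None):
    """Return (action, message) for the RBLSMTPD convention described above.

    The action labels are made up for this sketch; they are not spamdyke
    or rblsmtpd identifiers.
    """
    if environ is None:
        environ = os.environ
    value = environ.get("RBLSMTPD")
    if value is None:
        # Variable unset: run the normal filters.
        return ("filter", None)
    if value == "":
        # Set but empty: skip the RBL checks.
        return ("skip-rbl", None)
    if value.startswith("-"):
        # Leading hyphen: permanent rejection, message without the hyphen.
        return ("reject-permanent", value[1:])
    # Any other string: temporary rejection with that string as the message.
    return ("reject-temporary", value)
```

Called with {"RBLSMTPD": "-Go away"}, the sketch returns a permanent rejection carrying the message "Go away"; with an empty mapping it falls through to normal filtering. Passing the environment in as a parameter is just a convenience so that each case can be exercised without touching the real process environment.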

> * As for filtering invalid recipients, I think the approach implemented by the SPAMCONTROL patch for qmail is feasible, at least for smallish installations: list all valid recipient addresses in a file (with wildcards supported), and block everything else. [snip] recipient-whitelist-file isn't the same, because if a line is matched, all spam tests are skipped. With badrcptto, this is just an additional test: does the recipient exist?

Several other people have suggested storing a list of valid usernames in a file and I don't like that idea for several reasons.

First, it doesn't work for large sites. spamdyke is being used on mail servers that host tens of thousands of domains. The files would be too big, too difficult to maintain and too slow to search.

Second, how do you create and maintain the list of valid addresses? Doing it by hand is not practical. If there is a way to do it (correctly) from a script, please send me that script -- it contains all the logic I need to implement real recipient validation in spamdyke.

Third, if I add recipient validation by checking lists in files, I must continue to support it in the future, even if I later add real validation. I'd