Re: [PATCH] mod_smtpd_queue_smtp
But as far as I can tell, this code is all about SMTP forwarding (not even relaying per se). Confuses me anyway :) I.e. smarthosting, which might be a better name for the whole thing.

For anything involving filtering + forwarding on SMTP, it might be worth including a note to admins to remember to use consistent filtering on all mail exchangers (MX) for the domain in question. Otherwise spammers often try to deliver to one of the lower priority MXs in the hope of bypassing spam filtering. I can't see how any of that is "our job" though, just thought I'd mention it.
Re: to the users of mod_smtpd
To Plugin Writers: If examining the source on svn, also remember to check out my completed RBL / DNSBL modules under modules/access. In particular, mod_dnsbl_lookup provides functionality that could easily be used to implement a spamassassin-style scoring filter. That particular module can also be used without mod_smtpd.
Re: mod_smtpd module review
I just saw this old message now... I have been moving and my new ISP still hasn't connected my service after 3 weeks.

This past week I have finished up a few modules that are ready for review. http://www.brianfrance.com/software/apache/mod_smtpd_load.tar.gz

This is a nifty addition, thanks!

I would also like to set the error code, because looking over RFC 821 I think it should return 452, or maybe that needs to be the default for smtpd_run_connect soft errors (552 for hard errors). Should we allow the module to set the error code?

Error codes can be sent with SMTPD_DONE. Otherwise error codes won't change in the next fifty years for things like SMTPD_DENY and friends. I'll look into changing those specific codes. Most of the error codes I got from Postfix and Qpsmtpd but of course they aren't perfect.

I think there should be a way for other modules to specify a preferred return code. Access controls are a good example, whether mod_smtpd_rbl or something else that wants to ensure a specific code is returned.
Re: mod_smtpd filter support
> This mostly means that mod_smtpd is very close to completion. I expect
> some bug-fixes and I plan on adding a one-recipient/one-transaction
> feature and a message body reading abstraction, but other than that it
> seems to be in its final working state. Features include:

Thanks for the update Rian, glad I just caught your message now as I am in the process of moving (will be on highways for next few days) and due to moving/univ will probably not have computer access again until Sept. 8 or so.

Nick and/or other mentors, I sent an email to this effect and inquiring whether there was something else I am expected to do before September but I did not receive any replies.
Re: mod_smtpd overhaul
> +1, Jem since you have checked in the first plugin for mod_smtpd would
> you mind creating a directory structure similar to this if it seems fine
> to you?

I've built the directory structure to hold multiple modules. I also updated my modules so they build without duplicating .h files, and fixed the various copyright and style issues Garrett mentioned earlier. Also updated mod_smtpd_rbl so that it uses the APR array method for msgs.

The changes have been committed, please give it a look-over.

- Jem
Re: mod_smtpd overhaul
> > I don't have a problem with it. Do I need a verification of my earlier
> > commit before I commit a new directory structure?
>
> What do you mean by "verification"?

In your earlier email you said "Since the commit mail hasn't come through yet (needs to be approved I imagine)"
Re: mod_smtpd overhaul
> +1, Jem since you have checked in the first plugin for mod_smtpd would
> you mind creating a directory structure similar to this if it seems fine
> to you?

I don't have a problem with it. Do I need a verification of my earlier commit before I commit a new directory structure?
Committed mod_smtpd/trunk/mod_smtpd_rbl and mod_dnsbl_lookup
> Hopefully later today I should have this completely done and checked in.

I waited for Rian to update the mod_smtpd structures, and I have now checked in my code for RBL functionality. There are README files in both directories describing use. However could someone tell me how to properly use mod_smtpd.h and dnsbl_lookup.h in the build process? I've copied them between directories but this can't be the right way to do it.

https://svn.apache.org/repos/asf/httpd/mod_smtpd/trunk/mod_smtpd_rbl/
- Adds RBL whitelisting and blacklisting to mod_smtpd, either rejecting client IPs upon connection (DNSBL) or envelope sender domains (RHSBL). By hooking into Rian's smtp this remains totally modular and does not alter mod_smtpd itself.

https://svn.apache.org/repos/asf/httpd/mod_smtpd/trunk/mod_dnsbl_lookup/
- Does the actual DNSBL and RHSBL lookups, supporting rather advanced configuration in the form of distinct query chains. Many chains can be defined so admins can use chains for different purposes. Flags to the lookup functions allow different query styles, such as either stopping on one match or querying everything and returning a table of response details.

Sample configuration for mod_smtpd + mod_smtpd_rbl + mod_dnsbl_lookup

# Enable mod_smtpd
SmtpProtocol On
# Define whitelist and blacklist chains for mod_smtpd_rbl
SmtpWhitelist mywhitelist
SmtpBlacklist myblacklist
# Enable mod_dnsbl_lookup
DnsblLookups On
# The zones and chains for mod_dnsbl_lookup
RhsblZone myblacklist rhsbl.ahbl.org. 127.0.0.2
RhsblZone myblacklist abuse.rfc-ignorant.org. 127.0.0.4
DnsblZone myblacklist sbl.spamhaus.org. any
DnsblZone myblacklist cbl.abuseat.org. any
Re: mod_smtpd overhaul
> Is this the right way or is there an example module I could compare
> with?

I noticed a couple posts about examples, there is now one as I have committed all the RBL stuff I wrote. See:
https://svn.apache.org/repos/asf/httpd/mod_smtpd/trunk/mod_smtpd_rbl/

This hooks into mod_smtpd in two places and returns various data (e.g. if the client IP is blacklisted then mod_smtpd is told to deny mail). I hope it serves as a good example, it seems to work quite nicely to give mod_smtpd all the DNSBL/RHSBL features in a modular fashion.
Re: Supporting RBL in mod_smtpd
> > smtpd_run_connect (might deny service to connecting IP, per
> > request_rec)
> > smtpd_run_mail (might deny service to this envelope domain, per loc)
> +1
> ...
> Don't do this just yet, mod_smtpd is changing completely! completely =
> structures/io. I should commit my changes very soon so you can start
> working on this.

OK, I'll watch for the changes. Make sure you keep what I need though :)

smtpd_run_connect should somehow pass the address of the peer (currently that's within the request_rec)
smtpd_run_mail should still pass the MAIL FROM address the peer specifies, it currently comes in the char* loc.

As long as this data is still available to me and I can return a code to reject the mail, we should be good to go. Somewhat trivial I hope.
Supporting RBL in mod_smtpd
Here is my current plan for introducing the RBL support in mod_smtpd, using the existing mod_dnsbl_lookup which I posted earlier. This way of accomplishing the RBL support should not require any code modification to mod_smtpd itself. Nick and Rian, let me know if I should be going about this a different way?

I thought the most modular fashion would be to create a mod_smtpd_rbl that registers the following mod_smtpd hooks:

smtpd_run_connect (might deny service to connecting IP, per request_rec)
smtpd_run_mail (might deny service to this envelope domain, per loc)

These would query whitelists and blacklists, whatever is available.

I don't mind whipping up this bridging mod_smtpd_rbl module, but if it seems excessive to introduce a new module for this purpose then the other way of doing this would be to add the RBL supporting code into mod_smtpd itself. Either way it's done, RBLs are still not required and mod_dnsbl_lookup does not have to be present for mod_smtpd to function normally. However, adding a new bridging module has the advantage of leaving mod_smtpd code alone and taking advantage of the hooks interface.
Re: mod_dnsbl_lookup 0.90
> That's super in-efficient for the majority case, and there's no
> application level caching, which tends to be a must for most
> implementations (even if it is only per-request, like Exim's or

We talked about this on IRC, and it seems the preferred approach is to delegate the caching responsibility to an entity that is made purely for that purpose, for example DJB's local DNS cache software or even rbldnsd (an extremely fast DNSBL server) running locally.

I did start to implement software side caching in mod_dnsbl_lookup but it raised questions as to whether it's appropriate to have global scale caching when we're doing connection and request oriented processing. So I've left caching out of mod_dnsbl_lookup 0.91.
New mod_dnsbl_lookup release
I don't have svn access yet, but I have posted the module here:
http://www.sysdesign.ca/archive/mod_dnsbl_lookup-0.91.tar.gz

This is much improved from my earlier 0.90, taking advice from Colm. With this new style of configuration the module can be used more flexibly for blacklists, whitelists, or other things. Configuration now looks like:

DnsblZone spammers sbl.spamhaus.org. any
DnsblZone spammers dnsbl.sorbs.net. 127.0.0.5
DnsblZone spammers dnsbl.sorbs.net. 127.0.0.6
DnsblZone whitelist customers.dnsbl any
RhsblZone spammers rhsbl.ahbl.org. 127.0.0.2

The README in the above tarball is very thorough and describes how to use the module's functions. I'm interested in adding the functionality into mod_smtpd of course. Rian and Nick: how should we proceed on that? Here in brief is a relevant part of my README

=== 4. Using from mod_smtpd ===

The function calls work in isolation, without requiring any prior setup before using DNSBLs. The server configuration takes care of all DNSBL and RHSBL setup, including domains to query and responses to interpret as positive.

The important knowledge link between mod_dnsbl_lookup and its user, say mod_smtpd, is the chain name that defines the desired DNSBLs. Instead of hard coding a chain name, it makes much more sense to have a module such as mod_smtpd load during its configuration some chains to work with. So mod_smtpd might have configuration directives such as:

SmtpBlacklistChain blackchain
SmtpWhitelistChain whitechain

Now mod_smtpd knows which chain to query for blacklisting purposes, and which chain to query for whitelisting purposes. The admin may leave either chain undefined of course and can easily modify the configuration by substituting different chain names (as used by DnsblZone and RhsblZone).
The pseudo code within mod_smtpd might then be:

Attempt to load optional dnsbl_lookup functions
If functions are available
    If dnsbl_lookup_ip("whitechain", client) == DNSBL_POSITIVE
        return ALLOW_SERVICE    // even if blacklisted
    Else If dnsbl_lookup_ip("blackchain", client) == DNSBL_POSITIVE
        return DENY_SERVICE
return ALLOW_SERVICE    // default action

- Jem
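The pseudo code above can be sketched in plain C roughly as follows. This is only an illustration under stated assumptions: the lookup is passed in as a function pointer (NULL standing in for "optional functions not available"), the two-argument lookup signature follows the pseudo code rather than the full README signature, and check_client, fake_lookup, the enum values and the 192.0.2.x addresses are all made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, named after the ones in the README. */
enum { DNSBL_NEGATIVE = 0, DNSBL_POSITIVE = 1, DNSBL_FAILURE = 2 };

/* Hypothetical service decisions; mod_smtpd's real return codes may differ. */
enum { ALLOW_SERVICE = 0, DENY_SERVICE = 1 };

/* Simplified lookup callback: chain name plus client address as text. */
typedef int (*lookup_fn)(const char *chain, const char *client_ip);

/*
 * Decide whether to serve a client. A NULL lookup models the case where
 * mod_dnsbl_lookup is not loaded; a NULL chain models an undefined chain.
 */
static int check_client(lookup_fn lookup, const char *white_chain,
                        const char *black_chain, const char *client_ip)
{
    if (lookup == NULL)
        return ALLOW_SERVICE;
    if (white_chain != NULL &&
        lookup(white_chain, client_ip) == DNSBL_POSITIVE)
        return ALLOW_SERVICE;   /* whitelisted, even if blacklisted */
    if (black_chain != NULL &&
        lookup(black_chain, client_ip) == DNSBL_POSITIVE)
        return DENY_SERVICE;
    return ALLOW_SERVICE;       /* default action */
}

/* Stand-in lookup for demonstration only; no DNS queries are made. */
static int fake_lookup(const char *chain, const char *client_ip)
{
    int is_customer = strcmp(client_ip, "192.0.2.10") == 0;
    int is_spammer = strcmp(client_ip, "192.0.2.66") == 0;

    if (strcmp(chain, "whitechain") == 0)
        return is_customer ? DNSBL_POSITIVE : DNSBL_NEGATIVE;
    if (strcmp(chain, "blackchain") == 0)
        return (is_customer || is_spammer) ? DNSBL_POSITIVE : DNSBL_NEGATIVE;
    return DNSBL_FAILURE;       /* unknown chain */
}
```

With the stand-in lookup, 192.0.2.10 is on both chains and is still allowed because the whitelist is consulted first, while 192.0.2.66 is denied.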
Hash table growth
When I looked at the expand function used by apr_hash.c it looked to me like it keeps growing if you keep using 'set' with novel values. I was thinking of using apr_hash in order to cache DNSBL queries for my module. It would ensure rapid cache search but I am having trouble figuring out how I could remove existing entries. I really _want_ collisions to happen but I'm not sure if this is possible. Any tips on how I can overwrite existing entries in the hash table, rather than keep expanding the table entries? e.g. key ABC is already in the table, and it collides with XYZ which I now want to add. However, if I apr_hash_get(XYZ) it will tell me correctly that this key is not present; and apr_hash_set(XYZ) now expands the table right?
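To make the behaviour I am after concrete, here is a sketch of a fixed-size, direct-mapped cache in plain C where a colliding key simply overwrites the slot instead of growing the table. It does not use apr_hash at all; the names (dnsbl_cache, cache_put, cache_get), the slot count and the hash function are all assumptions made for the illustration.

```c
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 256   /* fixed size: the table never grows */
#define KEY_MAX 64        /* longer keys are truncated and will not match */

struct cache_entry {
    int used;
    char key[KEY_MAX];
    int result;
};

struct dnsbl_cache {
    struct cache_entry slot[CACHE_SLOTS];
};

/* Simple multiplicative string hash reduced to a slot index. */
static unsigned int slot_for(const char *key)
{
    unsigned int h = 5381;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h % CACHE_SLOTS;
}

static void cache_clear(struct dnsbl_cache *c)
{
    memset(c, 0, sizeof(*c));
}

/* Store a result; whatever was in the slot before is overwritten. */
static void cache_put(struct dnsbl_cache *c, const char *key, int result)
{
    struct cache_entry *e = &c->slot[slot_for(key)];
    e->used = 1;
    snprintf(e->key, sizeof(e->key), "%s", key);
    e->result = result;
}

/* Return 1 and fill *result on a hit, 0 on a miss. */
static int cache_get(const struct dnsbl_cache *c, const char *key, int *result)
{
    const struct cache_entry *e = &c->slot[slot_for(key)];
    if (!e->used || strcmp(e->key, key) != 0)
        return 0;
    *result = e->result;
    return 1;
}
```

With this particular hash the keys "aZ" and "b9" land in the same slot, so storing "b9" after "aZ" evicts "aZ" and a later lookup of "aZ" is a miss. Memory use stays constant no matter how many novel keys are stored.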
Re: New mod_smtpd release
> Well there's also another problem. RFC 2821 (SMTP) doesn't define a
> particular message format for SMTP (in wide use there the RFC 822 and
> MIME message formats). I don't think that mod_smtpd should assume a RFC
> 822 or MIME message format since its strictly a SMTP module, that's why

I agree with this

> I still think header parsing should be in another module. Of course
> this module is free to register itself as an mod_smtpd filter and do
> what it needs to do, but it shouldn't be part of the main mod_smtpd.

That seems wise. Any weird thing can come through over SMTP, it could look very much unlike an email after all. You're handling the protocol in your module and that means the SMTP protocol as I understand, not MIME or anything.
Re: mod_dnsbl_lookup 0.90
Following up on mod_dnsbl, a new version is nearing completion although I have encountered some obstacles that slowed me down. I have taken some of Colm's advice to make mod_dnsbl_lookup more flexible and self sufficient.

I'm attaching the documentation part of what I'm currently working on. If anyone sees any logic problems, please let me know! I have made a major effort to document this thing sufficiently that anyone stumbling upon it won't have to struggle with how the heck to use it. Hopefully I will post version 0.91 tomorrow or Monday.

- README follows

A DNSBL or RHSBL is just a form of efficient database that returns a simple code (expressed as an IP address) for a given lookup key. The lookup key is either an IPv4 or IPv6 address in the case of a DNSBL, or a host/sub/domain name in the case of a RHSBL. The return code from the database may be an IP address such as 127.0.0.2 or NXDOMAIN, indicating no match.

DNSBLs are often used in spam filtering, where the return code 127.0.0.x indicates that the lookup key (a relay's IP address) is blacklisted. However the meaning of the information returned by a database is in no way limited to this. Sometimes the DNSBL server intends positive matches to be whitelisted hosts; other times there are a variety of 127.0.0.x codes each meaning something different. For this reason we discourage the use of the term blacklist or RBL (real time blacklist) because this is just one use of DNSBLs and RHSBLs.

This mod_dnsbl_lookup aims to provide generic and flexible DNSBL and RHSBL use without limiting functionality. Each server has its own policy and return codes, so you must configure dnsbl_lookup_query appropriately as there is no intrinsic way to know if something is blacklisted, whitelisted, or somewhere in between. Define servers and codes that you consider "positive matches" under one or more chains. This allows you to make independent configurations for different uses. Note that only IPv4 is supported at the moment.
# This might be under a mod_smtpd virtual server config
# Enable module
DnsblLookups On
#
# Need to get host names for RHSBL lookups to work
# Note that terminating dot in server names prevents local domain search
HostNameLookups On
#
# The following define positive matches for the chain I call "spammers"
#
# Any non-failure result from sbl.spamhaus.org is a positive match
DnsblIPv4 spammers sbl.spamhaus.org. any
#
# The 127.0.0.2 result from cbl.abuseat.org is a positive match
DnsblIPv4 spammers cbl.abuseat.org. 127.0.0.2
#
# Only the specific codes 127.0.0.5,6,9 from dnsbl.sorbs.net are positive
# The module internally caches queries, only one actual DNS query is made
DnsblIPv4 spammers dnsbl.sorbs.net. 127.0.0.5
DnsblIPv4 spammers dnsbl.sorbs.net. 127.0.0.6
DnsblIPv4 spammers dnsbl.sorbs.net. 127.0.0.9
#
# The following define positive matches for the chain I call "whitelist"
#
# A zone designed for whitelisting, any mail from Canada is positive
DnsblIPv4 whitelist ca.countries.nerd.dk. 127.0.0.2
#
# A local zone we run, customers or partners of ours are positive
DnsblIPv4 whitelist customers.dnsbl any
#
# A chain for RHSBL lookups (distinct from DNSBL chains)
#
RhsblZone spammers rhsbl.ahbl.org. 127.0.0.2

With this configuration, a user could now do a DNSBL_ANYPOSTV_RETFIRST query on the "spammers" chain to see if a host is a spammer (returns DNSBL_POSITIVE when the first positive response is encountered).

The user might also want to do a DNSBL_ANYPOSTV_RETFIRST on the "whitelist" chain and allow through any host that returns DNSBL_POSITIVE, meaning it is whitelisted. If the whitelist override is more stringent, a DNSBL_ALLPOSTV_RETEVERY query might be done instead to require that every single entry in the "whitelist" chain returns a positive result.

A more lenient admin might instead do a DNSBL_ANYPOSTV_RETEVERY query on the "spammers" chain and do post processing after getting DNSBL_POSITIVE.
The table returned by the lookup (see below) contains detail on every positive match, so the admin may want to only block mail from the host if there are at least 2 positive zones. The disadvantage of this is the many extra queries.

The configuration (above) simplifies the client code down to querying a specific chain using a certain query mode. The functions used are:

dnsbl_lookup_ip(const char* chain, int querymode, apr_sockaddr_t* address,
    apr_pool_t* p, server_rec* s, apr_table_t** zonedata)

dnsbl_lookup_domain(const char* chain, int querymode, const char* domain,
    apr_pool_t* p, server_rec* s, apr_table_t** zonedata)

With return values:

DNSBL_POSITIVE - Positive match (zonedata has details, if requested)
DNSBL_NEGATIVE - Negative
DNSBL_FAILURE - Generic failure, e.g. DnsblLookups Off or invalid chain

For DNSBLs, you would use dnsbl_lookup_ip() and pass the IP address in the apr_sockaddr_t*. For
Re: mod_dnsbl_lookup 0.90
Sure, we could support them but if they are the only one (and without public documentation on how to use) then aren't we making guesses from a rare case? I haven't found any public discussion on IPv6 DNSBL conventions. For example, what is the standard for how to place the IPv6 string under the DNSBL zone? Are we still using decimal octets? Can you point me towards some examples?

On Sat, 13 Aug 2005, Colm MacCarthaigh wrote:

> On Sat, Aug 13, 2005 at 03:20:10PM -0700, Jem Berkes wrote:
> > I haven't found any examples of IPv6 RBLs.
>
> rbl-plus.hea.net. If you can give me a small fixed IP range, I can arrange access.
>
> --
> Colm MacCárthaigh
Re: mod_dnsbl_lookup 0.90
> Cool. I'd split dnsbl_zones into ipv4_dnsbl_zones and ipv6_dnsbl_zones
> and have the DnsblZones directive work like;
>
> DnsblIPv4Zones
> DnsblIPv6Zones
>
> or similar. IPv6 RBL's do exist, and are incompatible with IPv4 ones, so
> it's worth having the support early-on.

I haven't found any examples of IPv6 RBLs. Could you point me to one? What I'm finding on the web and usenet is that there is still no established standard for IPv6 blocklists. Unless I can find some reference, any implementation I make would be a guess so I'm leaving this unsupported right now.
Re: mod_dnsbl_lookup 0.90
Sorry for the slow replies, our phone landline + internet is dead and the telco [TSX: MBT] won't fix it for a week. Terrible for getting work done.

> Cool. I'd split dnsbl_zones into ipv4_dnsbl_zones and ipv6_dnsbl_zones
> and have the DnsblZones directive work like;
>
> DnsblIPv4Zones
> DnsblIPv6Zones

That's a good idea, I suspected IPv6 RBLs might exist :) I'll add the IPv6 support.

> dnsbl_lookup_query() takes an IP address argument as a string, but it
> would probably be a lot better to take it as an apr_sockaddr_t, since
> that's an IP version agnostic format, and is generally the way an Apache
> module would have the address available to it.

The problem this introduces is when looking up RHSBLs, which operate on host names or domain names instead of IP address. Would you recommend different functions for DNSBL (pass an IP) and RHSBL (pass a hostname or domain name)?

> Passing it around in binary format also helps you avoid using sscanf and
> the associated reentrancy problems on many platforms.

I did not know there were reentrancy problems with sscanf. strtok I know.

> The implementation is neat, but it could also do with efficiency being
> in mind, IME (I help run a very large RBL) rbl lookups tend to be a big
> source of latency during request/mail handling and it's worth making the
> effort to go a bit further :)

Yes, I am going to add some caching for recent queries. I thought at first that the resolver already does this but as far as I can tell, it does not do any caching.

> Although the dnsbl_lookup_query() function's output is comprehensive,
> perhaps more useful and efficient would be to supply a framework for
> allowing modules to check DNSBL's in a boolean manner. As-is the code
> scans every registered RBL, even if one flags an address as listed.
> That's super in-efficient for the majority case, and there's no
> application level caching, which tends to be a must for most
> implementations (even if it is only per-request, like Exim's or
> sendmail's implementations for example).

I agree. What I've started can probably be taken much further but I want to put the basic layers there first. I'll split up the code so it will be easier to modify later to not query all at once.

> Part of the lack of boolean-checking reveals another problem, how are
> other modules supposed to know what constitutes a positive for a
> particular RBL?

What constitutes a positive depends entirely on the particular RBL's policy. Some RBLs are whitelists themselves, so if an IP or domain matches then it should NOT be blocked.
mod_dnsbl_lookup 0.90
I've posted it here. I've been testing it with 2.1.6-alpha http://www.sysdesign.ca/archive/mod_dnsbl_lookup-0.90.tar.gz The README file should describe everything. This is a module providing an optional utility function intended for (but not limited to) mod_smtpd. The function allows the user to query DNS based blocklist databases, both DNSBL and RHSBL style, for arbitrary data. This can be used for all kinds of filtering and anti-spam use, including score systems such as spamassassin. In the case of SMTP the query can be the client's IP, client's host name, and the domain used in the sender's address. I know that Rian is currently redesigning much of mod_smtpd but for demonstration purposes I have included code that brings spam blacklisting functionality to mod_smtpd 0.1. If a blacklisted client connects, they will be denied service. It is more standard however to have these checks done after RCPT TO, at which time the envelope domain can also be checked against RHSBL. It would be great to hear some feedback as I am very new to writing these modules. I've tried to use the proper apr_ calls whenever available.
Exporting functions (hooks?)
For what I'm currently trying to accomplish, I'm not sure if hooks are the way to go. Also, without examples of this kind of situation I'm somewhat in the dark so hopefully someone can push me in the right direction. I'm writing mod_dnsbl_lookup to provide DNS based blocklist lookup facilities. The module's configuration will identify zones to query. So that other modules can easily make these queries, I'd like to make a generic function that allows any other module to do dnsbl lookups and receive the results in a table (say request_rec's notes). So I want to export a generic utility function, such as int dnsbl_lookup_do_query(table* results) What is the most proper way to do this? Is it appropriate to use AP_DECLARE_HOOK etc as described at http://httpd.apache.org/docs/2.0/developer/hooks.html
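To illustrate the kind of export being asked about, here is a stdlib-only sketch of a name-based function registry: the providing module registers a function under a string name and a consumer retrieves it at run time, getting NULL if the provider is not loaded. This is not the httpd API. APR ships apr_optional.h with the APR_DECLARE_OPTIONAL_FN, APR_REGISTER_OPTIONAL_FN and APR_RETRIEVE_OPTIONAL_FN macros for this purpose, and whether those fit here is for the list to confirm. All names below (register_export, retrieve_export, demo_query) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer used for storage; callers cast back on retrieval. */
typedef void (*generic_fn)(void);

/* Hypothetical shape of the exported lookup: 1 if listed, 0 otherwise. */
typedef int (*dnsbl_query_fn)(const char *key);

#define MAX_EXPORTS 16

static struct {
    const char *name;
    generic_fn fn;
} exports[MAX_EXPORTS];
static size_t export_count;

/* Called by the providing module; returns 0 on success, -1 if full. */
static int register_export(const char *name, generic_fn fn)
{
    if (export_count == MAX_EXPORTS)
        return -1;
    exports[export_count].name = name;
    exports[export_count].fn = fn;
    export_count++;
    return 0;
}

/* Called by a consumer; NULL means the provider is not available. */
static generic_fn retrieve_export(const char *name)
{
    for (size_t i = 0; i < export_count; i++)
        if (strcmp(exports[i].name, name) == 0)
            return exports[i].fn;
    return NULL;
}

/* Stand-in for the real lookup, for demonstration only. */
static int demo_query(const char *key)
{
    return strcmp(key, "listed.example") == 0;
}
```

The point of the indirection is that the consumer never links against the provider directly, so the consumer keeps working (with the feature disabled) when the provider is absent.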
Re: SSI for gzipped files
> I have an idea for someone to implement and give me credit for. I
> recently needed to have my SSI work from a gz file. The server-intensive
> way to make

Interesting, I've wanted to accomplish the same thing but couldn't figure out a good way to do it. Maybe there are a lot of people who need this?
Re: Initial mod_smtpd code.
> Overall blacklists aren't that effective and cause a lot of false
> positives. They may make sense in the case of something like
> SpamAssassin which uses a blacklist in conjunction with other false
> positives, but by themselves they really aren't a responsible way of
> dealing with the spam problem. I think it's better to discourage "worst
> practices" than to succumb to plugin mania.

Blocklists aren't fundamentally broken, they are a tool which can be used properly or misused (just like many other tools). Many admins choose to maintain their own DNSBLs for one reason or another. It may be a way to control relay access based on their own subscriber IP addresses. At my site we keep a record of IPs that have persistently abused our site over the past few days.

i.e. DNSBL != (SPEWS or MAPS or whatever)
Re: Initial mod_smtpd code.
> But is anyone dealing with outgoing SMTP via a proxy_smtp in the
> mod_proxy framework? I think you were discussing that a short while
> ago, weren't you? I think that might be higher priority.

I hesitated on that because I did not understand at all how mod_proxy fits into this. i.e. I don't see how the proxy mechanism helps in relaying out mail to other SMTP servers.

Here's an idea on how we can start on the outbound SMTP side of things:

I can start work on a "mod_smtp_relay" which takes an email and attempts to relay it via the appropriate MX relay. This involves some DNS queries for MX records, and making new TCP connections to another SMTP server.

I recommend that mod_smtp_relay does not itself do any spooling or queueing, to isolate these tasks. i.e. some other delivery/scheduler will handle spooling and retries etc, and occasionally pass an email to mod_smtp_relay

So given an input message, mod_smtp_relay would make an immediate relay attempt and then return success (sent) or an error describing where along the lines things went wrong -- it could be DNS, TCP connect failure, or SMTP error dictating permanent failure or temporary failure, needing retry.
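As a rough sketch of what such a return value could look like, the C fragment below pairs the stage that failed (DNS, connect, SMTP) with a sent/temporary/permanent status, and classifies a final SMTP reply code by its first digit (2xx success, 5xx permanent failure). Treating every other code as temporary, so that the scheduler retries, is my own assumption, as are all the type and function names.

```c
/* Where along the line the relay attempt ended. */
enum relay_stage { STAGE_DNS, STAGE_CONNECT, STAGE_SMTP };

/* What the scheduler should do with the message afterwards. */
enum relay_status { RELAY_SENT, RELAY_TEMPFAIL, RELAY_PERMFAIL };

struct relay_result {
    enum relay_status status;
    enum relay_stage stage;
    int smtp_code;              /* 0 when no SMTP reply was received */
};

/* Map a final SMTP reply code onto a relay status. */
static enum relay_status classify_smtp_reply(int code)
{
    if (code >= 200 && code < 300)
        return RELAY_SENT;
    if (code >= 500 && code < 600)
        return RELAY_PERMFAIL;
    return RELAY_TEMPFAIL;      /* 4xx and anything unexpected: retry later */
}

/* Build the result for an attempt that got as far as an SMTP reply. */
static struct relay_result result_from_reply(int code)
{
    struct relay_result r;
    r.status = classify_smtp_reply(code);
    r.stage = STAGE_SMTP;
    r.smtp_code = code;
    return r;
}

/* Build the result for a failure before any SMTP reply was seen. */
static struct relay_result result_from_failure(enum relay_stage stage,
                                               int permanent)
{
    struct relay_result r;
    r.status = permanent ? RELAY_PERMFAIL : RELAY_TEMPFAIL;
    r.stage = stage;
    r.smtp_code = 0;
    return r;
}
```

A separate spooler could then requeue on RELAY_TEMPFAIL and bounce on RELAY_PERMFAIL without knowing anything about DNS or sockets.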
Re: Initial mod_smtpd code.
> Hmm. That sounds like a good idea, maybe there already is a hook
> defined that could deal with this, I'll look into it.

I could also start work on a mod_smtpd_dnsbl if the mentors feel that is worthwhile? This would look up a connecting IP address against a blacklist and return a descriptive string to mod_smtpd if the client should be rejected with an error:

"550 5.7.1 Email rejected because 127.0.0.2 is listed by sbl-xbl.spamhaus.org"

I'd also like to include support for RHSBL, a newer type of listing by domain names from the envelope sender address. That's used by a growing number of projects.
Re: DNSBL filtering for Apache
> Apache -- the HTTP side too -- would benefit from DNSBL support. Or does
> this already do this? For example, both the CBL and AHBL projects list
> IP addresses of hosts engaging in activities such as proxy hijacking and
> spam relaying. This means it would be useful for webmasters to be able
> to make use of the published DNSBL to deny access to http requests.

Gosh, it already exists thanks to Blars Blarson
http://www.blars.org/mod_access_rbl.html

I wonder if the existing module can somehow be used for mod_smtpd as well? I'm still not familiar with enough 2.x style modules to know if that would work somehow.
DNSBL filtering for Apache
While I was thinking about Nick's suggestion for mod_rbl (blacklist lookups with mod_smtpd) I happened upon this idea, which is somewhat unrelated to the smtp project.

DNSBLs, the dominant form of real time blacklisting, are not specific to SMTP because this is just a way to publish lists of IP addresses. RHSBLs, which look up the address in an SMTP envelope, are specific to SMTP however.

Apache -- the HTTP side too -- would benefit from DNSBL support. Or does it already do this? For example, both the CBL and AHBL projects list IP addresses of hosts engaging in activities such as proxy hijacking and spam relaying. This means it would be useful for webmasters to be able to make use of the published DNSBL to deny access to http requests.

Because DNSBLs are an efficient way to publish lists, webmasters might start using a DNSBL lookup feature in Apache to limit abuse of, say, message forums, cgi scripts, proxy gateways. Currently, this has to be done by importing a complete list of IP addresses (often tens of megabytes) into a firewall script or Apache configuration.
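For anyone unfamiliar with how an IP address becomes a DNS query, here is a small C sketch of the usual IPv4 convention: reverse the four octets and append the zone, then look up an A record for that name. The function name dnsbl_query_name and the error handling are assumptions for the example, and no DNS query is actually made.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the DNS name to query for an IPv4 address under a DNSBL zone,
 * e.g. 127.0.0.2 under sbl.spamhaus.org. becomes
 * 2.0.0.127.sbl.spamhaus.org.
 * Returns 0 on success, -1 on bad input or if the buffer is too small.
 */
static int dnsbl_query_name(const char *ipv4, const char *zone,
                            char *out, size_t outlen)
{
    unsigned int a, b, c, d;
    char extra;
    int n;

    if (strlen(zone) == 0)
        return -1;
    /* The trailing %c catches garbage after the fourth octet. */
    if (sscanf(ipv4, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;

    n = snprintf(out, outlen, "%u.%u.%u.%u.%s", d, c, b, a, zone);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

A resolver answer such as 127.0.0.2 for the generated name would mean the address is listed, and NXDOMAIN would mean it is not. The parser here is deliberately simple and is not a full address validator.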
Re: Initial mod_smtpd code.
> This is my first attempt at writing an experimental version of mod_smtpd.
> I don't yet have svn access so this code can be downloaded from
> http://rian.merseine.nu/mod_smtpd-0.1.tar.gz.

Nifty! I had some compilation problems involving regex, so in the attached patch I use ap_regex.h and change some defines. Hope this doesn't break anything.

The other bug I partially fixed was, strstr in smtp_protocol.c only does exact matches so uppercase commands like MAIL FROM would fail. I added support for the upper case, but this needs to be improved still because mixed case doesn't work. Is there an APR function like stristr?

The overall structure and the approach you took is very nice, easy to understand. I would recommend adding a hook immediately upon the client connection, because an external module (maybe for DNSBLs, or some rate limiting control) might not even want us to return a greeting at all -- i.e. close with "554 Service unavailable" right away. But I like what you have, would be happy to keep working around this design.

*** smtp_protocol.c.orig 2005-07-18 23:07:26.0 -0500
--- smtp_protocol.c 2005-07-18 23:39:18.0 -0500
***************
*** 26,31 ****
--- 26,32 ----
  #include "http_log.h"
  #include "ap_config.h"
  #include "ap_mmn.h"
+ #include "ap_regex.h"
  #include "apr_lib.h"
  #include "apr_buckets.h"
  #include "scoreboard.h"
***************
*** 151,156 ****
--- 152,160 ----
  smtpd_request_rec *sr = smtpd_get_request_rec(r);
  char *loc = strstr(buffer, "from:");
  int retval;
+
+ if (loc == NULL)
+ loc = strstr(buffer, "FROM:");

  if (loc == NULL) {
***************
*** 177,182 ****
--- 181,189 ----
  smtpd_request_rec *sr = smtpd_get_request_rec(r);
  char *loc = strstr(buffer, "to:");
  int retval = 0;
+
+ if (loc == NULL)
+ loc = strstr(buffer, "TO:");

  if (loc == NULL) {
  ap_rprintf(r, "%d %s\r\n", 501, "Syntax: RCPT TO:");
***************
*** 277,283 ****
  }

  HANDLER_DECLARE(vrfy) {
! regex_t *compiled;
  int error;
  int retval = 0;

--- 284,290 ----
  }

  HANDLER_DECLARE(vrfy) {
! ap_regex_t *compiled;
  int error;
  int retval = 0;

***************
*** 286,292 ****
  "[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?"
  "(\\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?)*$";

! compiled = ap_pregcomp(r->pool, rexp, REG_EXTENDED | REG_NOSUB);
  error = ap_regexec(compiled, buffer, 0, NULL, 0);
  ap_pregfree(r->pool, compiled);

--- 293,299 ----
  "[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?"
  "(\\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?)*$";

! compiled = ap_pregcomp(r->pool, rexp, AP_REG_EXTENDED | AP_REG_NOSUB);
  error = ap_regexec(compiled, buffer, 0, NULL, 0);
  ap_pregfree(r->pool, compiled);
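On the mixed-case problem, a case-insensitive substring search is short enough to write by hand if nothing suitable turns up in APR or httpd. The sketch below uses only the standard C library; find_ci is a hypothetical name and is not part of the patch above.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Find the first occurrence of needle in haystack, ignoring ASCII case.
 * Returns a pointer into haystack, or NULL if there is no match.
 */
static const char *find_ci(const char *haystack, const char *needle)
{
    if (*needle == '\0')
        return haystack;

    for (; *haystack != '\0'; haystack++) {
        size_t i = 0;
        /* Advance while both strings have characters that match. */
        while (needle[i] != '\0' && haystack[i] != '\0' &&
               tolower((unsigned char)haystack[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (needle[i] == '\0')
            return haystack;    /* matched the whole needle */
    }
    return NULL;
}
```

Used in place of the paired strstr calls, a single find_ci(buffer, "from:") would accept "from:", "FROM:" and "FrOm:" alike.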
Re: Initial mod_smtpd code.
> Jem/Paul/Nick: I'm especially interested in what you think about the
> design I've laid out in this implementation.

I'll try this out today and send my feedback. With respect to hooking every command, the reason I suggested that is to offer some useful facilities to those writing filter modules. It may work with the way you've laid it out too, I'll check for that angle of it.
Re: mod_smtpd design - protocol
> Have you considered using libapreq2 for parsing
> the mime headers in there? The header parser
> should really convenient for that, you could
> even introduce a post-header-parser hook that
> runs when the parser finishes.

My own suggestion is that we don't touch or try to interpret MIME. Parsing the message headers into a table is straightforward but once you get into recognizing MIME you're moving out of the protocol realm and into the message format realm - and you start having to worry about messages within messages, boundaries, corrupt structures, and other things that I think are not mod_smtpd's problem.

I don't know if the other smtp project people share my opinion here.
mod_smtpd design - protocol
I want to focus a bit on mod_smtpd design, in particular the protocol module (which accepts connections and does the E/SMTP talking). I've seen various ideas thrown around on what exactly the module should do. It would be nice if we could come up with at least the high level design specs for this, just so we're all on the same page about what the module will do and what facilities it will offer to external module users.

Talking with bfrance on IRC gave me a better sense of what other developers are hoping the smtp module can provide. The competing technology seems to be Sendmail's milter interface which allows developers to hook their custom filters into various stages of SMTP transactions. Anyone using mod_smtpd for filtering purposes will want to hook into various stages of the SMTP transaction - anywhere between start and end, or specific middle commands.

So let me throw this out there as a starting point because I don't think this has been documented yet?

The mod_smtpd protocol module accepts client connections and speaks E/SMTP, with default processing for all commands. e.g. it has a default greeting upon connection, a default response to EHLO/HELO, default accepts any envelope sender in MAIL FROM, default rejects any recipient in RCPT TO to prevent open relay configuration, etc.

However the module should provide several hooks to allow another module to use smtp. Off the top of my head, we need at least these hooks:

- upon connection from some client
  User might introduce delay, lookup IP for RBL, customize greeting
- upon receiving HELO/EHLO from client
- upon receiving MAIL FROM
- upon receiving RCPT TO, etc
- upon receiving other command like VRFY, RSET, NOOP
- upon receiving invalid command

I think this granularity is required. But I'm not sure about how the DATA hook would work?

Among the two people who already have some code for smtp, are you coding something along these lines?
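To make the "default processing plus hooks" idea concrete, here is a plain C sketch for a single stage, RCPT TO. Registered hooks run in order and may accept, deny or decline; if every hook declines, the default applies and the recipient is rejected so that an unconfigured server is never an open relay. None of these names come from mod_smtpd (hook_list, run_rcpt and accept_local are invented for the example), and real Apache hooks would be declared through the httpd hook macros rather than a hand-rolled array.

```c
#include <stddef.h>
#include <string.h>

/* What a hook can say about the current command. */
enum hook_result { HOOK_DECLINED, HOOK_OK, HOOK_DENY };

typedef enum hook_result (*rcpt_hook)(const char *recipient);

#define MAX_HOOKS 8

struct hook_list {
    rcpt_hook fn[MAX_HOOKS];
    size_t count;
};

/* Register a hook; returns 0 on success, -1 if the list is full. */
static int hook_add(struct hook_list *l, rcpt_hook fn)
{
    if (l->count == MAX_HOOKS)
        return -1;
    l->fn[l->count++] = fn;
    return 0;
}

/* Run the RCPT TO stage and return the SMTP reply code to send. */
static int run_rcpt(const struct hook_list *l, const char *recipient)
{
    for (size_t i = 0; i < l->count; i++) {
        enum hook_result r = l->fn[i](recipient);
        if (r == HOOK_OK)
            return 250;         /* a module accepted the recipient */
        if (r == HOOK_DENY)
            return 550;         /* a module rejected the recipient */
    }
    return 550;                 /* default: reject, prevents open relay */
}

/* Example hook: accept recipients at one hypothetical local domain. */
static enum hook_result accept_local(const char *recipient)
{
    const char *at = strrchr(recipient, '@');
    if (at != NULL && strcmp(at, "@example.org") == 0)
        return HOOK_OK;
    return HOOK_DECLINED;
}
```

The same pattern would repeat for each stage in the list above, with only the default reply differing per stage.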
Re: mod_smtpd design.
> I'd like to see more of you on IRC:-)

OK, I'm regularly logging onto #apache-modules now.

> SMTP is two tasks: accept incoming connections (a protocol module - c.f. the ftp modules), and make outgoing connections to another server. The latter would be a proxy_smtp module in the mod_proxy framework - c.f. proxy_http and proxy_ftp. These are clearly

That makes more sense than the route I was going.

> Have you done any research on available code for handling the protocol?

I will do more along those lines. The only SMTP package I am familiar with is Postfix, but I think we encountered a licensing issue in earlier discussion. If the protocol handling has to be redone from scratch, I am willing to work on that.

> That's basically a similar task to CGI, and a first pass at that would be to try running procmail under CGI (incoming HTTP POST or PUT requests will do fine). I imagine that would be a pretty trivial job with perl or python.

Attached is a quick demo that allows mail delivery via CGI (HTTP POST).

cat raw-email-message | curl -s --data-binary @- http://.../cgi-procmail

It works on my Linux and FreeBSD systems. It's hard-coded to deliver to the current user only (web server user), but procmail runs suid root so you could deliver to other users too.

> Now look at what you needed to do. That's a first-pass spec for this task. It should be a fairly straightforward mod to mod_cgi(d). I'm not sure whether that's the best approach, but hacking it up will surely throw some light on the matter.

I don't quite understand this... are you saying that instead of an external CGI program (as attached), the pipe-to-procmail functionality should be a server module?

/* cgi-procmail 0.90
   Copyright (C) 2005 Jem E. Berkes

   Accept an email from an HTTP POST request, for local delivery.
   The cgi program runs with web server user's privileges so mail
   can only be delivered to the web server user.

   Warning, uses popen()

   To send the web server an email, use:
   cat raw-email-message | curl -s --data-binary @- http://address
*/

#include <stdio.h>   /* popen, pclose, fgets, fputs, printf */
#include <stdlib.h>  /* EXIT_SUCCESS, EXIT_FAILURE */

#define PROCMAIL_CMD "/usr/bin/procmail -f -"
#define MAXIMUM_LINE 1000 /* As per RFC, max line including \n and \0 */

int main(void)
{
    FILE *dest = popen(PROCMAIL_CMD, "w");

    printf("Content-Type: text/plain\n\n");
    if (dest) {
        char dataline[MAXIMUM_LINE];

        /* pipe HTTP POST data to procmail */
        while (fgets(dataline, sizeof(dataline), stdin))
            fputs(dataline, dest);

        /* a popen() stream must be closed with pclose(), not fclose();
           pclose() also reports procmail's exit status */
        if (pclose(dest) == 0) {
            puts("Mail sent to procmail");
            return EXIT_SUCCESS;
        } else {
            puts("Error closing pipe");
            fprintf(stderr, "Error closing pipe to " PROCMAIL_CMD "\n");
            return EXIT_FAILURE;
        }
    } else {
        puts("Pipe failure");
        fprintf(stderr, "Unable to open pipe to " PROCMAIL_CMD "\n");
        return EXIT_FAILURE;
    }
}
Re: mod_smtpd design.
Where are we in the mod_smtpd design/task allocation? Since there are several people involved we're really going to have to divide up the tasks and at least decide on how our various modules will communicate. I'd like to start coding, if one of the mentors could push me in the direction they want please :)

As I understand it, we want to produce one module that deals with the SMTP protocol, and a separate module to handle mail delivery (probably via procmail is easiest for now). Does this mean we should also produce separate modules, one for incoming and one for outgoing mail transactions? In other words, several modules:

- SMTP protocol library
- Incoming SMTP mail (hook connection/server)
- Outgoing SMTP mail (spooling, DNS MX, sendmail-ish)
- Delivery of SMTP message (to local, e.g. pipe to procmail)

Or is that getting too fragmented?
mod_rdate, for what it's worth
I wrote a little mod_rdate if anyone is interested:

http://www.sysdesign.ca/archive/apache/module-experiment/mod_rdate.c

rdate allows a host to synchronize its time to a server, with approximately 1-second accuracy. It is far inferior to NTP but so much simpler and still my preferred method for dealing with drifting clocks. It gets the job done.

In /etc/services, port 37 is the "time" service. Upon connecting, the client simply reads back a 32-bit value representing seconds since 1900.

My mod_rdate.c is NOT well tested, and may not be very portable. But hey, there you go, arbitrary protocol :)
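For anyone who wants to check a client against this, here is a small sketch of decoding the 32-bit value and converting it to Unix time. It is not taken from mod_rdate.c; the function names are made up, and it assumes the value arrives in network byte order (most significant byte first), which is how the time service is conventionally implemented.

```c
#include <stdint.h>

/* Seconds between 1900-01-01 and 1970-01-01 (the Unix epoch). */
#define RFC868_UNIX_OFFSET 2208988800UL

/* Decode the 4 bytes read from port 37 (assumed big-endian). */
uint32_t rdate_decode(const unsigned char buf[4])
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Convert seconds-since-1900 to seconds-since-1970. */
int64_t rdate_to_unix(uint32_t secs_since_1900)
{
    return (int64_t)secs_since_1900 - (int64_t)RFC868_UNIX_OFFSET;
}
```

Since the counter is only 32 bits wide and starts in 1900, it wraps around in 2036, which is one more reason to treat rdate as a convenience rather than a replacement for NTP.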
Re: Immediate output in process_connection
> What is your Listen line for this protocol's port?

I just had Listen 1234

> Try something like:
>
> Listen 0.0.0.0:1234 myproto
> AcceptFilter myproto none

Well that did indeed solve my problem :) Thanks.
Re: mod_smtpd design.
> > I don't see why it matters if there are redundant members in
> > request_rec. However, for purity, it might be cool to divide
> > request_rec up into common elements and protocol-specific stuff in a
> > union.
>
> That's not really a problem, though of course it's hacky. It's the
> logical consequence of declaring HTTPD to be multi-protocol while making
> so much of it revolve around the request_rec.

Nick, in one of the background docs you sent me the 'connection filters' were described as operating outside the scope of HTTP or any request_rec. I know this is a living, changing work in progress but it seems confusing, at least to me, to shoehorn arbitrary protocols into HTTP-ish "requests".

I'm having difficulty envisioning how an SMTP communication can be broken up into requests, i.e. something like HTTP requests. For example:

01 HELO joe
02 MAIL FROM:
03 RCPT TO:
04 RCPT TO:
05 DATA
06 ...
07 MAIL FROM:
08 RCPT TO:
09 DATA
10 ...
11 QUIT

Now how do you split this up into requests? Does 05 represent a single request, or does that DATA generate two different requests - 03 and 04 (one for each recipient)? Or is the whole thing, until 11, a single request? Perhaps the MAIL FROMs start new requests, i.e. 02 and 07 (two requests).

It just seems ambiguous and difficult to justify in any way. Mind you, I still might not fully understand Apache's request paradigm.
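Just to pin down one of the readings above (each MAIL FROM starts a new "request", which matches how the SMTP RFC describes a mail transaction), here is a tiny sketch that counts transactions in a list of commands. It is only an illustration of that one interpretation, not a claim about how httpd should map SMTP onto request_rec; the function name is made up, and the match is case-sensitive for brevity even though real SMTP commands are not.

```c
#include <stddef.h>
#include <string.h>

/* Count mail transactions in a command list, treating every
 * "MAIL FROM:" as the start of a new transaction. */
size_t smtp_count_transactions(const char *const cmds[], size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (strncmp(cmds[i], "MAIL FROM:", 10) == 0)
            count++;
    }
    return count;
}
```

Applied to the eleven-line session above, this reading gives two units of work (starting at 02 and 07), with HELO and QUIT belonging to the connection rather than to either of them.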
Immediate output in process_connection
I've been playing with basic modules that implement their own protocol (process_connection hooked) along the lines of the mod_echo example. But one thing I can't seem to do is send output immediately back to the client, even though I am flushing the output filters.

With the following code, if I telnet localhost I don't see the "hello" until I first send some line, or close the connection. Contrast to say a POP3 server where you see the +greeting as soon as you connect. I did look at the mod_pop3 code but can't understand what it does differently than this.

Do I have to explicitly do something with the input_filters here? Currently I don't touch the input side, but something is introducing a delay.

static int rdate_process_connection(conn_rec *c)
{
    apr_bucket_brigade *bb;
    rdate_cfg *cfg = ap_get_module_config(c->base_server->module_config,
                                          &rdate_module);

    if (!cfg->rdate_enabled)
        return DECLINED;

    bb = apr_brigade_create(c->pool, c->bucket_alloc);
    ap_fprintf(c->output_filters, bb, "hello\n");
    ap_fflush(c->output_filters, bb);
    return OK;
}
Re: Questions from a newbie
> ... The other solution (letting mod_smtpd read the whole
> thing into a buffer and then passing the buffer) seems way too memory
> hungry for me.

I was trying to figure that out too. It seems like a bad idea to read the whole thing into memory because the buffer could easily require several tens of megabytes.
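As a rough illustration of the alternative, here is a sketch in plain C that hands the message to a consumer in fixed-size chunks, so memory use stays bounded no matter how large the mail is. Inside httpd this role would presumably be played by bucket brigades and filters rather than stdio; the names here (chunk_sink_fn, stream_message, count_sink) and the 4096-byte buffer size are assumptions made up for the example.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical consumer callback: receives one chunk of the message.
 * Returns 0 to continue, non-zero to abort. */
typedef int (*chunk_sink_fn)(const char *data, size_t len, void *ctx);

/* Copy a message from `in` to `sink` through a fixed buffer.
 * Returns the total number of bytes passed on, or -1 if the sink aborted. */
long stream_message(FILE *in, chunk_sink_fn sink, void *ctx)
{
    char buf[4096];
    long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (sink(buf, n, ctx) != 0)
            return -1;
        total += (long)n;
    }
    return total;
}

/* Example sink: only counts the bytes it is given. */
int count_sink(const char *data, size_t len, void *ctx)
{
    (void)data;
    *(size_t *)ctx += len;
    return 0;
}
```

A delivery module would supply its own sink (for example one that writes each chunk to a pipe), and the protocol side never needs to hold more than one buffer's worth of the message.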
Questions from a newbie
I'm just getting into module development for the first time (thanks to impetus provided by Google's Summer of Code)... I've got a test environment with 2.1.6-alpha and have succeeded in writing minimal modules and getting them working on a live server. But I have a few nagging questions that I hope someone can help me with.

1) ap_hook_* functions... obviously, there are lots of these that are used to register hooks. Where can I find a reference list of the available functions? I'm looking through code of modules, and it leaves me wondering, how would I know what hook possibilities exist?

2) I'm having trouble navigating and finding facilities I need. Let's say I was looking for something that turns out to be satisfied by ap_fwrite(), in the API for filters. Where should I start, to lead myself to finding that function? I thought apr.apache.org but Google shows no mention of ap_fwrite within that site.

3) I can understand how a module can tie into Apache's normal processing to intercept connections, requests, etc. But what is the structure and mechanism by which one module can make use of another module? I can use C functions but there must be some kind of standardized interface for inter-modular calls?

For example, for SMTP support we are contemplating the protocol unit (mod_smtpd) passing on mail to a module that specifically delivers mail. How do those two entities communicate with each other? A mod_smtp_deliver would get a potentially large chunk of data (const char* ?) from mod_smtpd and deliver the mail via procmail, etc. This is a loose binding by the way since all received mails do not necessarily have to be delivered.

- Jem Berkes
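To show the kind of loose binding I have in mind for question 3, here is a sketch in plain C where one module exports a function under a name and another looks it up at run time, carrying on without it if nothing was registered. As far as I can tell APR's optional functions (apr_optional.h) serve this purpose in httpd, but the code below does not use them; export_register, export_lookup, the "smtp_deliver" name and demo_deliver are all invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer type used to store any exported function. */
typedef void (*generic_fn)(void);

#define MAX_EXPORTS 16
static struct { const char *name; generic_fn fn; } exports[MAX_EXPORTS];
static size_t export_count;

/* Provider side: publish a function under a name. Returns 0 on success. */
int export_register(const char *name, generic_fn fn)
{
    if (name == NULL || fn == NULL || export_count >= MAX_EXPORTS)
        return -1;
    exports[export_count].name = name;
    exports[export_count].fn = fn;
    export_count++;
    return 0;
}

/* Consumer side: look a function up by name; NULL if nobody provides it. */
generic_fn export_lookup(const char *name)
{
    for (size_t i = 0; i < export_count; i++) {
        if (strcmp(exports[i].name, name) == 0)
            return exports[i].fn;
    }
    return NULL;
}

/* Signature both sides agree on for the hypothetical delivery call. */
typedef int (*deliver_fn)(const char *msg, size_t len);

/* Stand-in for a delivery module: "delivers" any non-empty message. */
int demo_deliver(const char *msg, size_t len)
{
    return (msg != NULL && len > 0) ? 0 : -1;
}
```

The delivery module would call export_register("smtp_deliver", (generic_fn)demo_deliver) when it loads, and the protocol module would cast the result of export_lookup("smtp_deliver") back to deliver_fn and simply skip delivery when the lookup returns NULL.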
Re: mod_smtpd project planning
> It's a SMTP protocol frontend for httpd. It will have the power to be a
> sendmail replacer or to supply content via SMTP because it will delegate
> most of the actual handling to other modules. All the details haven't
> been worked out yet, but it will make use of the Apache 2.x filters and
> handlers. For Instance:
>
> [core] -> [mod_smtpd] -> [mod_insert_special_use_mail_handler]
> a setup like that could be used, but let's say you want to filter out
> junk mail. Use an input filter!
>
> [core] -> [mod_smtpd] -> [mod_junk_mail_filter] -> [mod_other_thing]

Just to give an idea of the added flexibility of SMTP support within an httpd module: if we were just using pipes to/from sendmail then you would have very limited information, basically just the mail data after receipt, and no ability to talk back to the peer during the mail transaction. This is the problem encountered by many spam filters, as to be most effective they really need to be _involved_ in the SMTP transaction and not just stage 2, after receipt happens. Think greylisting as an example.

Also, since mod_smtpd will receive emails from MTAs itself, and prepare its own data structures for passing on the data to other modules, it can pass along useful information that is difficult or impossible to determine from just message headers/body. For example, IP address of the incoming or outgoing relay, helo/intro identification of peer, protocol violations or warnings, connection data rate perhaps... brainstorming.
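Since I mentioned greylisting, here is a minimal sketch of the decision such a filter would make while the transaction is still open, which is exactly what a post-receipt pipe cannot do. The function name, the use of 0 to mean "never seen before", the 300-second delay and the choice of 451 as the temporary-failure code are all my own assumptions for the example; a real filter would also need to store and look up the (IP, sender, recipient) triplet.

```c
#include <time.h>

/* Assumed minimum wait before a retry is accepted; not from any spec. */
#define GREYLIST_DELAY_SECS 300

/* Decide the SMTP reply for a (client IP, sender, recipient) triplet.
 * first_seen is when the triplet was first observed, or 0 if it is new.
 * Returns 451 (temporary failure, try again later) or 250 (accepted). */
int greylist_reply(time_t first_seen, time_t now)
{
    if (first_seen == (time_t)0)
        return 451;                       /* new triplet: defer */
    if (difftime(now, first_seen) < GREYLIST_DELAY_SECS)
        return 451;                       /* retried too soon: defer */
    return 250;                           /* waited long enough: accept */
}
```

A legitimate MTA will queue the message and retry after the delay, while much spam software never retries, and the reply has to be given in response to RCPT TO (or DATA) for this to work at all.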
Re: mod_smtpd project planning
> As one of the students I can definitely appreciate that!
>
> To everyone managing SoC: about how long until our svn accounts are
> activated? I know there are a lot of details being worked out still, but I
> still feel a little in the dark.

Hi all, I'm another student working on mod_smtpd. Been running httpd 2.x since it appeared, but am new to development.

- Jem Berkes