Re: [PATCH] mod_smtpd_queue_smtp

2005-09-20 Thread Jem Berkes

But as far as I can tell, this code is all about SMTP forwarding (not
even relaying per-se). Confuses me anyway :)



I.e. smarthosting. Which might be a better name for the whole thing.


For anything involving filtering + forwarding on SMTP, might be worth 
including a note to admins to remember to use consistent filtering on all 
mail exchangers (MX) for the domain in question. Otherwise spammers often 
try to deliver to one of the lower priority MX with the hope of bypassing 
spam filtering.


I can't see how any of that is "our job" though, just thought I'd mention 
it.


Re: to the users of mod_smtpd

2005-09-20 Thread Jem Berkes

To Plugin Writers:


If examining the source on svn, also remember to check out my completed 
RBL / DNSBL modules under modules/access. In particular, mod_dnsbl_lookup 
provides functionality that could easily be used to implement a 
spamassassin-style scoring filter. That particular module can also be used 
without mod_smtpd.




Re: mod_smtpd module review

2005-09-20 Thread Jem Berkes
I just saw this old message now... I have been moving and my new ISP still 
hasn't connected my service after 3 weeks.



This past week I have finished up a few modules that are ready for review.

http://www.brianfrance.com/software/apache/mod_smtpd_load.tar.gz


This is a nifty addition, thanks!

I would also like to set the error code, because looking over RFC 821 I
think it should return 452, or maybe that needs to be a default for
smtpd_run_connect soft errors (552 for hard errors). Should we allow the
module to set the error code?


Error codes can be sent with SMTPD_DONE. Otherwise error codes won't change 
in the next fifty years for things like SMTPD_DENY and friends. I'll look 
into changing those specific codes. Most of the error codes I got from 
Postfix and Qpsmtpd but of course they aren't perfect.


I think there should be a way for other modules to specify a preferred 
return code. Access controls are a good example, whether mod_smtpd_rbl or 
something else that wants to ensure a specific code is returned.


Re: mod_smtpd filter support

2005-08-29 Thread Jem Berkes
> This mostly means that mod_smtpd is very close to completion. I expect
> some bug-fixes and I plan on adding a one-recipient/one-transaction
> feature and a message body reading abstraction, but other than that it
> seems to be in its final working state. Features include:

Thanks for the update Rian, glad I just caught your message now as I am in 
the process of moving (will be on highways for next few days) and due to 
moving/univ will probably not have computer access again until Sept. 8 or 
so.

Nick and/or other mentors, I sent an email to this effect, asking whether
there is anything else I am expected to do before September, but I did not
receive any replies.




Re: mod_smtpd overhaul

2005-08-23 Thread Jem Berkes

+1, Jem since you have checked in the first plugin for mod_smtpd would
you mind creating a directory structure similar to this if it seems fine
to you?


I've built the directory structure to hold multiple modules. I also 
updated my modules so they build without duplicating .h files, and fixed 
the various copyright and style issues Garrett mentioned earlier. Also 
updated mod_smtpd_rbl so that it uses the APR array method for msgs.


The changes have been committed, please give it a look-over.

- Jem


Re: mod_smtpd overhaul

2005-08-23 Thread Jem Berkes
> > I don't have a problem with it. Do I need a verification of my earlier
> > commit before I commit a new directory structure?
> 
> What do you mean by "verification"?

In your earlier email you said "Since the commit mail hasn't come through 
yet (needs to be approved I imagine)"




Re: mod_smtpd overhaul

2005-08-23 Thread Jem Berkes
> +1, Jem since you have checked in the first plugin for mod_smtpd would
> you mind creating a directory structure similar to this if it seems fine
> to you?

I don't have a problem with it. Do I need a verification of my earlier 
commit before I commit a new directory structure?




Committed mod_smtpd/trunk/mod_smtpd_rbl and mod_dnsbl_lookup

2005-08-22 Thread Jem Berkes
> Hopefully later today I should have this completely done and checked in.

I waited for Rian to update the mod_smtpd structures, and I have now 
checked in my code for RBL functionality. There are README files in both 
directories describing use. However could someone tell me how to properly 
use mod_smtpd.h and dnsbl_lookup.h in the build process? I've copied them 
between directories but this can't be the right way to do it.

https://svn.apache.org/repos/asf/httpd/mod_smtpd/trunk/mod_smtpd_rbl/
- Adds RBL whitelisting and blacklisting to mod_smtpd, either
rejecting client IPs upon connection (DNSBL) or envelope sender
domains (RHSBL). By hooking into Rian's smtp this remains totally
modular and does not alter mod_smtpd itself.

https://svn.apache.org/repos/asf/httpd/mod_smtpd/trunk/mod_dnsbl_lookup/
- Does the actual DNSBL and RHSBL lookups, supporting rather advanced
configuration in the form of distinct query chains.  Many chains can
be defined so admins can use chains for different purposes.  Flags to
the lookup functions allow different query styles, such as either
stopping on one match or querying everything and returning a table of
response details.

Sample configuration for mod_smtpd + mod_smtpd_rbl + mod_dnsbl_lookup

# Enable mod_smtpd
SmtpProtocol On

# Define whitelist and blacklist chains for mod_smtpd_rbl
SmtpWhitelist mywhitelist
SmtpBlacklist myblacklist

# Enable mod_dnsbl_lookup
DnsblLookups On

# The zones and chains for mod_dnsbl_lookup

RhsblZone myblacklist   rhsbl.ahbl.org.           127.0.0.2
RhsblZone myblacklist   abuse.rfc-ignorant.org.   127.0.0.4

DnsblZone myblacklist   sbl.spamhaus.org.         any
DnsblZone myblacklist   cbl.abuseat.org.          any




Re: mod_smtpd overhaul

2005-08-22 Thread Jem Berkes
> Is this the right way or is there an example module I could compare
> with?

I noticed a couple posts about examples, there is now one as I have 
committed all the RBL stuff I wrote. See:

https://svn.apache.org/repos/asf/httpd/mod_smtpd/trunk/mod_smtpd_rbl/

This hooks into mod_smtpd in two places and returns various data (e.g. if 
the client IP is blacklisted then mod_smtpd is told to deny mail). I hope 
it serves as a good example, it seems to work quite nicely to give 
mod_smtpd all the DNSBL/RHSBL features in a modular fashion.




Re: Supporting RBL in mod_smtpd

2005-08-18 Thread Jem Berkes
> > smtpd_run_connect (might deny service to connecting IP, per
> > request_rec)
> > smtpd_run_mail (might deny service to this envelope domain, per loc)
> +1
> ...
> Don't do this just yet, mod_smtpd is changing completely! completely
> structures/io. I should commit my changes very soon so you can start
> working on this.

OK, I'll watch for the changes. Make sure you keep what I need though :)

smtpd_run_connect should somehow pass the address of the peer (currently 
that's within the request_rec)

smtpd_run_mail should still pass the MAIL FROM address the peer specifies,
it currently comes in the char* loc.

As long as this data is still available to me and I can return a code to 
reject the mail, we should be good to go. Somewhat trivial I hope.




Supporting RBL in mod_smtpd

2005-08-18 Thread Jem Berkes
Here is my current plan for introducing the RBL support in mod_smtpd, using 
the existing mod_dnsbl_lookup which I posted earlier. This way of 
accomplishing the RBL support should not require any code modification to 
mod_smtpd itself. Nick and Rian, let me know if I should be going about 
this a different way?

I thought the most modular fashion would be to create a mod_smtpd_rbl that 
registers the following mod_smtpd hooks:
smtpd_run_connect (might deny service to connecting IP, per request_rec)
smtpd_run_mail (might deny service to this envelope domain, per loc)

These would query whitelists and blacklists, whatever is available.

I don't mind whipping up this bridging mod_smtpd_rbl module, but if it 
seems excessive to introduce a new module for this purpose then the other 
way of doing this would be to add the RBL supporting code into mod_smtpd 
itself.

Either way it's done, RBLs are still not required and mod_dnsbl_lookup does 
not have to be present for mod_smtpd to function normally. However, adding 
a new bridging module has the advantage of leaving mod_smtpd code alone and 
taking advantage of the hooks interface.




Re: mod_dnsbl_lookup 0.90

2005-08-15 Thread Jem Berkes
> That's super in-efficient for the majority case, and there's no
> application level caching, which tends to be a must for most
> implementations (even if it is only per-request, like Exim's or

We talked about this on IRC, and it seems the preferred approach is to 
delegate the caching responsibility to an entity that is made purely for 
that purpose, for example DJB's local DNS cache software or even rbldnsd 
(an extremely fast DNSBL server) running locally.

I did start to implement software side caching in mod_dnsbl_lookup but it 
raised questions as to whether it's appropriate to have global scale 
caching when we're doing connection and request oriented processing.

So I've left caching out of mod_dnsbl_lookup 0.91




New mod_dnsbl_lookup release

2005-08-15 Thread Jem Berkes
I don't have svn access yet, but I have posted the module here:
http://www.sysdesign.ca/archive/mod_dnsbl_lookup-0.91.tar.gz

This is much improved from my earlier 0.90, taking advice from Colm. With 
this new style of configuration the module can be used more flexibly for 
blacklists, whitelists, or other things. Configuration now looks like:

DnsblZone spammers  sbl.spamhaus.org.   any
DnsblZone spammers  dnsbl.sorbs.net.    127.0.0.5
DnsblZone spammers  dnsbl.sorbs.net.    127.0.0.6
DnsblZone whitelist customers.dnsbl     any
RhsblZone spammers  rhsbl.ahbl.org.     127.0.0.2

The README in the above tarball is very thorough and describes how to use 
the module's functions. I'm interested in adding the functionality into 
mod_smtpd of course. Rian and Nick: how should we proceed on that?

Here in brief is a relevant part of my README

===
4. Using from mod_smtpd
===

The function calls work in isolation, without requiring any prior setup 
before using DNSBLs. The server configuration takes care of all 
DNSBL and RHSBL setup, including domains to query and responses to 
interpret as positive.

The important knowledge link between mod_dnsbl_lookup and its user, say 
mod_smtpd, is the chain name that defines the desired DNSBLs. Instead of 
hard coding a chain name, it makes much more sense to have a module such 
as mod_smtpd load during its configuration some chains to work with.

So mod_smtpd might have configuration directives such as:
SmtpBlacklistChain blackchain
SmtpWhitelistChain whitechain

Now mod_smtpd knows which chain to query for blacklisting purposes, and 
which chain to query for whitelisting purposes. The admin may leave either 
chain undefined of course and can easily modify the configuration by 
substituting different chain names (as used by DnsblZone and RhsblZone). 
The pseudo code within mod_smtpd might then be:

Attempt to load optional dnsbl_lookup functions
If functions are available
    If dnsbl_lookup_ip("whitechain", client) == DNSBL_POSITIVE
        return ALLOW_SERVICE    // even if blacklisted
    Else If dnsbl_lookup_ip("blackchain", client) == DNSBL_POSITIVE
        return DENY_SERVICE
return ALLOW_SERVICE            // default action
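
The pseudo code above could be sketched in plain C roughly as follows. This is only an illustration under stated assumptions: the two-argument lookup signature, the numeric values of the DNSBL_* codes, the ALLOW_SERVICE/DENY_SERVICE constants, and the demo_lookup stub are all hypothetical stand-ins, not the module's real API.

```c
#include <stddef.h>
#include <string.h>

/* Result names follow the README; the numeric values are assumptions. */
enum { DNSBL_FAILURE = -1, DNSBL_NEGATIVE = 0, DNSBL_POSITIVE = 1 };

/* Hypothetical service decisions; mod_smtpd's real codes may differ. */
enum { ALLOW_SERVICE = 0, DENY_SERVICE = 1 };

/* Simplified stand-in for the optional lookup function (assumed signature). */
typedef int (*lookup_ip_fn)(const char *chain, const char *client_ip);

/* Whitelist wins over blacklist. A missing lookup function or an
 * undefined chain falls through to ALLOW_SERVICE. */
static int rbl_decide(lookup_ip_fn lookup, const char *whitechain,
                      const char *blackchain, const char *client_ip)
{
    if (lookup == NULL)
        return ALLOW_SERVICE;   /* optional functions not available */
    if (whitechain && lookup(whitechain, client_ip) == DNSBL_POSITIVE)
        return ALLOW_SERVICE;   /* even if blacklisted */
    if (blackchain && lookup(blackchain, client_ip) == DNSBL_POSITIVE)
        return DENY_SERVICE;
    return ALLOW_SERVICE;       /* default action */
}

/* Toy lookup used only to exercise the logic above. */
static int demo_lookup(const char *chain, const char *client_ip)
{
    int known_good = strcmp(client_ip, "192.0.2.10") == 0;
    int known_bad  = strcmp(client_ip, "192.0.2.66") == 0;

    if (strcmp(chain, "whitechain") == 0)
        return known_good ? DNSBL_POSITIVE : DNSBL_NEGATIVE;
    if (strcmp(chain, "blackchain") == 0)
        return (known_good || known_bad) ? DNSBL_POSITIVE : DNSBL_NEGATIVE;
    return DNSBL_FAILURE;       /* unknown chain */
}
```

With the stub above, 192.0.2.10 is on both chains and is still allowed, while 192.0.2.66 is denied.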

- Jem




Hash table growth

2005-08-15 Thread Jem Berkes
When I looked at the expand function used by apr_hash.c it looked to me 
like it keeps growing if you keep using 'set' with novel values. I was 
thinking of using apr_hash in order to cache DNSBL queries for my module. 
It would ensure rapid cache search but I am having trouble figuring out how 
I could remove existing entries. I really _want_ collisions to happen but 
I'm not sure if this is possible.

Any tips on how I can overwrite existing entries in the hash table, rather 
than keep expanding the table entries?

e.g. key ABC is already in the table, and it collides with XYZ which I now 
want to add. However, if I apr_hash_get(XYZ) it will tell me correctly that 
this key is not present; and apr_hash_set(XYZ) now expands the table right?
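
For reference, apr_hash_set() with a NULL value deletes the entry for that key, which is the documented way to remove entries. The "collisions should overwrite" behaviour is a different structure, though: a fixed-size, direct-mapped cache. A minimal sketch of that idea in plain C follows. It does not use apr_hash at all; the slot count, key length limit, and function names are assumptions for illustration.

```c
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 256     /* fixed size: the table never grows */
#define KEY_MAX     64      /* longer keys are truncated in this sketch */

struct cache_slot {
    char key[KEY_MAX];      /* empty string marks an unused slot */
    int  result;
};

struct dnsbl_cache {
    struct cache_slot slots[CACHE_SLOTS];
};

/* FNV-1a string hash reduced to a slot index. */
static unsigned cache_index(const char *key)
{
    uint32_t h = 2166136261u;
    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= 16777619u;
    }
    return h % CACHE_SLOTS;
}

static void cache_init(struct dnsbl_cache *c)
{
    memset(c, 0, sizeof *c);
}

/* Store a result, evicting whatever key previously mapped to this slot. */
static void cache_set(struct dnsbl_cache *c, const char *key, int result)
{
    struct cache_slot *s = &c->slots[cache_index(key)];
    strncpy(s->key, key, KEY_MAX - 1);
    s->key[KEY_MAX - 1] = '\0';
    s->result = result;
}

/* Returns 1 and fills *result on a hit, 0 on a miss or an evicted key. */
static int cache_get(const struct dnsbl_cache *c, const char *key, int *result)
{
    const struct cache_slot *s = &c->slots[cache_index(key)];
    if (s->key[0] == '\0' || strncmp(s->key, key, KEY_MAX - 1) != 0)
        return 0;
    *result = s->result;
    return 1;
}
```

In the ABC/XYZ example, storing XYZ simply replaces ABC in the shared slot, so memory use stays constant at the cost of occasionally re-querying an evicted key.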




Re: New mod_smtpd release

2005-08-14 Thread Jem Berkes
> Well there's also another problem. RFC 2821 (SMTP) doesn't define a
> particular message format for SMTP (in wide use there the RFC 822 and
> MIME message formats). I don't think that mod_smtpd should assume a  RFC
> 822 or MIME message format since its strictly a SMTP module,  that's why

I agree with this

> I still think header parsing should be in another module.  Of course
> this module is free to register itself as an mod_smtpd  filter and do
> what it needs to do, but it shouldn't be part of the  main mod_smtpd.

That seems wise. Any weird thing can come through over SMTP, it could look 
very much unlike an email after all. You're handling the protocol in your 
module and that means the SMTP protocol as I understand, not MIME or 
anything.




Re: mod_dnsbl_lookup 0.90

2005-08-13 Thread Jem Berkes
Following up on mod_dnsbl, a new version is nearing completion although I 
have encountered some obstacles that slowed me down. I have taken some of 
Colm's advice to make mod_dnsbl_lookup more flexible and self sufficient. 
I'm attaching the documentation part of what I'm currently working on. If 
anyone sees any logic problems, please let me know!


I have made a major effort to document this thing sufficiently that anyone
stumbling upon it won't have to struggle with how the heck to use it.
Hopefully I will post version 0.91 tomorrow or Monday.


- README follows

A DNSBL or RHSBL is just a form of efficient database that returns a 
simple code (expressed as an IP address) for a given lookup key. The 
lookup key is either an IPv4 or IPv6 address in the case of a DNSBL, or a 
host/sub/domain name in the case of a RHSBL. The return code from the 
database may be an IP address such as 127.0.0.2 or NXDOMAIN, indicating no 
match.
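
To make the lookup key concrete: by the usual IPv4 DNSBL convention, the octets of the address are reversed and prepended to the zone, and the resulting name is queried for an A record. The sketch below only builds that name and does no DNS work. The function name and error convention are assumptions, not part of mod_dnsbl_lookup.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build "d.c.b.a.<zone>" from the dotted quad "a.b.c.d".
 * Returns 0 on success, -1 if the address is malformed or the
 * output buffer is too small. */
static int dnsbl_query_name(const char *ipv4, const char *zone,
                            char *out, size_t outlen)
{
    unsigned long octet[4];
    const char *p = ipv4;
    char *end;
    int i, n;

    for (i = 0; i < 4; i++) {
        if (*p < '0' || *p > '9')
            return -1;                      /* each octet starts with a digit */
        octet[i] = strtoul(p, &end, 10);
        if (octet[i] > 255)
            return -1;
        if (*end != (i < 3 ? '.' : '\0'))
            return -1;                      /* wrong separator or trailing junk */
        if (i < 3)
            p = end + 1;
    }

    n = snprintf(out, outlen, "%lu.%lu.%lu.%lu.%s",
                 octet[3], octet[2], octet[1], octet[0], zone);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

For example, 192.0.2.99 checked against sbl.spamhaus.org. becomes the name 99.2.0.192.sbl.spamhaus.org.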


DNSBLs are often used in spam filtering, where the return code 127.0.0.x
indicates that the lookup key (a relay's IP address) is blacklisted.
However the meaning of the information returned by a database is in no way
limited to this. Sometimes the DNSBL server intends positive matches to be
whitelisted hosts; other times there are a variety of 127.0.0.x codes each
meaning something different.


For this reason we discourage the use of the term blacklist or RBL (real
time blacklist) because this is just one use of DNSBLs and RHSBLs.

This mod_dnsbl_lookup aims to provide generic and flexible DNSBL and RHSBL
use without limiting functionality. Each server has its own policy and
return codes, so you must configure dnsbl_lookup_query appropriately as
there is no intrinsic way to know if something is blacklisted,
whitelisted, or somewhere in between.


Define servers and codes that you consider "positive matches" under one
or more chains. This allows you to make independent configurations for
different uses. Note that only IPv4 is supported at the moment.

# This might be under a mod_smtpd virtual server config


# Enable module
DnsblLookups On
#
# Need to get host names for RHSBL lookups to work
# Note that terminating dot in server names prevents local domain search
HostNameLookups On
#
# The following define positive matches for the chain I call "spammers"
#
# Any non-failure result from sbl.spamhaus.org is a positive match
DnsblIPv4 spammers  sbl.spamhaus.org.       any
#
# The 127.0.0.2 result from cbl.abuseat.org is a positive match
DnsblIPv4 spammers  cbl.abuseat.org.        127.0.0.2
#
# Only the specific codes 127.0.0.5,6,9 from dnsbl.sorbs.net are positive
# The module internally caches queries, only one actual DNS query is made
DnsblIPv4 spammers  dnsbl.sorbs.net.        127.0.0.5
DnsblIPv4 spammers  dnsbl.sorbs.net.        127.0.0.6
DnsblIPv4 spammers  dnsbl.sorbs.net.        127.0.0.9
#
# The following define positive matches for the chain I call "whitelist"
#
# A zone designed for whitelisting, any mail from Canada is positive
DnsblIPv4 whitelist ca.countries.nerd.dk.   127.0.0.2
#
# A local zone we run, customers or partners of ours are positive
DnsblIPv4 whitelist customers.dnsbl         any
#
# A chain for RHSBL lookups (distinct from DNSBL chains)
#
RhsblZone spammers  rhsbl.ahbl.org.         127.0.0.2

With this configuration, a user could now do a DNSBL_ANYPOSTV_RETFIRST 
query on the "spammers" chain to see if a host is a spammer (returns 
DNSBL_POSITIVE when the first positive response is encountered). The user 
might also want to do a DNSBL_ANYPOSTV_RETFIRST on the "whitelist" chain 
and allow through any host that returns DNSBL_POSITIVE, meaning it is 
whitelisted. If the whitelist override is more stringent, a 
DNSBL_ALLPOSTV_RETEVERY query might be done instead to require that every 
single entry in the "whitelist" chain returns a positive result.


A more lenient admin might instead do a DNSBL_ANYPOSTV_RETEVERY query on
the "spammers" chain and do post-processing after getting DNSBL_POSITIVE.
The table returned by the lookup (see below) contains detail on every
positive match, so the admin may want to block mail from the host only if
at least 2 zones are positive. The disadvantage of this is the many
extra queries.
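
The "at least 2 positive zones" post-processing might look roughly like this. It is a sketch only: the real module returns an apr_table_t, and the struct zone_hit array used here is a hypothetical stand-in for its contents, as are the function names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one row of the returned zonedata table. */
struct zone_hit {
    const char *zone;   /* e.g. "dnsbl.sorbs.net." */
    const char *code;   /* e.g. "127.0.0.5" */
};

/* Count distinct zones among the positive matches, so that several
 * codes from the same zone are counted only once. */
static size_t count_positive_zones(const struct zone_hit *hits, size_t nhits)
{
    size_t i, j, distinct = 0;

    for (i = 0; i < nhits; i++) {
        for (j = 0; j < i; j++) {
            if (strcmp(hits[i].zone, hits[j].zone) == 0)
                break;              /* zone already counted */
        }
        if (j == i)
            distinct++;
    }
    return distinct;
}

/* Block only when at least min_zones distinct zones matched. */
static int should_block(const struct zone_hit *hits, size_t nhits,
                        size_t min_zones)
{
    return count_positive_zones(hits, nhits) >= min_zones;
}
```

Counting distinct zones rather than rows is a design choice of this sketch: two different codes from dnsbl.sorbs.net. are treated as one opinion, not two.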


The configuration (above) simplifies the client code down to querying a
specific chain using a certain query mode. The functions used are:

dnsbl_lookup_ip(const char* chain, int querymode, apr_sockaddr_t* address,
apr_pool_t* p, server_rec* s, apr_table_t** zonedata)

dnsbl_lookup_domain(const char* chain, int querymode, const char* domain,
apr_pool_t* p, server_rec* s, apr_table_t** zonedata)

With return values:
DNSBL_POSITIVE - Positive match (zonedata has details, if requested)
DNSBL_NEGATIVE - Negative
DNSBL_FAILURE - Generic failure, e.g. DnsblLookups Off or invalid chain

For DNSBLs, you would use dnsbl_lookup_ip() and pass the IP address in the 
apr_sockaddr_t*. For 

Re: mod_dnsbl_lookup 0.90

2005-08-13 Thread Jem Berkes
Sure, we could support them but if they are the only one (and without 
public documentation on how to use) then aren't we making guesses from a 
rare case? I haven't found any public discussion on IPv6 DNSBL 
conventions.


For example, what is the standard for how to place the IPv6 string under 
the DNSBL zone? Are we still using decimal octets? Can you point me 
towards some examples?


On Sat, 13 Aug 2005, Colm MacCarthaigh wrote:


On Sat, Aug 13, 2005 at 03:20:10PM -0700, Jem Berkes wrote:

I haven't found any examples of IPv6 RBLs.


rbl-plus.hea.net. If you can give me a small fixed IP range, I can
arrange access.

--
Colm MacCárthaigh                        Public Key: [EMAIL PROTECTED]


Re: mod_dnsbl_lookup 0.90

2005-08-13 Thread Jem Berkes

Cool. I'd split dnsbl_zones into ipv4_dnsbl_zones and ipv6_dnsbl_zones
and have the DnsblZones directive work like;

DnsblIPv4Zones
DnsblIPv6Zones

or similar. IPv6 RBL's do exist, and are incompatible with IPv4 ones, so
it's worth having the support early-on.


I haven't found any examples of IPv6 RBLs. Could you point me to one? What 
I'm finding on the web and usenet is that there is still no established 
standard for IPv6 blocklists. Unless I can find some reference, any 
implementation I make would be a guess so I'm leaving this unsupported 
right now.


Re: mod_dnsbl_lookup 0.90

2005-08-03 Thread Jem Berkes
Sorry for the slow replies, our phone landline +internet is dead and the
telco [TSX: MBT] won't fix it for a week. Terrible for getting work done.

> Cool. I'd split dnsbl_zones into ipv4_dnsbl_zones and ipv6_dnsbl_zones
> and have the DnsblZones directive work like;
> 
>   DnsblIPv4Zones 
>   DnsblIPv6Zones 

That's a good idea, I suspected IPv6 RBLs might exist :) I'll add the IPv6
support.

> dnsbl_lookup_query() takes an IP address argument as a string, but it
> would probably be a lot better to take it as an apr_sockaddr_t, since
> that's an IP version agnostic format, and is generally the way an Apache
> module would have the address available to it.

The problem this introduces is when looking up RHSBLs, which operate on host
names or domain names instead of IP address. Would you recommend different
functions for DNSBL (pass an IP) and RHSBL (pass a hostname or domain name)?

> Passing it around in binary format also helps you avoid using sscanf and
> the associated reentrancy problems on many platforms.

I did not know there were reentrancy problems with sscanf. strtok I know.

> The implementation is neat, but it could also do with efficiency being
> in mind, IME (I help run a very large RBL) rbl lookups tend to be a big
> source of latency during request/mail handling and it's worth making the
> effort to go a bit further :) 

Yes, I am going to add some caching for recent queries. I thought at first
that the resolver already does this but as far as I can tell, it does not do
any caching.

> Although the dnsbl_lookup_query() function's output is comprehensive,
> perhaps more useful and efficient would be to supply a framework for
> allowing modules to check DNSBL's in a boolean manner. As-is the code
> scans every registered RBL, even if one flags an address as listed.
> That's super in-efficient for the majority case, and there's no
> application level caching, which tends to be a must for most
> implementations (even if it is only per-request, like Exim's or
> sendmail's implementations for example).

I agree. What I've started can probably be taken much further but I want to
put the basic layers there first. I'll split up the code so it will be easier
to modify later to not query all at once.

> Part of the lack of boolean-checking reveals another problem, how are
> other modules supposed to know what constitutes a positive for a
> particular RBL?

What constitutes a positive depends entirely on the particular RBL's policy.
Some RBLs are whitelists themselves, so if an IP or domain matches then it
should NOT be blocked.
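
One way to keep policy out of the calling module is to let configuration say what counts as positive, as the later releases do with "any" or a specific 127.0.0.x code per zone. A minimal sketch of that matching step, assuming a hypothetical struct zone_rule and a NULL response for NXDOMAIN:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one configured zone line. */
struct zone_rule {
    const char *zone;       /* zone queried, e.g. "cbl.abuseat.org." */
    const char *positive;   /* "any", or one specific code such as "127.0.0.2" */
};

/* Does this response from this zone count as positive under one rule?
 * response == NULL stands for NXDOMAIN (no match). */
static int rule_matches(const struct zone_rule *rule, const char *zone,
                        const char *response)
{
    if (response == NULL)
        return 0;
    if (strcmp(rule->zone, zone) != 0)
        return 0;
    return strcmp(rule->positive, "any") == 0
        || strcmp(rule->positive, response) == 0;
}

/* A response is positive for a chain if any rule in the chain accepts it. */
static int chain_is_positive(const struct zone_rule *rules, size_t nrules,
                             const char *zone, const char *response)
{
    size_t i;

    for (i = 0; i < nrules; i++) {
        if (rule_matches(&rules[i], zone, response))
            return 1;
    }
    return 0;
}
```

Whether a positive result means "block" or "allow" is then decided by which chain the caller chose to query, not by the lookup code.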



mod_dnsbl_lookup 0.90

2005-07-29 Thread Jem Berkes
I've posted it here. I've been testing it with 2.1.6-alpha
http://www.sysdesign.ca/archive/mod_dnsbl_lookup-0.90.tar.gz

The README file should describe everything. This is a module providing an 
optional utility function intended for (but not limited to) mod_smtpd. The 
function allows the user to query DNS based blocklist databases, both DNSBL 
and RHSBL style, for arbitrary data. This can be used for all kinds of 
filtering and anti-spam use, including score systems such as spamassassin.

In the case of SMTP the query can be the client's IP, client's host name, 
and the domain used in the sender's address. I know that Rian is currently 
redesigning much of mod_smtpd but for demonstration purposes I have 
included code that brings spam blacklisting functionality to mod_smtpd 0.1. 
If a blacklisted client connects, they will be denied service. It is more 
standard however to have these checks done after RCPT TO, at which time the 
envelope domain can also be checked against RHSBL.

It would be great to hear some feedback as I am very new to writing these 
modules. I've tried to use the proper apr_ calls whenever available.




Exporting functions (hooks?)

2005-07-25 Thread Jem Berkes
For what I'm currently trying to accomplish, I'm not sure if hooks are the 
way to go. Also, without examples of this kind of situation I'm somewhat in 
the dark so hopefully someone can push me in the right direction.

I'm writing mod_dnsbl_lookup to provide DNS based blocklist lookup 
facilities. The module's configuration will identify zones to query. So 
that other modules can easily make these queries, I'd like to make a 
generic function that allows any other module to do dnsbl lookups and 
receive the results in a table (say request_rec's notes).

So I want to export a generic utility function, such as
int dnsbl_lookup_do_query(table* results)

What is the most proper way to do this? Is it appropriate to use 
AP_DECLARE_HOOK etc as described at
http://httpd.apache.org/docs/2.0/developer/hooks.html
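
For a plain utility function, APR's optional functions (apr_optional.h: APR_DECLARE_OPTIONAL_FN, APR_REGISTER_OPTIONAL_FN, APR_RETRIEVE_OPTIONAL_FN) are probably a closer fit than hooks: the provider registers a function by name, and a consumer retrieves it at run time and gets NULL if the provider is not loaded. The sketch below imitates that pattern in self-contained C without APR. The registry, its size limit, and all names in it are assumptions for illustration only.

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);

#define MAX_OPTIONAL_FNS 16

/* Tiny name -> function registry standing in for APR's optional functions. */
static struct {
    const char *name;
    generic_fn  fn;
} registry[MAX_OPTIONAL_FNS];
static size_t registry_len;

/* Provider side: publish a function under a name. Returns -1 when full. */
static int register_optional_fn(const char *name, generic_fn fn)
{
    if (registry_len == MAX_OPTIONAL_FNS)
        return -1;
    registry[registry_len].name = name;
    registry[registry_len].fn = fn;
    registry_len++;
    return 0;
}

/* Consumer side: NULL means the providing module is not present. */
static generic_fn retrieve_optional_fn(const char *name)
{
    size_t i;

    for (i = 0; i < registry_len; i++) {
        if (strcmp(registry[i].name, name) == 0)
            return registry[i].fn;
    }
    return NULL;
}

/* Hypothetical lookup signature and a toy implementation to publish. */
typedef int (*dnsbl_lookup_fn)(const char *chain, const char *ip);

static int demo_dnsbl_lookup(const char *chain, const char *ip)
{
    (void)chain;                            /* chain ignored in this toy */
    return strcmp(ip, "192.0.2.66") == 0;   /* 1 = listed, 0 = not listed */
}
```

The consumer must cast the retrieved pointer back to the exact function type before calling it, and must handle the NULL case so that it keeps working when the provider is absent.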




Re: SSI for gzipped files

2005-07-22 Thread Jem Berkes
> I have an idea for someone to implement and give me credit for. I
> recently needed to have my SSI work from a gz file. The server-intensive
> way to make

Interesting, I've wanted to accomplish the same thing but couldn't figure 
out a good way to do it. Maybe there are a lot of people who need this?




Re: Initial mod_smtpd code.

2005-07-20 Thread Jem Berkes
> Overall blacklists aren't that effective and cause a lot of false
> positives.  They may make sense in the case of something like
> SpamAssassin which uses a blacklist in conjunction with other false
> positives,  but by themselves they really aren't a responsible way of
> dealing with the spam problem.  I think it's better to discourage "worst
> practices" than to succumb to plugin mania.

Blocklists aren't fundamentally broken, they are a tool which can be used 
properly or misused (just like many other tools).

Many admins choose to maintain their own DNSBLs for one reason or another.
It may be a way to control relay access based on their own subscriber IP
addresses. At my site we keep a record of IPs that have persistently
abused our site over the past few days.

i.e. DNSBL != (SPEWS or MAPS or whatever)




Re: Initial mod_smtpd code.

2005-07-19 Thread Jem Berkes
> But is anyone dealing with outgoing SMTP via a proxy_smtp in the
> mod_proxy framework?  I think you were discussing that a short while
> ago, weren't you?  I think that might be higher priority.

I hesitated on that because I did not understand at all how mod_proxy fits 
into this. i.e. I don't see how the proxy mechanism helps in relaying out 
mail to other SMTP servers.

Here's an idea on how we can start on the outbound SMTP side of things:

I can start work on a "mod_smtp_relay" which takes an email and attempts to 
relay it via the appropriate MX relay. This involves some DNS queries for 
MX records, and making new TCP connections to another SMTP server. I 
recommend that mod_smtp_relay does not itself do any spooling or queueing, 
to isolate these tasks. i.e. some other delivery/scheduler will handle 
spooling and retries etc, and occasionally pass an email to mod_smtp_relay

So given an input message, mod_smtp_relay would make an immediate relay 
attempt and then return success (sent) or an error describing where along 
the lines things went wrong -- it could be DNS, TCP connect failure, or 
SMTP error dictating permanent failure or temporary failure, needing retry.
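
The result reporting described above could be sketched as a small status type plus a classifier for the remote server's final reply. SMTP reply classes are standard (2xx success, 4xx transient failure, 5xx permanent failure); the enum names, the treatment of DNS and connect failures as retryable, and the function names are assumptions of this sketch.

```c
/* Hypothetical outcome of one immediate relay attempt. */
enum relay_status {
    RELAY_SENT,             /* remote server accepted the message (2xx) */
    RELAY_DNS_FAILURE,      /* MX lookup failed */
    RELAY_CONNECT_FAILURE,  /* could not open a TCP connection */
    RELAY_TEMP_FAILURE,     /* SMTP 4xx reply: retry later */
    RELAY_PERM_FAILURE,     /* SMTP 5xx reply: do not retry */
    RELAY_PROTOCOL_ERROR    /* reply outside the expected final classes */
};

/* Map the final SMTP reply code of a transaction to a relay status. */
static enum relay_status relay_status_from_reply(int code)
{
    switch (code / 100) {
    case 2:  return RELAY_SENT;
    case 4:  return RELAY_TEMP_FAILURE;
    case 5:  return RELAY_PERM_FAILURE;
    default: return RELAY_PROTOCOL_ERROR;
    }
}

/* Should the external scheduler queue another attempt? Treating DNS and
 * connect failures as retryable is an assumption made for this sketch. */
static int relay_should_retry(enum relay_status status)
{
    return status == RELAY_DNS_FAILURE
        || status == RELAY_CONNECT_FAILURE
        || status == RELAY_TEMP_FAILURE;
}
```

Keeping the retry decision in a separate function matches the split proposed above: the relay module reports what happened, and the spooler decides what to do about it.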




Re: Initial mod_smtpd code.

2005-07-19 Thread Jem Berkes
> Hmm. That sounds like a good idea, maybe there already is a hook
> defined that could deal with this, I'll look into it.

I could also start work on a mod_smtpd_dnsbl if the mentors feel that is 
worthwhile? This would look up a connecting IP address against a blacklist 
and return a descriptive string to mod_smtpd if the client should be 
rejected with an error: "550 5.7.1 Email rejected because 127.0.0.2 is 
listed by sbl-xbl.spamhaus.org"
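
Building that descriptive string is a one-liner. The sketch below assumes the address named in the message is the connecting client's IP; the function name and the error convention are hypothetical.

```c
#include <stdio.h>

/* Format the rejection line for a listed client.
 * Returns 0 on success, -1 if the buffer is too small. */
static int format_dnsbl_reject(char *out, size_t outlen,
                               const char *client_ip, const char *zone)
{
    int n = snprintf(out, outlen,
                     "550 5.7.1 Email rejected because %s is listed by %s",
                     client_ip, zone);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```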

I'd also like to include support for RHSBL, a newer type of listing by 
domain names from the envelope sender address. That's used by a growing 
number of projects.




Re: DNSBL filtering for Apache

2005-07-18 Thread Jem Berkes
> Apache -- the HTTP side too -- would benefit from DNSBL support. Or does
> this already do this? For example, both the CBL and AHBL projects list
> IP addresses of hosts engaging in activities such as proxy hijacking and
> spam relaying. This means it would be useful for webmasters to be able
> to make use of the published DNSBL to deny access to http requests.

Gosh, it already exists thanks to Blars Blarson
http://www.blars.org/mod_access_rbl.html

I wonder if the existing module can somehow be used for mod_smtpd as well?
I'm still not familiar enough with 2.x-style modules to know if that would
work.




DNSBL filtering for Apache

2005-07-18 Thread Jem Berkes
While I was thinking about Nick's suggestion for mod_rbl (blacklist lookups 
with mod_smtpd) I happened upon this idea, which is somewhat unrelated to 
the smtp project.

DNSBLs, the dominant form of real time blacklisting, are not specific to 
SMTP because this is just a way to publish lists of IP addresses. RHSBLs, 
which look up the address in an SMTP envelope, are specific to SMTP 
however.

Apache -- the HTTP side too -- would benefit from DNSBL support. Or does 
this already do this? For example, both the CBL and AHBL projects list IP 
addresses of hosts engaging in activities such as proxy hijacking and spam 
relaying. This means it would be useful for webmasters to be able to make 
use of the published DNSBL to deny access to http requests.

Because DNSBLs are an efficient way to publish lists, webmasters might
start using a DNSBL lookup feature in Apache to limit abuse of say message
forums, cgi scripts, proxy gateways. Currently, this has to be done by
importing a complete list of IP addresses (often tens of megabytes) into a
firewall script or Apache configuration.




Re: Initial mod_smtpd code.

2005-07-18 Thread Jem Berkes
This is my first attempt at writing an experimental version of mod_smtpd. I
don't have svn access yet so this code can be downloaded from
http://rian.merseine.nu/mod_smtpd-0.1.tar.gz.


Nifty! I had some compilation problems involving regex, so in the attached 
patch I use ap_regex.h and change some defines. Hope this doesn't break 
anything.


The other bug I partially fixed: strstr in smtp_protocol.c only does exact
(case-sensitive) matches, so uppercase commands like MAIL FROM would fail.
I added support for upper case, but this still needs improvement because
mixed case doesn't work. Is there an APR function like stristr?
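
httpd's utility API appears to offer ap_strcasestr for this kind of search. Assuming nothing beyond the C standard library, a portable case-insensitive substring search could be sketched as below; ci_strstr is a hypothetical helper name, not an APR or httpd function.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive strstr: returns a pointer to the first occurrence of
 * needle in haystack, ignoring ASCII case, or NULL if there is none. */
static char *ci_strstr(const char *haystack, const char *needle)
{
    size_t i;

    if (*needle == '\0')
        return (char *)haystack;

    for (; *haystack; haystack++) {
        for (i = 0; needle[i]; i++) {
            /* A NUL in haystack never equals a needle character,
             * so this loop cannot read past the end of haystack. */
            if (tolower((unsigned char)haystack[i])
                != tolower((unsigned char)needle[i]))
                break;
        }
        if (needle[i] == '\0')
            return (char *)haystack;
    }
    return NULL;
}
```

With a helper like this, a single search for "from:" covers MAIL FROM:, mail from: and mixed-case variants such as Mail From:.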


The overall structure and the approach you took is very nice, easy to 
understand. I would recommend adding a hook immediately upon the client 
connection, because an external module (maybe for DNSBLs, or some rate 
limiting control) might not even want us to return a greeting at all -- 
i.e. close with "554 Service unavailable" right away.


But I like what you have, would be happy to keep working around this
design.

*** smtp_protocol.c.orig 2005-07-18 23:07:26.0 -0500
--- smtp_protocol.c 2005-07-18 23:39:18.0 -0500
***************
*** 26,31 ****
--- 26,32 ----
  #include "http_log.h"
  #include "ap_config.h"
  #include "ap_mmn.h"
+ #include "ap_regex.h"
  #include "apr_lib.h"
  #include "apr_buckets.h"
  #include "scoreboard.h"
***************
*** 151,156 ****
--- 152,160 ----
    smtpd_request_rec *sr = smtpd_get_request_rec(r);
    char *loc = strstr(buffer, "from:");
    int retval;
+ 
+   if (loc == NULL)
+     loc = strstr(buffer, "FROM:");
  
  
    if (loc == NULL) {
***************
*** 177,182 ****
--- 181,189 ----
    smtpd_request_rec *sr = smtpd_get_request_rec(r);
    char *loc = strstr(buffer, "to:");
    int retval = 0;
+ 
+   if (loc == NULL)
+     loc = strstr(buffer, "TO:");
  
    if (loc == NULL) {
      ap_rprintf(r, "%d %s\r\n", 501, "Syntax: RCPT TO:");
***************
*** 277,283 ****
  }
  
  HANDLER_DECLARE(vrfy) {
!   regex_t *compiled;
    int error;
    int retval = 0;
  
--- 284,290 ----
  }
  
  HANDLER_DECLARE(vrfy) {
!   ap_regex_t *compiled;
    int error;
    int retval = 0;
  
***************
*** 286,292 ****
      "[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?"
      "(\\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?)*$";
  
!   compiled = ap_pregcomp(r->pool, rexp, REG_EXTENDED | REG_NOSUB);
    error = ap_regexec(compiled, buffer, 0, NULL, 0);
    ap_pregfree(r->pool, compiled);
  
--- 293,299 ----
      "[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?"
      "(\\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?)*$";
  
!   compiled = ap_pregcomp(r->pool, rexp, AP_REG_EXTENDED | AP_REG_NOSUB);
    error = ap_regexec(compiled, buffer, 0, NULL, 0);
    ap_pregfree(r->pool, compiled);



Re: Initial mod_smtpd code.

2005-07-18 Thread Jem Berkes
> Jem/Paul/Nick: I'm especially interested in what you think about the
> design I've laid out in this implementation.

I'll try this out today and send my feedback.

With respect to hooking every command, the reason I suggested that is to 
offer some useful facilities to those writing filter modules. It may work 
with the way you've laid it out too, I'll check for that angle of it.




Re: mod_smtpd design - protocol

2005-07-16 Thread Jem Berkes
> Have you considered using libapreq2 for parsing
> the mime headers in there?  The header parser
> should be really convenient for that, you could
> even introduce a post-header-parser hook that
> runs when the parser finishes.

My own suggestion is that we don't touch or try to interpret MIME. Parsing 
the message headers into a table is straightforward but once you get into 
recognizing MIME you're moving out of the protocol realm and into the 
message format realm - and you start having to worry about messages within 
messages, boundaries, corrupt structures, and other things that I think are 
not mod_smtpd's problem.

I don't know if the other smtp project people share my opinion here.




mod_smtpd design - protocol

2005-07-16 Thread Jem Berkes
I want to focus a bit on mod_smtpd design, in particular the protocol 
module (which accepts connections and does the E/SMTP talking). I've seen 
various ideas thrown around on what exactly the module should do. It would 
be nice if we could come up with at least the high level design specs for 
this, just so we're all on the same page about what the module will do and 
what facilities it will offer to external module users.

Talking with bfrance on IRC gave me a better sense of what other developers 
are hoping the smtp module can provide. The competing technology seems to 
be Sendmail's milter interface which allows developers to hook their custom 
filters into various stages of SMTP transactions. Anyone using mod_smtpd 
for filtering purposes will want to hook into various stages of the SMTP 
transaction - anywhere between start and end, or specific middle commands.

So let me throw this out there as a starting point because I don't think 
this has been documented yet?

The mod_smtpd protocol module accepts client connections and speaks E/SMTP, 
with default processing for all commands. e.g. it has a default greeting 
upon connection, a default response to EHLO/HELO, by default accepts any 
envelope sender in MAIL FROM, by default rejects any recipient in RCPT TO 
to prevent open relay configuration, etc.

However the module should provide several hooks to allow another module to 
use smtp. Off the top of my head, we need at least these hooks:

- upon connection from some client
  (user might introduce a delay, look up the IP in an RBL, customize the
  greeting)
- upon receiving HELO/EHLO from client
- upon receiving MAIL FROM
- upon receiving RCPT TO
- upon receiving other commands like VRFY, RSET, NOOP
- upon receiving an invalid command
- etc.

I think this granularity is required. But I'm not sure about how the DATA 
hook would work? Among the two people who already have some code for smtp, 
are you coding something along these lines?
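
As a rough illustration of "default processing plus hooks", the sketch below keeps one optional hook per stage and falls back to a default reply code when no hook is registered or the hook declines. Every name in it (smtp_stage, stage_hook_fn, register_hook, run_stage, accept_local_rcpt) and the specific reply codes are assumptions for the example, not an existing mod_smtpd interface. The DATA stage is left out, since how that hook should work is still an open question here.

```c
#include <stddef.h>
#include <string.h>

/* Stages at which another module might hook in (hypothetical names). */
typedef enum {
    STAGE_CONNECT, STAGE_HELO, STAGE_MAIL, STAGE_RCPT,
    STAGE_OTHER, STAGE_INVALID, STAGE_COUNT
} smtp_stage;

/* A hook returns an SMTP reply code, or 0 to fall back to the default. */
typedef int (*stage_hook_fn)(const char *arg);

static stage_hook_fn hooks[STAGE_COUNT];   /* all NULL until registered */

/* Default replies: accept any sender, reject any recipient. */
static const int default_reply[STAGE_COUNT] = {
    220, /* connect: greeting */
    250, /* HELO/EHLO */
    250, /* MAIL FROM: accept any envelope sender */
    550, /* RCPT TO: reject, so the server is never an open relay */
    250, /* VRFY, RSET, NOOP and similar */
    500  /* invalid command */
};

static void register_hook(smtp_stage stage, stage_hook_fn fn)
{
    hooks[stage] = fn;
}

/* Run the hook for a stage if there is one, else use the default reply. */
static int run_stage(smtp_stage stage, const char *arg)
{
    int code = 0;

    if (hooks[stage] != NULL)
        code = hooks[stage](arg);
    return code != 0 ? code : default_reply[stage];
}

/* Example hook: accept recipients in one local domain only. */
static int accept_local_rcpt(const char *arg)
{
    const char *at = strchr(arg, '@');

    return (at != NULL && strcmp(at + 1, "example.org") == 0) ? 250 : 0;
}
```

A delivery module would then call something like register_hook(STAGE_RCPT, accept_local_rcpt) at startup, and the protocol module would keep rejecting every other recipient on its own.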




Re: mod_smtpd design.

2005-07-14 Thread Jem Berkes

> I'd like to see more of you on IRC:-)

OK, I'm regularly logging onto #apache-modules now

> SMTP is two tasks: accept incoming connections (a protocol module -
> c.f. the ftp modules), and make outgoing connections to another
> server.  The latter would be a proxy_smtp module in the mod_proxy
> framework - c.f. proxy_http and proxy_ftp.  These are clearly

That makes more sense than the route I was going

> Have you done any research on available code for handling the protocol?

I will do more along those lines. The only SMTP package I am familiar with 
is Postfix, but I think we encountered a licensing issue in earlier 
discussion. If the protocol handling has to be redone from scratch, I am 
willing to work on that.

> That's basically a similar task to CGI, and a first pass at that
> would be to try running procmail under CGI (with incoming HTTP POST
> or PUT requests will do fine).  I imagine that would be a pretty
> trivial job with perl or python.

Attached is a quick demo that allows mail delivery via CGI (HTTP POST).
cat raw-email-message | curl -s --data-binary @- http://.../cgi-procmail

It works on my Linux and FreeBSD systems. It's hard coded to deliver to the 
current user only (web server user) but procmail runs suid root so you 
could deliver to other users too.

> Now look at what you needed to do.  That's a first-pass spec for
> this task.  It should be a fairly straightforward mod to mod_cgi(d).
> I'm not sure whether that's the best approach, but hacking it up
> will surely throw some light on the matter.

I don't quite understand this... are you saying that instead of an 
external CGI program (as attached), the pipe to procmail functionality 
should be a server module?

/*
cgi-procmail 0.90
Copyright (C) 2005 Jem E. Berkes

Accept an email from an HTTP POST request, for local delivery.
The cgi program runs with web server user's privileges so mail can
only be delivered to the web server user. Warning, uses popen()

To send the web server an email, use:
cat raw-email-message | curl -s --data-binary @- http://address

*/


#include <stdio.h>      /* printf, puts, fgets, fputs; popen/pclose (POSIX) */
#include <stdlib.h>     /* EXIT_SUCCESS, EXIT_FAILURE */


#define PROCMAIL_CMD    "/usr/bin/procmail -f -"
#define MAXIMUM_LINE    1000    /* As per RFC, max line including \n and \0 */
#ifndef EXIT_SUCCESS
#define EXIT_SUCCESS    0
#define EXIT_FAILURE    1
#endif


int main(void)
{
    FILE *dest = popen(PROCMAIL_CMD, "w");

    printf("Content-Type: text/plain\n\n");
    if (dest)
    {
        char dataline[MAXIMUM_LINE];

        /* pipe HTTP POST data to procmail */
        while (fgets(dataline, sizeof(dataline), stdin))
            fputs(dataline, dest);
        /* a popen() stream must be closed with pclose(), not fclose();
           pclose() returns 0 only if procmail itself exited with 0 */
        if (pclose(dest) == 0)
        {
            puts("Mail sent to procmail");
            return EXIT_SUCCESS;
        }
        else
        {
            puts("Error closing pipe");
            fprintf(stderr, "Error closing pipe to " PROCMAIL_CMD "\n");
            return EXIT_FAILURE;
        }
    }
    else
    {
        puts("Pipe failure");
        fprintf(stderr, "Unable to open pipe to " PROCMAIL_CMD "\n");
        return EXIT_FAILURE;
    }
}


Re: mod_smtpd design.

2005-07-11 Thread Jem Berkes
Where are we in the mod_smtpd design/task allocation? Since there are 
several people involved we're really going to have to divide up the tasks 
and at least decide on how our various modules will communicate. I'd like 
to start coding, if one of the mentors could push me in the direction they 
want please :)

As I understand it, we want to produce one module that deals with the SMTP 
protocol, and a separate module to handle mail delivery (probably via 
procmail is easiest for now). Does this mean we should also produce 
separate modules, one for incoming and one for outgoing mail transactions? 
In other words, several modules:

- SMTP protocol library
- Incoming SMTP mail (hook connection/server)
- Outgoing SMTP mail (spooling, DNS MX, sendmail-ish)
- Delivery of SMTP message (to local, e.g. pipe to procmail)

Or is that getting too fragmented?




mod_rdate, for what it's worth

2005-07-03 Thread Jem Berkes
I wrote a little mod_rdate if anyone is interested,
http://www.sysdesign.ca/archive/apache/module-experiment/mod_rdate.c

rdate allows a host to synchronize its time to a server, with approximately 
1-second accuracy. It is far inferior to NTP but so much simpler and still 
my preferred method for dealing with drifting clocks. It gets the job done.

In /etc/services, port 37 is the "time" service. Upon connecting, the 
client simply reads back a 32-bit value representing seconds since 1900. My 
mod_rdate.c is NOT well tested, and may not be very portable.
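
For reference, the value the time service sends can be computed as in the sketch below. It is a standalone illustration and not taken from mod_rdate.c; the function names are made up. The offset is the number of seconds between 1900-01-01 and the Unix epoch, and the value goes on the wire most significant byte first.

```c
#include <stdint.h>
#include <time.h>

/* Seconds between 1900-01-01 and 1970-01-01 (the Unix epoch), per RFC 868. */
#define RFC868_EPOCH_OFFSET UINT64_C(2208988800)

/* Convert a Unix timestamp to the 32-bit value sent by the time service. */
static uint32_t unix_to_rfc868(time_t now)
{
    return (uint32_t)((uint64_t)now + RFC868_EPOCH_OFFSET);
}

/* Store the value in network byte order (most significant byte first). */
static void encode_rfc868(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
}
```

A server would call unix_to_rfc868(time(NULL)), encode the result and write the four bytes to the client before closing the connection.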

But hey, there you go, arbitrary protocol :)




Re: Immediate output in process_connection

2005-07-03 Thread Jem Berkes
> What is your Listen line for this protocol's port?

I just had Listen 1234

> Try something like:
> 
> Listen 0.0.0.0:1234 myproto
> AcceptFilter myproto none

Well that did indeed solve my problem :) Thanks.




Re: mod_smtpd design.

2005-07-03 Thread Jem Berkes
> > I don't see why it matters if there are redundant members in
> > request_rec. However, for purity, it might be cool to divide
> > request_rec up into common elements and protocol-specific stuff in a
> > union.
> 
> That's not really a problem, though of course it's hacky.  It's the
> logical consequence of declaring HTTPD to be multi-protocol while making
> so much of it revolve around the request_rec.

Nick, in one of the background docs you sent me the 'connection filters' 
were described as operating outside the scope of HTTP or any request_rec. I 
know this is a living, changing work in progress but it seems confusing, at 
least to me, to shoehorn arbitrary protocols into HTTP-ish "requests".

I'm having difficulty envisioning how an SMTP communication can be broken 
up into requests, i.e. something like HTTP requests. For example:

01  HELO joe
02  MAIL FROM:
03  RCPT TO:
04  RCPT TO:
05  DATA
06  ...
07  MAIL FROM:
08  RCPT TO:
09  DATA
10  ...
11  QUIT

Now how do you split this up into requests? Does 05 represent a single 
request, or does that DATA generate two different requests - 03 and 04 (one 
for each recipient)? Or is the whole thing, until 11, a single request? 
Perhaps the MAIL FROM's start new requests, i.e. 02 and 07 (two requests).

It just seems ambiguous and difficult to justify in any way. Mind you, I 
still might not fully understand Apache's request paradigm.
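
To make one of those readings concrete, the sketch below takes the last option: each MAIL FROM starts a new "request", and the RCPT TO lines that follow belong to it. This is only an illustration of that one interpretation, with made-up names (txn_summary, split_session). It assumes the message bodies after DATA have already been consumed, so only command lines are passed in, and that commands arrive in upper case.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TXN 8   /* arbitrary cap for the example */

/* Recipient counts per mail transaction. */
typedef struct {
    size_t count;             /* number of transactions seen */
    size_t rcpts[MAX_TXN];    /* RCPT TO commands in each transaction */
} txn_summary;

static int has_prefix(const char *line, const char *prefix)
{
    return strncmp(line, prefix, strlen(prefix)) == 0;
}

/* Walk the command lines and open a new transaction at every MAIL FROM. */
static txn_summary split_session(const char *const *lines, size_t n)
{
    txn_summary s = {0};
    size_t i;

    for (i = 0; i < n; i++) {
        if (has_prefix(lines[i], "MAIL FROM:")) {
            if (s.count == MAX_TXN)
                break;                    /* table full, stop counting */
            s.count++;
        } else if (has_prefix(lines[i], "RCPT TO:") && s.count > 0) {
            s.rcpts[s.count - 1]++;
        }
    }
    return s;
}
```

Applied to the session above, this reading gives two requests, the first with two recipients and the second with one. HELO and QUIT belong to no request at all, which is part of what makes the mapping awkward.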




Immediate output in process_connection

2005-07-03 Thread Jem Berkes
I've been playing with basic modules that implement their own protocol 
(process_connection hooked) along the lines of the mod_echo example. But 
one thing I can't seem to do is send output immediately back to the client, 
even though I am flushing the output filters.

With the following code, if I telnet localhost I don't see the "hello" 
until I first send some line, or close the connection. Contrast to say a 
POP3 server where you see the +greeting as soon as you connect. I did look 
at the mod_pop3 code but can't understand what it does differently than 
this. Do I have to explicitly do something with the input_filters here? 
Currently I don't touch the input side, but something is introducing a 
delay.

static int rdate_process_connection(conn_rec *c)
{
    apr_bucket_brigade *bb;
    rdate_cfg *cfg = ap_get_module_config(c->base_server->module_config,
                                          &rdate_module);
    if (!cfg->rdate_enabled)
        return DECLINED;

    bb = apr_brigade_create(c->pool, c->bucket_alloc);
    ap_fprintf(c->output_filters, bb, "hello\n");
    ap_fflush(c->output_filters, bb);
    return OK;
}




Re: Questions from a newbie

2005-07-03 Thread Jem Berkes
> ... The other solution (letting mod_smtpd read the whole
> thing into a buffer and then passing the buffer) seems way too memory
> hungry for me.

I was trying to figure that out too. It seems like a bad idea to read the 
whole thing into memory because the buffer could easily require several 
tens of megabytes.
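
One alternative to buffering the whole message is to hand it on in small pieces as it arrives. The sketch below reads a DATA section line by line with a fixed buffer and passes each piece to a callback, so memory use stays constant regardless of message size. The names (line_sink_fn, stream_data, count_bytes) and the 1024-byte buffer are assumptions for illustration. It uses plain stdio rather than Apache's bucket brigades, which is how a real module would more likely do this.

```c
#include <stdio.h>
#include <string.h>

/* Callback type for a consumer that receives the message piece by piece. */
typedef void (*line_sink_fn)(const char *chunk, void *ctx);

/* Example consumer: just count the bytes it is given. */
static void count_bytes(const char *chunk, void *ctx)
{
    *(size_t *)ctx += strlen(chunk);
}

/*
 * Stream a DATA section from in to sink using a fixed-size buffer.
 * Returns the number of complete lines passed on, or -1 if the input
 * ended before the terminating "." line.
 */
static long stream_data(FILE *in, line_sink_fn sink, void *ctx)
{
    char line[1024];
    long lines = 0;
    int at_start = 1;   /* true when the buffer begins a new line */

    while (fgets(line, sizeof line, in) != NULL) {
        size_t len = strlen(line);
        int complete = len > 0 && line[len - 1] == '\n';

        /* A line holding only "." marks the end of the message. */
        if (at_start &&
            (strcmp(line, ".\r\n") == 0 || strcmp(line, ".\n") == 0))
            return lines;

        /* Undo dot-stuffing: drop the extra leading "." the sender added. */
        sink((at_start && line[0] == '.') ? line + 1 : line, ctx);

        if (complete)
            lines++;
        at_start = complete;
    }
    return -1;
}
```

The consumer could just as well write each piece to a pipe or a spool file; the protocol side never holds more than one buffer of the message at a time.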




Questions from a newbie

2005-07-03 Thread Jem Berkes
I'm just getting into module development for the first time (thanks to 
impetus provided by Google's Summer of Code)... I've got a test 
environment with 2.1.6-alpha and have succeeded in writing minimal modules 
and getting them working on a live server. But I have a few nagging 
questions that I hope someone can help me with.


1) ap_hook_* functions... obviously, there are lots of these that are used 
to register hooks. Where can I find a reference list of the available 
functions? I'm looking through code of modules, and it leaves me 
wondering, how would I know what hook possibilities exist?


2) I'm having trouble navigating and finding facilities I need. Let's say 
I was looking for something that turns out to be satisfied by ap_fwrite(), 
in the API for filters. Where should I start, to lead myself to finding 
that function? I thought apr.apache.org but Google shows no mention of 
ap_fwrite within that site.


3) I can understand how a module can tie into Apache's normal processing 
to intercept connections, requests, etc. But what is the structure and 
mechanism by which one module can make use of another module? I can use C 
functions but there must be some kind of standardized interface for 
inter-modular calls?


For example, for SMTP support we are contemplating the protocol unit 
(mod_smtpd) passing on mail to a module that specifically delivers mail. 
How do those two entities communicate with each other? A mod_smtp_deliver 
would get a potentially large chunk of data (const char* ?) from mod_smtpd 
and deliver the mail via procmail, etc. This is a loose binding by the way 
since all received mails do not necessarily have to be delivered.



- Jem Berkes



Re: mod_smtpd project planning

2005-06-30 Thread Jem Berkes
> It's a SMTP protocol frontend for httpd. It will have the power to be a
> sendmail replacer or to supply content via SMTP because it will delegate
> most of the actual handling to other modules. All the details haven't
> been worked out yet, but it will make use of the Apache 2.x filters and
> handlers. For Instance:
> 
> [core] -> [mod_smtpd] -> [mod_insert_special_use_mail_handler]
> a setup like that could be used, but let's say you want to filter out
> junk mail. Use an input filter!
> 
> [core] -> [mod_smtpd] -> [mod_junk_mail_filter] -> [mod_other_thing]

Just to give an idea of the added flexibility of SMTP support within an 
httpd module; if we were just using pipes to/from sendmail then you have 
very limited information - basically just the mail data after receipt and 
no ability to talk back to the peer during the mail transaction.

This is the problem encountered by many spam filters, as to be most 
effective they really need to be _involved_ in the SMTP transaction and not 
just stage 2, after receipt happens. Think greylisting as an example.
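
Greylisting shows why the filter has to sit inside the transaction: the decision is a temporary rejection sent in reply to RCPT TO, which a post-receipt filter can no longer issue. Below is a minimal in-memory sketch. The table size, the 300-second delay, the 450 reply code and all names (grey_entry, greylist_check) are arbitrary choices for the example and do not come from any existing module.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

#define GREY_SLOTS 64
#define GREY_DELAY 300   /* seconds a new triplet must wait; arbitrary */

typedef struct {
    char key[256];       /* "ip|sender|recipient" */
    time_t first_seen;
    int used;
} grey_entry;

static grey_entry grey_table[GREY_SLOTS];

/* Returns an SMTP reply code for RCPT TO: 250 to accept, 450 to defer. */
static int greylist_check(const char *ip, const char *from,
                          const char *rcpt, time_t now)
{
    char key[256];
    int i, free_slot = -1;

    snprintf(key, sizeof key, "%s|%s|%s", ip, from, rcpt);

    for (i = 0; i < GREY_SLOTS; i++) {
        if (!grey_table[i].used) {
            if (free_slot < 0)
                free_slot = i;
        } else if (strcmp(grey_table[i].key, key) == 0) {
            /* Known triplet: accept once the delay has passed. */
            return (now - grey_table[i].first_seen >= GREY_DELAY) ? 250 : 450;
        }
    }

    /* First sighting: remember it (if there is room) and defer. */
    if (free_slot >= 0) {
        strcpy(grey_table[free_slot].key, key);
        grey_table[free_slot].first_seen = now;
        grey_table[free_slot].used = 1;
    }
    return 450;
}
```

A legitimate MTA retries after the temporary failure and gets through on a later attempt, while much bulk-mailing software never retries. A real implementation would also need expiry of old entries and persistent storage, both omitted here.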

Also, since mod_smtpd will receive emails from MTAs itself, and prepare its 
own data structures for passing on the data to other modules, this means 
that it can pass along useful information that is difficult or impossible 
to determine from just message headers/body. For example, IP address of the 
incoming or outgoing relay, helo/intro identification of peer, protocol 
violations or warnings, connection data rate perhaps... brainstorming.




Re: mod_smtpd project planning

2005-06-29 Thread Jem Berkes
> As one of the students I can definitely appreciate that!
> 
> To everyone managing SoC: about how long until our svn accounts are
> activated? I know there are a lot details being worked out still, but I
> still feel a little in the dark.

Hi all, I'm another student working on mod_smtpd

Been running httpd 2.x since it appeared, but am new to development.

- Jem Berkes