Jason R. Mastaler wrote:

I've prepared a FAQ entry based on some recent discussion for how to
improve TMDA's performance. It should be particularly helpful for
installations which push a large number of messages through their
TMDA. If anyone can think of additional tips, let me know.


As you might have noticed, I've been trying a few setups to evaluate TMDA and see how it behaves. I'm researching this in order to advise a customer on a new mail system. This customer is an ISP and does local delivery for about 5 million messages a day, with peaks of up to 500 msgs/sec. Of course, some growth is expected as the amount of spam rises.

The functionality TMDA offers is interesting, especially the virtual-user ability and the configurability for end-users, but at present it would cause a lot of system overhead: 5 million invocations of tmda-filter a day, each initialising itself, loading the needed modules, etc.

Looking at other mail-related (spam/malware/virus) tools, it is striking how similarly these packages have developed. Let's take SpamAssassin, amavis and ClamAV.

Each started out as a filter that accepts a message on stdin or from a file, does its magic, and then either hands the message back to the MDA or reroutes it to the bit bucket.

As those tools became more popular and were installed on systems processing more and more messages per second, there was a need to reduce resource usage. Each of them eventually took the same route: daemonize the main process that does the meaty work, make it listen on some sort of socket (inet/unix), and fork off workers as needed.

This way, no modules need to be loaded over and over again, you can use caches in a sensible way, etc.
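
To make the idea concrete, here is a minimal sketch of such a daemon in Python. It is not based on TMDA internals: the names decide(), FilterHandler, FilterServer and serve(), the WHITELIST set, and the one-word reply protocol are all made up for illustration. The point is only that the expensive setup happens once at start-up, and each message costs a fork rather than a fresh interpreter.

```python
import email
import email.utils
import os
import socketserver

# Hypothetical stand-in for filter state. It is loaded ONCE when the daemon
# starts, instead of once per message. Not a TMDA data structure.
WHITELIST = {"friend@example.org"}


def decide(raw_message: bytes) -> str:
    """Return a one-word verdict for a raw message (illustrative logic only)."""
    msg = email.message_from_bytes(raw_message)
    sender = email.utils.parseaddr(msg.get("From", ""))[1].lower()
    return "accept" if sender in WHITELIST else "confirm"


class FilterHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # The client sends the whole message and then shuts down its write
        # side, so read() returns at end-of-message.
        raw = self.rfile.read()
        self.wfile.write(decide(raw).encode("ascii") + b"\n")


class FilterServer(socketserver.ForkingMixIn, socketserver.UnixStreamServer):
    """Forks one child per connection; modules are already loaded in the parent."""
    max_children = 64  # arbitrary cap on concurrent children for this sketch


def serve(socket_path: str) -> None:
    """Listen on a unix socket until interrupted (Unix-only, needs os.fork)."""
    if os.path.exists(socket_path):
        os.unlink(socket_path)  # remove a stale socket from a previous run
    with FilterServer(socket_path, FilterHandler) as server:
        server.serve_forever()
```

The MTA-side client would then be a tiny program that connects to the socket, writes the message, and reads back the verdict. An LMTP front end would replace the ad-hoc protocol but keep the same shape.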

Would this possibly be a way forward for TMDA as well?
Since TMDA is typically an LDA, it would even be possible to add an LMTP interface. However, I'm just theorising here; I haven't looked at TMDA internals.


Another issue for me at the moment is that the configuration is stored locally:

Typically in large-scale mail setups, users are virtual and spread over a pool of machines. These machines do not have user details locally (i.e. no uid), and delivery takes place by maildrops or other vmbox-type delivery processes. In this kind of setup it is not practical to require a local configuration file per user:

1) If you use per-machine configuration files, they will get out of sync, no matter how hard you try.
2) If you use a shared medium to store the configuration files, there may be locking issues, which could cause some headaches (for instance, updating a file using tmda-cgi from box A while tmda-filter on box B is trying to make a decision).


Is centralised configuration (using some sort of database-like solution instead of local files) also something you are considering?
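
As a rough illustration of what I mean, the sketch below keeps per-user whitelist entries in a database table instead of per-user files. Everything here is an assumption for the sake of the example: the table layout and the function names open_store(), add_sender() and is_whitelisted() are invented, and SQLite is only a stand-in so the example runs on its own. A real pool of machines would point at a networked database server.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the (hypothetical) central store and create the table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS whitelist ("
        " account TEXT NOT NULL,"
        " sender  TEXT NOT NULL,"
        " PRIMARY KEY (account, sender))"
    )
    conn.commit()
    return conn


def add_sender(conn: sqlite3.Connection, account: str, sender: str) -> None:
    """What a web front end would do: one transactional insert, no file locks."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT OR IGNORE INTO whitelist (account, sender) VALUES (?, ?)",
            (account, sender.lower()),
        )


def is_whitelisted(conn: sqlite3.Connection, account: str, sender: str) -> bool:
    """What the filter would do at delivery time: a single indexed lookup."""
    row = conn.execute(
        "SELECT 1 FROM whitelist WHERE account = ? AND sender = ?",
        (account, sender.lower()),
    ).fetchone()
    return row is not None
```

With this shape the database handles concurrent readers and writers, so the front end on box A and the filter on box B never touch the same file directly.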

I realise TMDA is an open-source project and .. you get what you pay for ;-) What I am mainly curious about is the way forward for TMDA, because I might be able to contribute on the points mentioned above, but I would like to know whether this is actually where you want TMDA to go ;-)

Grtz,

Nils.
_____________________________________________
tmda-users mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-users
