NOTE: This is a technical digression on the ethics of MTAs, as Brian B. has inculcated them in my brain over the past 4 years or, better, as I have developed his teachings. (Brian is a master of Qmail, and Qmail is the master of MTAs, mastery being a transitive property.) I am just a slow learner...
So, if you don't want to get into technical details about email, well, hit DELETE now...

On 26/6/03 17:28, "Stefano Mazzocchi" <[EMAIL PROTECTED]> wrote:

> [...]
> So, one night, when I was visiting them in London, Pier and I sit down
> and talked about how feasible/useful/dangerous was to update our email
> infrastructure to JAMES. [despite the little coding I've done on it, I'm
> still emotionally attached to the idea of having an entire email
> infrastructure based on the beauty of java modularity and pluggability]
>
> It turned out that Pier had pretty rock solid arguments *not* to use
> JAMES as a MTA and all came from the sysadm paranoia that he grew
> accustomed to (and which I totally lack, given my very basic sysadm
> skills and experience).
>
> Unfortunately, I don't recall exactly what his arguments were, Pier, do
> you have a minute to chime in? I think the JAMES people would love to
> hear your criticism.

There are quite a number of advantages in running a native MTA rather than a Java-only solution on UNIX systems, all derived from one single winning point: multi-processing.

Let's try to identify the main components of an MTA:

The most important piece is the mail queue: a queue is transient storage where messages are held temporarily during the message processing stage. There might be different queues per MTA (incoming, outgoing, in-process), but one point is fundamental: the queue needs to be fast and reliable, and the fewer messages there are in any queue at any time, the better.

The other vital part is the "injector", i.e. something that reads a message from somewhere (file, network, another queue) and stores it in a queue.

The third part is the "despooler", which takes a message from the queue and delivers it somehow (to a file, a pipe, through the network, or to another queue).

The fourth and final component is the "processor", which is no more and no less than the union of an injector and a despooler that operates only on queues (a processor reads a message from the queue, does something with it, and puts it back into the queue).

Diagram:

           +----------+      +-------+      +-----------+
INPUT----->| injector |----->|       |----->| despooler |----->OUTPUT
           +----------+      | queue |      +-----------+
                         +-->|       |--+
                         |   +-------+  |
                         |              V
                        +----------------+
                        |   processor    |
                        +----------------+

Complicate it as much as you want, but this is the basic layout... Add to this diagram another component, pervasive throughout the entire drawing: a "controller", or something that makes all those separate components talk together.

All those components must run asynchronously, independently, completely separated from one another and (for security) under different user privileges. NOTHING (apart from the master controlling daemon, doing next to nothing) runs as root.

The lifecycle of the MTA, then, is the following:

1) the controller starts up (as root) and binds to all required listening ports.
2a) once a connection from the input is established (to the controller), the controller forks, downgrades to the "queue" user and executes the "queue" process.
2b) the controller forks again, downgrades to the "injector" user and executes the "injector" process.
2c) the controller connects (usually via pipes, but it could also work over the local network) the output of the newly created "injector" to the input of the "queue" created in step 2a.
2d) the message is read from the original INPUT as the "injector" user, and "piped" to the queue, where the other process stores it as the "queue" user. No I/O happens as ROOT (call it defensive programming, Brian B., late 1999).
3) once the message is in the queue, if required the controller connects "queue" with "processor" and again with "queue", in a similar way to the one described in step 2.
   This happens as many times as required (a message can be re-injected, altered, god knows what, but again, nothing runs as "root" and everything is isolated from everything else).
4) once the message doesn't require further processing, again as in step 2, the controller connects "queue" with "despooler" and sends the message.

So, overall, every single part is completely isolated from every other, nothing runs as a privileged user, and no process has the power to interact with or disrupt the operation of another, apart from the controller, and all that one does is "create pipes, fork, downgrade, and execute".

Notably, each interaction is transactional (so, for example, unless the "queue" process terminates successfully, the SMTP injector won't report to the other end that the message has been accepted, and so on)... No messages are lost (in theory).

You see how multi-processing can hugely help in terms of reliability and security, but there are several other advantages: every process is TINY (on qmail, in the order of 1 megabyte), it's fast to create, it's fast to destroy, and it runs in its own little sandbox: if it dies (out of memory?) all other running processes are untouched...

That is _BY_FAR_ the best architecture ever: it might not be the fastest one, but for sure it is the most secure and reliable.

Plus, most of the time you get the advantage of being able to run other processes alongside it. For example, anti-virus engines, anti-spam engines, or even the servers that MUAs (mail user agents) talk to, like an IMAP/POP3 server, are all little tiny things: they come packaged as simple binaries, and can be executed completely independently of, and completely separated from, the whole mail injection-process-despooling thing.

Now, take our (betaversion) example:

> Now, Pier, Fede and I share our email infrastructure on betaversion.org.
> It's a pretty complex (and very powerful) setup made with
> qmail+cyrus+bogofilter+sieve+Horde/IMP
> [...]
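
An aside before getting to that setup: the controller's "create pipes, fork, downgrade, and execute" job, and the transactional hand-off from injector to queue, can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not qmail's code: `spawn`, `accept_message`, the `INJECTOR` and `QUEUE` stand-in programs and the uid parameters are all hypothetical names made up for illustration. The privilege drop only happens when the controller was started as root, and a real setup would also drop group privileges.

```python
import os
import sys

# Hypothetical stand-ins for the real binaries: each is a tiny separate program.
# The "injector" copies its input to its output; the "queue" stores what it
# receives in a new spool file and exits non-zero if it cannot.
INJECTOR = [sys.executable, "-c",
            "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]
QUEUE = [sys.executable, "-c", (
    "import os, sys\n"
    "data = sys.stdin.buffer.read()\n"
    "with open(sys.argv[1], 'xb') as f:\n"
    "    f.write(data)\n"
    "    f.flush()\n"
    "    os.fsync(f.fileno())\n"
)]


def spawn(argv, stdin_fd, stdout_fd, uid=None):
    """Fork, downgrade (if we are root), wire stdin/stdout, then execute argv."""
    pid = os.fork()
    if pid == 0:  # child
        try:
            if uid is not None and os.getuid() == 0:
                os.setuid(uid)  # assumption: dropping the uid alone is enough here
            os.dup2(stdin_fd, 0)
            os.dup2(stdout_fd, 1)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if something above failed
    return pid


def accept_message(input_fd, spool_path, injector_uid=None, queue_uid=None):
    """Wire INPUT -> injector -> queue; return True only if both exit with 0."""
    pipe_r, pipe_w = os.pipe()                  # injector -> queue
    devnull = os.open(os.devnull, os.O_WRONLY)  # the queue writes nothing to stdout
    queue_pid = spawn(QUEUE + [spool_path], pipe_r, devnull, queue_uid)
    injector_pid = spawn(INJECTOR, input_fd, pipe_w, injector_uid)
    # The controller does no message I/O itself: it closes its copies and waits.
    for fd in (input_fd, pipe_r, pipe_w, devnull):
        os.close(fd)
    accepted = True
    for pid in (injector_pid, queue_pid):
        _, status = os.waitpid(pid, 0)
        if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
            accepted = False
    return accepted
```

The controller never reads the message: it only hands descriptors to the children and looks at their exit status. If the queue process fails for any reason, `accept_message` returns False, which is the point where an SMTP front end would refuse to acknowledge the message to the other end.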

Qmail works as described above (N processes running as N users doing the different bits and bobs), bogofilter runs as a "processor" completely separated from the Qmail processors (the ones doing alias rewriting and stuff), cyrus runs as an injector (and is itself split into several different processes as well), and sieve/horde and the like run under Apache...

What do I get from all this "separation" and independence? Example: Qmail fails (or I have to take it down for some odd reason)? Stefano can still read his email from Horde via Apache on his IMAP store running on Cyrus. Cyrus crashes? Well, the message my mom sent me at the same time is queued by Qmail and will be delivered when Cyrus comes back up. I start disliking Qmail? Simple, I install postfix and don't touch anything else in my entire configuration...

It's a "concerto", as Stefano correctly pointed out, of interconnected but completely independent and self-reliant pieces of software, and it works...

Now, when I look at James, I see a nice mail server, yeah, cool, but it has everything inside it... SMTP server, QUEUE, mailing list processor, MUA, SMTP client, web server, EVERYTHING running in one big huge process, all with the same privileges from the OS point of view, and you know what? If my SMTP engine causes a JVM internal error, my IMAP, my webmail, my mailing lists and my outgoing SMTP queue are stalled as well...

NOT nice. Actually, it looks a lot like Lotus Notes running on a Windoze Server... Bulky, monolithic, hardly scalable or interoperable with other software...

    Pier
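
P.S. The "Cyrus crashes, Qmail keeps the message" behaviour boils down to a despooler that removes a message from the queue only after delivery has succeeded. A minimal Python sketch, assuming a spool directory with one file per message; `despool_once` and the `deliver` callback are hypothetical names for illustration, not anything taken from qmail or Cyrus:

```python
import os
from typing import Callable


def despool_once(spool_dir: str, deliver: Callable[[bytes], None]) -> int:
    """Try to deliver every queued message once; return how many stay queued."""
    remaining = 0
    for name in sorted(os.listdir(spool_dir)):
        path = os.path.join(spool_dir, name)
        with open(path, "rb") as f:
            message = f.read()
        try:
            deliver(message)
        except OSError:
            remaining += 1   # the store is down: keep the message, retry next pass
        else:
            os.unlink(path)  # remove from the queue only after a successful delivery
    return remaining
```

Run it periodically: while the mail store refuses connections the messages simply stay in the spool, and the first pass after it comes back delivers them.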