On Fri, 17 Dec 2004, Peter Stuge wrote:
>On Thu, Dec 16, 2004 at 11:32:06PM +0100, Andreas Aardal Hanssen wrote:
>> PS: And yes, I know it's been mentioned before, but the perfect mailbox
>>     format has yet to be discovered. For example, one that allows snappy
>>     access even with 300000 messages, one that allows setting of flags
>>     without race conditions (such as the renaming in Maildir, which is
>>     very inefficient with big mailboxes with many accessors, such as a
>>     shared mailing list), one that allows moving, adding and deleting
>>     messages fast, and in a crash-safe way. One that deals with modern
>>     principles such as indexing, shared mailboxes and high concurrency.
>>     That would be something. :)
>Sounds a lot like an indexed SQL server..
>Does anyone on the list have experience from mail stored e.g. in
>MySQL?

No database that I know of can handle the sheer volume of email arriving at
an email hub of a decent size. Even the industry-standard email hubs use
specialized storage to speed things up. The speed advantage of direct
access over the latency of going through an SQL server is very significant.
And the disk space and memory requirements of huge databases are enormous.

I used to work at such an email hub. At peak it pumped over a million
emails per mail exchanger every day, and we had twelve of them. We stored
1.2 terabytes of email data, in about 1 billion (1024*1024*1024) emails.
Our mailbox format was maildir, the mail server was qmail. Now if someone
can come up with a database that can store random data of sizes varying
from 512 bytes to several megabytes, at 10-15 arrivals a second, I'd be
willing to change my opinion. ;)
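
For a sense of scale, the figures above can be turned into rates with some
quick arithmetic. This is only a rough sketch: it assumes the peak daily
volume is spread evenly over 24 hours, treats "over a million" as exactly
one million, and reads 1.2 terabytes as 1.2e12 bytes. The variable names
are illustrative only.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the message.
SECONDS_PER_DAY = 24 * 60 * 60

mails_per_exchanger_per_day = 1_000_000  # "over a million", taken as a floor
exchangers = 12
stored_bytes = 1.2e12                    # assumption: decimal terabytes
stored_messages = 1024 ** 3              # "about 1 billion (1024*1024*1024)"

# Assumption: arrivals are spread evenly over the day (real traffic is bursty).
per_exchanger_rate = mails_per_exchanger_per_day / SECONDS_PER_DAY
aggregate_rate = per_exchanger_rate * exchangers
average_message_bytes = stored_bytes / stored_messages

print(f"per exchanger: {per_exchanger_rate:.1f} msgs/s")
print(f"all exchangers: {aggregate_rate:.0f} msgs/s")
print(f"average stored message: {average_message_bytes:.0f} bytes")
```

Under those assumptions the per-exchanger rate comes out at roughly 11.6
messages a second, in line with the 10-15 arrivals a second mentioned
above, while the twelve exchangers together see about 139 a second. The
average stored message is a little over 1 KB.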

Andy :-)

--
Andreas Aardal Hanssen   | http://www.andreas.hanssen.name/gpg
Author of Binc IMAP      |  "It is better not to do something
http://www.bincimap.org/ |        than to do it poorly."
