It is better to store each message in a separate file and use the
database only to keep key information about that message file. Storing
large messages in such a database is problematic; the file system is
much more efficient. You might even keep the message in a prescanned
format (even binary) for faster access.
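
A rough sketch of what I mean, in Python. The standard-library
dbm.dumb module stands in for whatever key/value database is chosen,
and the function names, directory layout and index fields are all
hypothetical, picked only for illustration:

```python
import dbm.dumb
import json
import os

def store_message(spool_dir, index_path, user, uid, raw):
    """Write one message to its own file and record key info in the index."""
    user_dir = os.path.join(spool_dir, user)
    os.makedirs(user_dir, exist_ok=True)
    path = os.path.join(user_dir, uid + ".msg")
    # Write under a temporary name, then rename, so a crash never
    # leaves a half-written message under its final name.
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(raw)
    os.replace(tmp, path)
    # Only small metadata goes into the database (hypothetical fields).
    info = {"path": path, "size": len(raw), "status": "new"}
    with dbm.dumb.open(index_path, "c") as db:
        db[f"{user}/{uid}".encode()] = json.dumps(info).encode()
    return info

def message_info(index_path, user, uid):
    """Look up the metadata for one message without touching its file."""
    with dbm.dumb.open(index_path, "r") as db:
        return json.loads(db[f"{user}/{uid}".encode()])
```

Listing a mailbox then only needs the index; the message file is
opened when the client actually retrieves the message.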


Rgds,
-GSH

> Problem:
> 
> - People leave mail in the mailbox. Scanning the mailbox every time is
> an I/O-hungry operation.
> 
> - Rewriting a partially updated mailbox is very expensive: UIDL updates,
> partial mailbox deletion, mail arriving while the popper is running...
> 
> Solution:
> 
> A simple and efficient database (key/value) used to store messages. For
> example, BerkeleyDB (http://www.sleepycat.com/)
> 
> Qpopper would have six operations:
> 
> - Translate standard mailboxes into the database.
> 
> - Serve mails from database.
> 
> - An additional tool to show per-user statistics: messages in the
> database, length, last login, quota...
> 
> - An additional tool to list and delete a specific user message.
> 
> - An additional tool to delete a user and all of that user's messages.
> 
> - An additional tool to kill all popper processes, disable POP3 logins
> and rebuild the database if necessary. This operation typically
> takes 4-5 seconds.
> 
> We could have another tool to delete messages that have already been
> read and are older than a month, for example.
> 
> Example:
> 
> You could have a central mailbox database. Every email in the database
> would have a unique UID. Every message resides in two registers
> (records), for
> example. One register contains the message body. The other register has
> the message headers, which can be modified by qpopper (UIDL, Status,
> etc).
> 
> There are per-user registers that keep data such as message UIDs,
> message lengths, quota, last login, and perhaps UIDL and Status.
> 
> There is a global register that keeps a global serial number (used as a
> UID generator), atomically updated every time a message is added to the
> database.
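
To make the layout above concrete, here is a minimal sketch in Python.
It uses the standard-library dbm.dumb module as a stand-in for
Berkeley DB, and the key names ("global:serial", "msg:...",
"user:...") are hypothetical. Note that dbm.dumb has no transactions,
so unlike the proposal the counter update here is NOT atomic; a real
implementation would wrap add_message in a database transaction.

```python
import dbm.dumb

def next_uid(db):
    """Bump the global serial number and return it as the new UID."""
    serial = int(db.get(b"global:serial", b"0").decode()) + 1
    db[b"global:serial"] = str(serial).encode()
    return serial

def add_message(db, user, headers, body):
    """Store one message as two registers and update the user's index."""
    uid = next_uid(db)
    # Headers and body live under separate keys, so the small header
    # register can be rewritten without touching the body.
    db[f"msg:{uid}:headers".encode()] = headers
    db[f"msg:{uid}:body".encode()] = body
    # Per-user register: comma-separated "uid:length" entries.
    key = f"user:{user}:index".encode()
    entry = f"{uid}:{len(headers) + len(body)}".encode()
    old = db.get(key, b"")
    db[key] = old + b"," + entry if old else entry
    return uid
```

With this layout, answering "how many messages, and how long" for a
user is a single lookup of the per-user register.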
> 
> When a user logs in via POP3, qpopper would move new messages from the
> user's standard mailbox into the database (emptying the original
> mailbox). Then the messages are served from the database. The message
> migration can also be run from a cron job, to handle mailboxes with
> infrequent logins.
> 
> The only remaining problem would be quotas, which are also a very
> problematic issue for the current qpopper. If you control the local
> mailer, you can have it talk to the database and enforce quotas there.
> 
> Advantages:
> 
> - You don't need to scan anything when you have the messages in the
> database. You know at all times how many messages a user has, their
> lengths, and so on. If new email arrives, you migrate it to the database.
> 
> - You can delete individual messages without rewriting the mailbox.
> 
> - You can modify headers without expensive I/O, since headers (typically
> <2 Kbytes) are kept separate from message bodies.
> 
> - New messages arriving while qpopper is working don't require mailbox
> rewriting.
> 
> - Berkeley DB, for example, can retrieve partial registers. That is,
> you can have a 15 MB message and you don't need to read it in one shot.
> Instead, you can read the message in 64 Kbyte chunks, for example, to
> keep memory and I/O small.
> 
> - Berkeley DB overhead in disk space and CPU is fairly small.
> 
> - Berkeley DB implements atomic transactions. In fact, you have full
> ACID semantics. A popper process can die at any time and the database
> is always consistent.
> 
> - Berkeley DB detects and resolves deadlocks when multiple processes
> access the database.
> 
> - Berkeley DB is free for non-commercial use.
> 
> - The latest Berkeley DB version supports replication.
> 
> - You can support multiple mailbox formats: mailbox and maildir, for
> example. The only impact would be programming the mailbox-to-database
> converter. This step is fairly simple.
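
On the point in the list above about reading a 15 MB message in 64
Kbyte chunks: Berkeley DB can do the partial retrieval itself, but
the same effect can be approximated in any key/value store by
splitting the body across fixed-size chunk registers. The sketch
below does that in Python with the standard-library dbm.dumb module
as a stand-in; the key names and helper functions are hypothetical.

```python
import dbm.dumb

CHUNK = 64 * 1024  # 64 Kbytes, the chunk size suggested in the proposal

def put_body(db, uid, body):
    """Store a message body as a series of CHUNK-sized registers."""
    count = 0
    for off in range(0, len(body), CHUNK):
        db[f"msg:{uid}:body:{count}".encode()] = body[off:off + CHUNK]
        count += 1
    # Remember how many chunks were written for this message.
    db[f"msg:{uid}:chunks".encode()] = str(count).encode()

def iter_body(db, uid):
    """Yield the body one chunk at a time, never holding it all in memory."""
    count = int(db[f"msg:{uid}:chunks".encode()].decode())
    for i in range(count):
        yield db[f"msg:{uid}:body:{i}".encode()]
```

A server loop can then write each chunk from iter_body to the client
socket as it is read, so memory use stays near one chunk per
connection regardless of message size.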
> 
> PS: I'm advocating Berkeley DB because I've been using it for years in
> big (millions of registers) and critical environments, and its
> performance and safety are stunning. But any similar DB will do the
> job. Note that I'm not talking about an SQL database. That's not the
> way. I'm talking about key/value databases with full ACID semantics.
> 
> -- 
> Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
> [EMAIL PROTECTED] http://www.argo.es/~jcea/ _/_/    _/_/  _/_/    _/_/  _/_/
>                                       _/_/    _/_/          _/_/_/_/_/
> PGP Key Available at KeyServ   _/_/  _/_/    _/_/          _/_/  _/_/
> "Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
> "My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
> "El amor es poner tu felicidad en la felicidad de otro" - Leibniz
