Better to store each message in a separate file and use the database only to keep key info about that message file. Storing large messages in such a db is problematic; the file system is much more efficient. You might even keep the message in a prescanned format (even binary) for faster access.
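A rough sketch of what I mean, in Python. The class name, the directory layout and the JSON index are all my own assumptions for illustration; the JSON file only stands in for whatever database ends up holding the key info, and there is no locking here at all.

```python
import json
import os


class MessageStore:
    """One file per message, plus a small index holding key info about each file."""

    def __init__(self, root):
        self.msg_dir = os.path.join(root, "msgs")          # hypothetical layout
        self.index_path = os.path.join(root, "index.json")  # stand-in for the db
        os.makedirs(self.msg_dir, exist_ok=True)
        if os.path.exists(self.index_path):
            with open(self.index_path, encoding="utf-8") as fh:
                self.index = json.load(fh)
        else:
            self.index = {"next_uid": 1, "messages": {}}

    def _save(self):
        # Write a temp file and rename it, so a crash never leaves a torn index.
        tmp = self.index_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(self.index, fh)
        os.replace(tmp, self.index_path)

    def _path(self, uid):
        return os.path.join(self.msg_dir, f"{uid}.msg")

    def add(self, user, raw):
        """Store raw message bytes in their own file and record user and size."""
        uid = self.index["next_uid"]
        self.index["next_uid"] = uid + 1
        with open(self._path(uid), "wb") as fh:
            fh.write(raw)
        self.index["messages"][str(uid)] = {"user": user, "size": len(raw)}
        self._save()
        return uid

    def list_for(self, user):
        # Answered from the index alone; no message file is opened.
        return sorted(
            (int(uid), meta["size"])
            for uid, meta in self.index["messages"].items()
            if meta["user"] == user
        )

    def read(self, uid):
        with open(self._path(uid), "rb") as fh:
            return fh.read()

    def delete(self, uid):
        # Removing one message touches one file and the index, nothing else.
        del self.index["messages"][str(uid)]
        self._save()
        os.remove(self._path(uid))
```

Listing a user's messages and sizes never touches the message files, and deleting one message never rewrites the others. A real popper would also need locking between processes, which this leaves out.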
Rgds,
-GSH

> Problem:
>
> - People leave mail in the mailbox. Scanning the mailbox every time is
>   an I/O-hungry operation.
>
> - Rewriting a partially updated mailbox is very expensive: UIDL updates,
>   partial mailbox deletion, mail arriving while the popper is running...
>
> Solution:
>
> A simple and efficient key/value database used to store messages, for
> example Berkeley DB (http://www.sleepycat.com/).
>
> Qpopper would have six operations:
>
> - Translate standard mailboxes into the database.
>
> - Serve mail from the database.
>
> - An additional tool to show statistics about users: messages in the
>   database, length, last login, quota...
>
> - An additional tool to list and delete a specific user message.
>
> - An additional tool to delete a user and all of that user's messages.
>
> - An additional tool to kill all popper processes, disable POP3 logins
>   and reconstruct the database if necessary. This operation typically
>   takes 4-5 seconds.
>
> We could have another tool to delete messages that have already been
> read and are older than a month, for example.
>
> Example:
>
> You could have a central mailbox database. Every email in the database
> would have a unique UID. Every message resides in two records, for
> example. One record contains the message body. The other record has
> the message headers, which can be modified by qpopper (UIDL, Status,
> etc).
>
> There are per-user records to keep data like message UIDs, message
> lengths, quota, last login, perhaps UIDL and Status.
>
> There is a global record that keeps a global serial number (used as a
> UID generator), atomically updated every time a message is added to the
> database.
>
> When a user logs in via POP3, qpopper would translate new messages in
> the user's standard mailbox into the database (erasing the original
> mailbox). Then the messages are served from the database. The message
> migration can also be implemented with a cron job, to migrate mailboxes
> with infrequent logins.
>
> The only remaining problem would be "quotas", a very problematic issue
> for current qpopper as well. If you control the local mailer, you can
> talk to the database and enforce quotas there.
>
> Advantages:
>
> - You don't need to scan anything when you have the messages in the
>   database. You know, at all times, how many messages a user has, their
>   length, and so on. If new email arrives, you migrate it to the
>   database.
>
> - You can delete individual messages without needing to rewrite the
>   mailbox.
>
> - You can modify headers without expensive I/O, since headers (typically
>   <2 Kbytes) are kept separate from message bodies.
>
> - New messages arriving while qpopper is working don't require mailbox
>   rewriting.
>
> - Berkeley DB, for example, can retrieve partial records. That is,
>   you can have a 15 MB message and you don't need to read it in one go.
>   In fact, you can read the message in 64 Kbyte chunks, for example, to
>   keep memory and I/O small.
>
> - Berkeley DB overhead in disk space and CPU is fairly small.
>
> - Berkeley DB implements atomic transactions. In fact, you have full
>   ACID semantics. A popper process can die at any time and the database
>   is always consistent.
>
> - Berkeley DB detects and resolves deadlocks when multiple processes
>   access the database.
>
> - Berkeley DB is free for non-commercial use.
>
> - The latest Berkeley DB version supports replication.
>
> - You can support multiple mailbox formats: mailbox and maildir, for
>   example. The only impact would be writing the mailbox-to-database
>   converter. This step is fairly simple.
>
> PS: I'm advocating Berkeley DB because I have been using it for years in
> big (millions of records) and critical environments, and its
> performance and safety are stunning. But any similar DB will do the
> job. Note that I'm not talking about SQL databases. That's not the
> way. I'm talking about fully ACID key/value databases.
>
> --
> Jesus Cea Avion
> [EMAIL PROTECTED] http://www.argo.es/~jcea/
> PGP Key Available at KeyServ
> "Things are not so easy"
> "My name is Dump, Core Dump"
> "Love is placing your happiness in the happiness of another" - Leibniz
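
For comparison, the key/value layout in the quoted proposal (a global serial number, two records per message, one record per user) might look roughly like the Python sketch below. The key names, the JSON encoding of the per-user record and all function names are assumptions made up for illustration. The store is any dict-like mapping of bytes to bytes rather than Berkeley DB itself, so nothing here is transactional, and the chunked read only slices a value already in memory; it does not use Berkeley DB's partial record retrieval.

```python
import json

CHUNK = 64 * 1024  # chunk size mentioned in the proposal


def next_uid(db):
    """Bump the global serial number. A real store would do this atomically."""
    uid = int(db.get(b"serial", b"0")) + 1
    db[b"serial"] = str(uid).encode()
    return uid


def _load_user(db, user):
    return json.loads(db.get(f"user:{user}".encode(), b'{"msgs": {}}'))


def _save_user(db, user, rec):
    db[f"user:{user}".encode()] = json.dumps(rec).encode()


def add_message(db, user, headers, body):
    """Store headers and body as two separate records and update the user record."""
    uid = next_uid(db)
    db[f"msg:{uid}:hdr".encode()] = headers
    db[f"msg:{uid}:body".encode()] = body
    rec = _load_user(db, user)
    rec["msgs"][str(uid)] = len(headers) + len(body)
    _save_user(db, user, rec)
    return uid


def stat(db, user):
    """Message count and total size, read from the user record without any scan."""
    sizes = _load_user(db, user)["msgs"].values()
    return len(sizes), sum(sizes)


def set_headers(db, user, uid, headers):
    """Rewrite only the small header record; the body record is left untouched."""
    body_len = len(db[f"msg:{uid}:body".encode()])
    db[f"msg:{uid}:hdr".encode()] = headers
    rec = _load_user(db, user)
    rec["msgs"][str(uid)] = len(headers) + body_len
    _save_user(db, user, rec)


def delete_message(db, user, uid):
    """Drop one message without touching any other message."""
    del db[f"msg:{uid}:hdr".encode()]
    del db[f"msg:{uid}:body".encode()]
    rec = _load_user(db, user)
    del rec["msgs"][str(uid)]
    _save_user(db, user, rec)


def iter_body(db, uid, chunk=CHUNK):
    """Yield the body in fixed-size chunks (plain slicing in this sketch)."""
    body = db[f"msg:{uid}:body".encode()]
    for start in range(0, len(body), chunk):
        yield body[start:start + chunk]
```

A plain `dict` is enough to try it out, for example `db = {}` followed by `add_message(db, "alice", b"Subject: a\r\n", b"hello")`. Whether large bodies belong in such records or in separate files is exactly the point argued at the top of this reply.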
