> The IMAP server (judging from all the code in maildir.c) never changes
> a message. The only thing it changes is flags, which are file name
> changes.
Right. I was wrong when I thought it did.
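For anyone following along, here is a rough sketch of what "flags are file
name changes" means in practice. It is written in Python for brevity and is not
taken from maildir.c; the function name set_flags is made up for illustration.
It relies only on the standard Maildir naming convention, where flags follow a
":2," suffix in ASCII order.

```python
import os

def set_flags(path, flags):
    """Rename a Maildir message file so that its name carries `flags`.

    Hypothetical helper, not the maildir.c implementation. Only the
    file name changes; the message contents are never rewritten.
    """
    directory, name = os.path.split(path)
    # Drop any existing ":2,FLAGS" info suffix to get the unique base name.
    base = name.split(":2,", 1)[0]
    # Maildir flags are single letters stored in ASCII order, no duplicates.
    new_name = base + ":2," + "".join(sorted(set(flags)))
    new_path = os.path.join(directory, new_name)
    if new_path != path:
        os.rename(path, new_path)
    return new_path
```

For example, calling set_flags on "cur/1234.host:2,S" with flags "SR" renames
the file to "cur/1234.host:2,RS" and leaves the message body alone.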
> The question is what sort of performance gain you would get using a
> database alongside (assuming you can't use locking, so you have to copy
> the entire thing for every update you want to make to the database)?
>
> When you make your local copy, should you use that copy for the entire
> IMAP session and only copy it back when you are done (or do some sort of
> checkpointing, moving it back every 10 minutes or so)? Or should you
> copy/move it for every single change?
>
> How big would the database get with about 1000-2000 messages?
Hmm... you bring up some valid concerns with the "copy and move" strategy for
maintaining a database without locking. I'll have to do some more thinking and
investigation and see what I come up with. For example, I'm not even sure what
header information should be stored in the database.
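To make the "copy and move" idea concrete, here is a minimal sketch in Python.
Everything in it is an assumption for discussion: the JSON file standing in for
the database, the update_summary name, and the fields in each entry. It shows
the cost being asked about (the whole file is rewritten for every update) and
the one thing the strategy buys us: readers never see a half-written file,
because the rename is atomic.

```python
import json
import os
import tempfile

def update_summary(db_path, uid, entry):
    """Update one record by rewriting the whole summary file.

    Hypothetical sketch: the JSON format and entry fields are assumptions.
    """
    try:
        with open(db_path) as f:
            summary = json.load(f)
    except FileNotFoundError:
        summary = {}
    summary[str(uid)] = entry

    # Write the full copy to a temp file in the same directory, so the
    # final rename stays on one filesystem and is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(db_path) or ".")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(summary, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, db_path)  # the "move" step
    except BaseException:
        os.unlink(tmp_path)
        raise
    return summary
```

Note that this does not solve the lost-update problem: without locking, two
sessions that copy, modify, and move at the same time will have the last writer
win. That is one reason the Maildir itself would need to stay the authority.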
> I guess the only way to know for sure is to write the code and measure the
> performance, but any ideas? Would it really help?
Well, I'm particularly concerned about web-based e-mail clients, which I
suspect have to grab a listing of all the messages in a folder whenever they
show the Inbox, whereas terminal clients would only grab a listing once a
session, or less often if they keep their own database.
I think it would be a good idea to do some testing with a real web-based e-mail
client and the current Maildir driver. I could set up recordio to capture the
IMAP conversation between the server and the client, and run strace on the
client to see how labored things get with lots of messages.
> I can help write the code (or write it) if people think it might speed
> things up... I have already tweaked the UW/Imap code to split the Maildir
> cur directory into 10 sub directories so that each sub directory has a
> balanced number of messages in it. (This increases speed for flag changes
> significantly since flag changes are filename changes, which if you have
> 5000 files in a single directory can be time consuming :) The next step
> would be to use a database for UID/Header/Flag data, where the Maildir is
> still the "authority" on the data but there is a helper database that can
> be consulted.
It would be great if I could get my hands on this patch from you. The patch you
describe and any summary database that we might develop would go a long way
towards making the Maildir driver more robust for large mailboxes, and I'd like
to post them on www.davideous.com/imap-maildir/.
Also, you mention the idea of moving the UID/Header/Flag data into a database
while keeping backward compatibility. I'm not sure that we really have to keep
backward compatibility, because the 10 hashed cur directories that you have
created already killed the ability for other Maildir clients to read the
Maildir, right?
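Since I haven't seen the patch yet, here is only a guess at what a hashed cur
layout could look like, sketched in Python. The CRC32 hash, the bucket_for and
hashed_path names, and the "cur/<digit>/" layout are all my assumptions, not a
description of your code. The point it illustrates is that the bucket should
depend only on the unique part of the file name, so a flag change stays a
rename within one sub directory.

```python
import os
import zlib

NUM_BUCKETS = 10  # matches the 10 sub directories mentioned above

def bucket_for(filename):
    """Pick a sub directory from the unique part of a Maildir file name.

    Hypothetical scheme: the real patch may balance directories differently.
    """
    # Ignore the ":2,FLAGS" suffix so flag changes never move a message
    # to a different bucket.
    base = filename.split(":2,", 1)[0]
    return zlib.crc32(base.encode("utf-8")) % NUM_BUCKETS

def hashed_path(maildir, filename):
    """Return where the message would live, e.g. <maildir>/cur/<n>/<filename>."""
    return os.path.join(maildir, "cur", str(bucket_for(filename)), filename)
```

With a layout like this, a client that expects every message directly under
cur/ would presumably find nothing there, which is the compatibility loss I am
asking about.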
- David Harris
Principal Engineer, DRH Internet Services