On Sat, 14 Aug 1999, David Harris wrote:

> Okay. Let's assume we've got some kind of database backend for the index file
> where we can add, remove, and modify records. How exactly would one keep it in
> sync with the Maildir? I guess each time you would have to compare the messages
> in the maildir with the messages in the index, and summarize new messages while
> deleting old ones. Also detect if a message has changed, and re-read the
> summary info. You are handling this with timestamps I assume?

Yes.  It is simply faster and easier to rebuild the index from scratch
every time you detect that the Maildir has been changed, instead of trying
to truly sync the two.  Timestamps are perfectly appropriate to detect
when a Maildir changed, as long as you are aware of certain race
conditions which are easily detectable and avoidable, and as long as the
clocks on your NFS clients and servers are all synchronized.
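
For illustration only, here is roughly what such a timestamp check could
look like.  This is a Python sketch, not the code from maildir.c: the
function name index_is_stale and the index file name index.cache are made
up for the example, and it assumes the index file lives in the top level
of the Maildir, next to new/ and cur/.  Timestamps are compared at
one-second resolution with >=, so a delivery that lands in the same second
the index was written still triggers a rebuild.

```python
import os


def index_is_stale(maildir, index_name="index.cache"):
    """Return True if the index should be rebuilt (hypothetical helper)."""
    index_path = os.path.join(maildir, index_name)
    try:
        index_mtime = int(os.stat(index_path).st_mtime)
    except FileNotFoundError:
        return True  # no index has been written yet

    for sub in ("new", "cur"):
        try:
            dir_mtime = int(os.stat(os.path.join(maildir, sub)).st_mtime)
        except FileNotFoundError:
            continue  # a missing subdirectory cannot hold new messages
        # Adding, removing, or renaming a message updates the directory's
        # mtime.  Using >= instead of > treats a change made in the same
        # second as the index write as "changed", which avoids the race
        # at the price of an occasional unnecessary rebuild.
        if dir_mtime >= index_mtime:
            return True
    return False
```

The comparison is only meaningful if the clocks that stamp the directories
and the index agree, which is the NFS caveat above.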

When you're dealing with lots of mail, you don't leave all of it in the
INBOX.  You file it all away in the folders.  I do not rebuild the index
unless it is absolutely necessary.  If I do not open a folder, I don't
need to see what's in there, so I do not need to use its index.  So,
there's no need to rebuild an index for a large folder every time you move
some messages from a small INBOX to a large folder.

The theory is that if you have a huge folder, it's probably an archive of
some sort, so you really don't access it that often, so on the rare
occasions that you need to peek into it, you'll wait a few seconds for your
initial access, while its index gets rebuilt.

> Also, what kind of summary information are you storing? Well, I guess that's
> dependent on the app and might be different for c-client.

Yes, since I only need to see the sender's name, the subject, and the
date, that's all I cache.

> I don't know if this or any Maildir app could scale to hundreds of thousands of
> messages... there is an inherent limitation in Maildir because it stores all of
> the messages in the same directory without any kind of hashing. The filesystem
> would grind to a halt before you could even deliver a hundred thousand messages
> to a maildir, I would think.

Not with a properly-tuned XFS.  Maildir under XFS should rock.

> How does your implementation keep the index updated?

Nothing too complicated.  Read the contents of the Maildir, read the
headers of every message, build the contents of the index in memory, sort
it, dump it.  And I don't have to worry about exception conditions.  If
the system crashes while the index is partially written out, biiiiiig
deal.  After the next message is delivered into the Maildir, the index
will be rebuilt from scratch.
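
The steps above (scan, read headers, build in memory, sort, dump) can be
sketched as follows.  Again this is Python for illustration, not what
maildir.c does line for line: rebuild_index, the index.cache file name,
the tab-separated record layout, and sorting by the Date header are all
assumptions of the example.  It caches only the sender's name, the
subject, and the date, and it writes the file in place without any crash
protection, since a damaged index simply gets rebuilt.

```python
import os
from email.parser import BytesHeaderParser
from email.utils import mktime_tz, parseaddr, parsedate_tz


def _clean(value):
    """Flatten a header value to one line so it fits a tab-separated record."""
    return " ".join(str(value or "").split())


def rebuild_index(maildir, index_name="index.cache"):
    """Rebuild the summary index from scratch and return the sorted entries."""
    parser = BytesHeaderParser()
    entries = []
    for sub in ("new", "cur"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        for name in os.listdir(folder):
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as fp:
                    headers = parser.parse(fp)  # message body is ignored
                fallback = os.path.getmtime(path)
            except FileNotFoundError:
                continue  # message was moved or deleted while we scanned
            display_name, address = parseaddr(_clean(headers.get("From")))
            parsed = parsedate_tz(_clean(headers.get("Date")))
            try:
                when = mktime_tz(parsed) if parsed else fallback
            except (ValueError, OverflowError):
                when = fallback  # unusable Date header: use the file time
            entries.append((int(when), display_name or address,
                            _clean(headers.get("Subject")),
                            os.path.join(sub, name)))

    entries.sort()  # oldest message first

    index_path = os.path.join(maildir, index_name)
    with open(index_path, "w", encoding="utf-8", errors="replace") as out:
        for when, sender, subject, relpath in entries:
            out.write(f"{when}\t{sender}\t{subject}\t{relpath}\n")
    return entries
```

Calling rebuild_index("/path/to/Maildir") leaves one line per message in
the index file, which is all a folder listing needs to display.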

>                                                      If you can scale to 1,000
> messages,

My trashcan is a maildir folder, and I have set my retention interval to
7 days.  I usually have 700-800 messages held in the Trash, waiting to be
purged out.  On the couple of occasions where I had to open the Trash to
find something, I hung for a couple of seconds, and that was about it.  But
that was on a P-200MMX workstation with

>           I'll bet that's good enough for most people and it's far better than
> plain-old-Maildir. Also, where in your source code is this done?

All the caching logic is in maildir.c -- it has lots of other stuff as
well, but everything that deals with the cache is in there.

