On Wed, 4 Aug 2004, Sam Varshavchik wrote:

> Jon Nelson writes:
> >
> > I recently gave Dovecot a try.
> > It's not nearly as featureful (or seemingly as stable) as
> > courier-imap, but it has one very important distinction:
> >
> > It is *wicked* fast.
> >
> > It made me think - indexes are what makes dovecot so fast.  What would
> > it take to add similar indexing capabilities to courier-imap?
> By showing a need for it without using the word “benchmark”.

OK.  I know for a fact that a number of people refuse to use
courier-imap because it is much too slow when dealing with large
mailboxes.  By "large" I mean anything much over 10 thousand or so
entries.  Don't talk to me about filesystems, or CPU speed, or anything
else.  As I explain below, on identical hardware with identical
filesystems with identical everything, dovecot was much faster.

This is how I arrived at that conclusion:

To satisfy my own curiosity, I installed dovecot, because it could use
my maildirs in-place.  Most of the server-created files are identical,
if named differently (courierimapuiddb is the same as dovecot-uid, or
whatever, and so on).

I cracked open my client, pine, and opened a folder. It contains 4000
messages.  I then sorted and timed the operation. It took 17 seconds.
I repeated this operation several times, and consistently got between
5 and 6 seconds (I assume because of cache effect, because the client
sends the same request each time.)

I closed pine, shut down courier-imap, and started dovecot.  Performing
the same tests on the same folder netted me a re-sorted folder in less
than 1 second every time.  There was a very slightly longer delay
opening the folder, the first time (I assume because of the time it
took to index the folder).  Subsequent openings were faster in
dovecot than in courier.

Then I performed the tests back-to-back.  courier then dovecot (and vice
versa), each time deleting the indexes.  The results were fairly
consistent -- dovecot produced *no* delay opening a folder once the
index had been created, and *no* delay to sort the folder in any way I
chose.  You'll notice the caveat, which I explained in the previous
paragraph.

An example of the IMAP session follows.  Here, I am sorting by THREAD.

19:58:43.823813 read(0, "0000001a THREAD REFERENCES US-ASCII ALL\r\n",
8192) = 41
19:58:55.474217 write(1, "* THREAD ((2)(3)(4 ....

As you can see, it took about 12 seconds (11.65 or so) to perform the
operation.  I performed these tests on several folders.
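
For anyone who wants to check the arithmetic, here is a minimal sketch in
Python.  It assumes the two timestamps are same-day wall-clock times in the
HH:MM:SS.ffffff form shown in the trace above (the form "strace -tt"
prints).  The helper name elapsed_seconds is my own.

```python
from datetime import datetime

def elapsed_seconds(start, end):
    """Seconds between two same-day HH:MM:SS.ffffff timestamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()

# Timestamps copied from the read() and write() lines in the trace above.
print(elapsed_seconds("19:58:43.823813", "19:58:55.474217"))
```

This only covers the gap between the server reading the THREAD command and
starting to write the reply, so it does not include network or client time.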

The fact of the matter is that on identical hardware, with identical
mailboxes, dovecot was faster, sometimes much much faster, than
courier-imap.

Searching and sorting are the two easiest ways of experiencing this
difference.  The client used was pine.

> People do not use IMAP servers to run benchmarks.  People use IMAP servers
> to read mail.  All IMAP-reading mail clients that might be considered
> popular will cache all message metadata.  When you're scrolling through the
> folder's index the IMAP client is not going to issue a server request for
> every new message that's scrolled into view.  All of the message metadata
> will be cached.  So if the mail client wants to resort the folder it won't
> ask the server to do it, it'll do the job itself.

I have as much healthy disdain for synthetic benchmarks as the next guy.

> So, if you want to evaluate indexing you need to take a reasonably
> popular IMAP client, log its IMAP commands, then show how indexing
> will help. Arbitrary benchmarks won't cut it, and adding indexes for
> the benefit of a lesser-used IMAP tool will come at the expense of
> greater overhead for the rest of the IMAP clients, which makes no
> sense.

If you consider pine a 'lesser-used' IMAP tool, then what you are saying
in effect is that courier-imap is not suitable for use with pine and
large mailboxes.

Why does adding an index necessarily create more overhead?  (The real
question here is whether the additional overhead results in a more
efficient or faster time-to-response on client queries).

Certainly there is overhead in maintaining an index, but the purpose of
an index is faster data retrieval.  It stands to reason that having an
index would benefit even very sophisticated IMAP clients.
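
To make the trade-off concrete, here is a toy sketch in Python.  It is NOT
dovecot's index format or anything courier-imap does; it is a hypothetical
in-memory cache (the class name SortIndex and its fields are made up) that
assumes every message has a parseable Date header, all with explicit
timezone offsets.  The point is only that the parsing cost is paid once per
message instead of once per sort.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

class SortIndex:
    """Toy per-folder cache of parsed Date headers, keyed by message UID."""

    def __init__(self):
        self.dates = {}   # uid -> datetime parsed from the Date header
        self.parses = 0   # how many messages had to be parsed so far

    def date_of(self, uid, raw):
        # Parse the message only if this UID is not in the cache yet.
        if uid not in self.dates:
            self.parses += 1
            msg = message_from_string(raw)
            self.dates[uid] = parsedate_to_datetime(msg["Date"])
        return self.dates[uid]

    def sorted_uids(self, folder):
        """folder is a dict mapping uid -> raw message text."""
        return sorted(folder, key=lambda uid: self.date_of(uid, folder[uid]))
```

The overhead is the extra state that has to be kept and updated when mail
arrives.  The benefit is that a second sort of an unchanged folder parses
nothing, and a folder with one new message parses only that message.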

One of the reasons IMAP exists is because people wanted *less* storage
on the client end, *less* client-side state from

The overwhelming majority of IMAP clients that I'm aware of don't store
much if anything at all between invocations (this statement is clearly
at odds with your statement: "All IMAP-reading mail clients that might
be considered popular will cache all message metadata.".)  Certainly,
some do, but for many people it's simply not practical to have to use
the same client on the same machine all the time.  It may be true that
Outlook and friends do cache metadata, but pine doesn't, and pine is
very popular, too.  By looking at this mailing list's User-Agent and
X-Mailer strings, Mozilla is the most popular, trailed by outlook, mutt,
sylpheed, kmail, imp, squirrelmail, and even Apple Mail.  Pine does not
set either of those headers, but out of 1550 messages I counted 182 uses
of pine in the message id (approx. 12%).  Hardly scientific.
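
A rough sketch of that kind of count, in Python, for anyone who wants to
repeat it on their own archive.  This is not the exact procedure I used; the
function names classify and tally are made up, and it assumes each message
is available as raw RFC 822 text and that pine can be recognised by a
Message-ID beginning with "Pine.".

```python
from collections import Counter
from email import message_from_string

def classify(raw):
    """Return a coarse, lower-cased client label for one raw message."""
    msg = message_from_string(raw)
    agent = msg.get("User-Agent") or msg.get("X-Mailer") or ""
    # Take the first word of the header, ignoring any "/version" suffix.
    words = agent.replace("/", " ").split()
    if words:
        return words[0].lower()
    # Assumption: pine sets neither header but stamps its Message-IDs.
    if "pine." in (msg.get("Message-ID") or "").lower():
        return "pine"
    return "unknown"

def tally(raw_messages):
    """Count client labels over an iterable of raw messages."""
    return Counter(classify(raw) for raw in raw_messages)
```

Calling tally() on a list of raw messages gives a Counter, and
most_common() on that gives the ranking.  It is still hardly scientific,
since it counts posts rather than people.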

It's not my intention to start a war here, but I can't ignore my
experiences.  I was wondering if you (Sam) had any intention of adding
indexes, in the future or ever at all, and if not, whether that decision
is set in stone.

I'm also not going to switch any time soon, but for some people this is
a really big issue.

Ensign Walnut approaches Dr. Crusher with caution...

C and Python Code Gardener
