On Wed, 4 Aug 2004, Sam Varshavchik wrote:

> Jon Nelson writes:
>
> > I recently gave Dovecot a try.
> > It's not nearly as featureful (or seemingly as stable) as
> > courier-imap, but it has one very important distinction:
> >
> > It is *wicked* fast.
> >
> > It made me think - indexes are what makes dovecot so fast. What would
> > it take to add similar indexing capabilities to courier-imap?
>
> By showing a need for it without using the word “benchmark”.

OK. I know for a fact that a number of people refuse to use
courier-imap because it is much too slow when dealing with large
mailboxes. By "large" I mean anything much over 10 thousand or so
entries. Don't talk to me about filesystems, or CPU speed, or anything
else. As I explain below, on identical hardware with identical
filesystems with identical everything, dovecot was much faster.

This is how I arrived at that conclusion:

To satisfy my own curiosity, I installed dovecot, because it could use
my maildirs in-place. Most of the server-created files are identical,
if named differently (courierimapuiddb is the same as dovecot-uid, or
whatever, and so on).

I cracked open my client, pine, and opened a folder. It contains 4000
messages. I then sorted the folder and timed the operation. It took 17
seconds. I repeated this operation several times, and consistently got
between 5 and 6 seconds (I assume because of cache effects, since the
client sends the same request each time).

I closed pine, shut down courier-imap, and started dovecot. Performing
the same tests on the same folder netted me a re-sorted folder in less
than 1 second every time. There was a very slightly longer delay
opening the folder the first time (I assume because of the time it
took to index the folder). Subsequent openings were faster in dovecot
than in courier.

Then I performed the tests back-to-back: courier then dovecot (and vice
versa), each time deleting the indexes. The results were fairly
consistent -- dovecot produced *no* delay opening a folder once the
index had been created, and *no* delay to sort the folder in any way I
chose. You'll notice the caveat, which I explained in the previous
paragraph.

An example of the IMAP session with courier-imap follows. Here, I am
sorting by THREAD.

    19:58:43.823813 read(0, "0000001a THREAD REFERENCES US-ASCII ALL\r\n", 8192) = 41
    19:58:55.474217 write(1, "* THREAD ((2)(3)(4 ....

As you can see, it took just under 12 seconds (about 11.65) to perform
the operation.

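For anyone who wants to check the arithmetic, the elapsed time can be
computed from the two timestamps in the trace above. This is only a
minimal Python sketch: the helper name elapsed_seconds is made up for
illustration, and it assumes both timestamps fall on the same day (no
wrap past midnight).

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two HH:MM:SS.ffffff timestamps on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# Timestamps copied from the trace: the read() of the THREAD command
# and the first write() of the server's THREAD reply.
request_time = "19:58:43.823813"
response_time = "19:58:55.474217"

print(f"{elapsed_seconds(request_time, response_time):.3f} s")
```

The same helper can be pointed at any other request/reply pair in a
trace of this form to compare the two servers command by command.
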
I performed these tests on several folders. The fact of the matter is
that on identical hardware, with identical mailboxes, dovecot was
faster, sometimes much much faster, than courier-imap. Searching and
sorting are the two easiest ways of experiencing this difference. The
client used was pine.

> People do not use IMAP servers to run benchmarks. People use IMAP servers
> to read mail. All IMAP-reading mail clients that might be considered
> popular will cache all message metadata. When you're scrolling through the
> folder's index the IMAP client is not going to issue a server request for
> every new message that's scrolled into view. All of the message metadata
> will be cached. So if the mail client wants to resort the folder it won't
> ask the server to do it, it'll do the job itself.

I have as much healthy disdain for synthetic benchmarks as the next guy.

> So, if you want to evaluate indexing you need to take a reasonably
> popular IMAP client, log its IMAP commands, then show how indexing
> will help. Arbitrary benchmarks won't cut it, and adding indexes for
> the benefit of a lesser-used IMAP tool will come at the expense of
> greater overhead for the rest of the IMAP clients, which makes no
> sense.

If you consider pine a 'lesser-used' IMAP tool, then what you are
saying in effect is that courier-imap is not suitable for use with
pine and large mailboxes.

Why does adding an index necessarily create more overhead? (The real
question here is whether the additional overhead results in a more
efficient or faster time-to-response on client queries.) Certainly
there is overhead in maintaining an index, but the purpose of an index
is faster data retrieval. It stands to reason that having an index
would benefit even very sophisticated IMAP clients.

One of the reasons IMAP exists is because people wanted *less* storage
on the client end, *less* client-side state from
invocation-to-invocation.
The overwhelming majority of IMAP clients that I'm aware of don't
store much, if anything at all, between invocations. (This is clearly
at odds with your statement: "All IMAP-reading mail clients that might
be considered popular will cache all message metadata.") Certainly,
some do, but for many people it's simply not practical to have to use
the same client on the same machine all the time.

It may be true that Outlook and friends do cache metadata, but pine
doesn't, and pine is very popular, too. Judging by this mailing list's
User-Agent and X-Mailer strings, Mozilla is the most popular, trailed
by outlook, mutt, sylpheed, kmail, imp, squirrelmail, and even Apple
Mail. Pine sets neither of those headers, but out of 1550 messages I
counted 182 with pine in the Message-ID (just under 12%). Hardly
scientific.

It's not my intention to start a war here, but I can't ignore my
experiences. I was wondering whether you (Sam) have any intention of
adding indexes, in the future or ever at all, and if not, whether that
decision is set in stone. I'm also not going to switch any time soon,
but for some people this is a really big issue.

--
Ensign Walnut approaches Dr. Crusher with caution...

Jon Nelson <[EMAIL PROTECTED]>
C and Python Code Gardener

_______________________________________________
courier-users mailing list
[EMAIL PROTECTED]
Unsubscribe: https://lists.sourceforge.net/lists/listinfo/courier-users