Mark,

Mark Crispin wrote:
On Tue, 19 Feb 2008, Bob Atkins wrote:
As always - thank you for your continuing efforts to improve what is already an excellent IMAP server.

Thank you!  [Today was not a good day...flat tire on my motorcycle on the way to work...]
Bummer - I hope it wasn't raining as well....

While we are having a group hug :-) I would like to thank the entire UW imapd user community for its support and encouragement over the years. You are essential partners in this effort.  It wouldn't have happened without you!
I'm feelin' the love...  :-)

Turning to how append in mix works today:
 . opens the destination mailbox (which has the effect of reading the
   metadata, index, and status file)
 . reads the index and status file a *second* time (keeping it locked
   this time)
 . appends the message(s) to the current data file
 . rewrites the entire metadata, index, and status file.

As you can see, there is a *lot* of fat which can (and should!) be trimmed.

An easy and obvious step would be to create a form of open that leaves the index and status files locked, so that the second read (and lock) is unnecessary.  That may reduce the append time by as much as 33%, while still being as conservative and cautious as possible.
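
As an illustration only, the "open that leaves the file locked" idea might look like the sketch below. It is not c-client code: the names open_locked and append_record are hypothetical, it uses POSIX flock() through Python's fcntl module, and it treats the file as an opaque byte string. The whole file is still rewritten (the conservative part); the point is only that the contents are read once, under the lock, and the lock is held until the rewrite is done.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def open_locked(path):
    """Open path, take an exclusive lock, read it once, keep the lock.

    Hypothetical helper for illustration; not part of c-client.
    """
    with open(path, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # blocks until the lock is granted
        try:
            yield f, f.read()              # the single read, done under the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)  # released only when the caller is done


def append_record(path, record):
    """Rewrite the whole file with record added, with no second read."""
    with open_locked(path) as (f, contents):
        f.seek(0)
        f.write(contents + record)         # still a whole-file rewrite
        f.truncate()
        f.flush()
        os.fsync(f.fileno())
        return len(contents) + len(record)
```

Because the lock is taken before the read, nothing can change the file between the read and the rewrite, which is what the second locked read guards against today.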

However, I think that a better approach is to be more aggressive and less cautious.

To do an append, it is not necessary to read anything other than the metadata and the sequences from the index and status.  Similarly, it is only necessary to rewrite the metadata, rewrite the sequences in the index and status, and append the new message data to the index and status. Everything else is fluff, done only to be "damn sure" that all is well. It was good when we weren't sure how reliable and robust mix was; but now it is a costly burden.
A couple of thoughts. On large email boxes, the .mixindex can be very large so reading the whole thing is 'expensive'.  My inbox's index file is 1.2M for 16408 messages.

Why is it necessary to read the entire .mixindex before an update? Is there a possibility of just reading the last record so you know where it is at? Then you could just perform an append rather than a re-write to add the new record to the file. Same idea for the .mixstatus file. This would also substantially reduce the lock time and delays for another process that needs to access the box.
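
To make the question concrete, here is a rough sketch of what I have in mind. It assumes the file is a series of newline-terminated text records, which is a simplification and not the actual .mixindex layout; the function names are made up for the example, and locking is left out to keep it short.

```python
import os


def read_last_record(path, chunk=4096):
    """Return the final newline-terminated record without reading the whole file."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        tail = b""
        while pos > 0:
            step = min(chunk, pos)
            pos -= step
            f.seek(pos)
            tail = f.read(step) + tail       # grow the tail backwards from the end
            body = tail.rstrip(b"\n")        # ignore the terminating newline
            cut = body.rfind(b"\n")          # newline that ends the previous record
            if cut != -1:
                return body[cut + 1:]
        return tail.rstrip(b"\n")            # file holds zero or one record


def append_record(path, record):
    """Add one record at the end of the file instead of rewriting it."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()
        os.fsync(f.fileno())
```

With a 1.2M index, this touches one chunk at the tail of the file rather than all 16408 records, so the cost of a delivery no longer grows with the size of the mailbox.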

I need to satisfy myself that this can all be done without significantly impacting reliability and robustness.  One of the lessons that I learned with mbx format was that excessive reliance upon update-mode I/O can cause unpredictable problems.  The only update-mode I/O in mix today is when burping expunged messages from the data files; the metadata, index, and status files are always whole-file read and written.  That's excessively cautious, and certainly can be relaxed.
In theory, if locks are being respected there should be little chance of corruption - particularly if you are only doing appends when delivering new messages.

So, there's where we are now, and that's where we're going.  Fixing the append-to-large-mailbox performance issue is at the top of my list.

After that, I'm probably going to be looking at caching message parses, much like what is done now for sortcaching.  This will be a big boost to users of some clients, including (I hope) Pine and Alpine.

-- Mark --

http://panda.com/mrc
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.


--
Bob Atkins  
President/CEO

DigiLink, Inc.
Business Inter-net-working
The Cure for the Common ISP!

Phone: (310) 577-9450
Fax: (310) 577-3360
eMail: [EMAIL PROTECTED]

 

_______________________________________________
Imap-uw mailing list
[email protected]
https://mailman1.u.washington.edu/mailman/listinfo/imap-uw
