Now that MOVE is settling down, I really want to explore a vendor-neutral DUMP and RESTORE format for IMAP mail stores.

I haven't given up on the IMAP5 concept - though in discussion with Arnt recently we have both come to the conclusion that another stateful long-lived protocol is probably not right. What would be more useful to those of us trying to build large clustered solutions is a protocol which is more batch based, so it could be used in a connectionless way, without needing a complex stateful proxy in the middle. I'll come back to that.

Anyway - DUMP/RESTORE and Replication.

David (cc'd) did the initial work for replication in Cyrus, and I've built on that with a new protocol in Cyrus 2.4. It has some warts though - it exposes far too many Cyrus implementation specifics.

I want to use the same concepts for a backup format which is durable, incremental, and can be restored in such a way that a client which connects again later cannot tell that it's connecting to a restored copy of the mailbox rather than the original mailbox.

I would also change our replication protocol for Cyrus to be based on making synthetic incremental backups and applying them to the replica - so that in theory any server which supported the DUMP/RESTORE format would be able to be a replication source or destination.

Why do I think this is useful?
==============================

For users:
* ability to move between mail providers more easily (just import and export mailboxes)

For providers:
* ability to switch server software.
* ability to take efficient incremental backups which can be restored exactly as taken (at least in the case of Cyrus, this doesn't currently exist - you can't back up an entire mailbox at a point in time without having to reconstruct afterwards to clean up the mess - it doesn't do snapshots)

For us at FastMail/Opera:
* replace our custom-built Cyrus backup tool, which does under-the-hood locking magic, with something standard.

How Cyrus Replication works:
============================

We take advantage of the MODSEQ values from CONDSTORE/QRESYNC to work out what needs to be sent to bring the replica up to date.

At FastMail we test our replicas once per week by running a set of IMAP queries against both the master and replica (repeating multiple times in case of mismatch so we don't get too many false positives). These queries are designed to exercise all the interesting fields, so that we know the replication protocol is working correctly.

To ensure integrity, we also calculate a "SYNC_CRC" value. This is the XOR of a CRC32 for each UID in the mailbox. The CRC32 for the individual UIDs is calculated on a string formatted from all the mutable fields of the message. This allows efficient updates, because you can just calculate the old and new values on each change, and XOR them both into the current SYNC_CRC to get the new SYNC_CRC value.

In the case of a SYNC_CRC mismatch, we download all the metadata (similar to FETCH 1:* (UID FLAGS INTERNALDATE MODSEQ)) for every record and calculate what changes need to be made on the replica to bring it back into sync. This is where UID promotion happens if the content of the message is different.
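
The SYNC_CRC scheme described above can be sketched roughly as follows. This is only an illustration in Python, not the Cyrus code: the field list and the string layout inside record_crc() are assumptions made up for the example (the real format string is not shown in this mail), and the function names are hypothetical. What it does show is the XOR property: a changed record can be folded into the running value without rescanning the mailbox.

```python
import zlib


def record_crc(record):
    """CRC32 over a string built from one message's mutable fields.

    The layout below is an assumption for illustration only; it is not
    the format Cyrus actually uses.
    """
    flags = " ".join(sorted(record["flags"]))
    text = "{uid} {modseq} {last_updated} ({flags}) {internaldate} {guid}".format(
        uid=record["uid"],
        modseq=record["modseq"],
        last_updated=record["last_updated"],
        flags=flags,
        internaldate=record["internaldate"],
        guid=record["guid"],
    )
    return zlib.crc32(text.encode("utf-8"))


def sync_crc(records):
    """XOR of the per-record CRC32 values; an empty mailbox gives 0."""
    crc = 0
    for record in records:
        crc ^= record_crc(record)
    return crc


def update_sync_crc(crc, old=None, new=None):
    """Incremental update: XOR out the old record, XOR in the new one.

    Pass old=None for an append and new=None for a removal.
    """
    if old is not None:
        crc ^= record_crc(old)
    if new is not None:
        crc ^= record_crc(new)
    return crc


if __name__ == "__main__":
    # Values borrowed from the wire example in this mail.
    msg1 = {"uid": 1, "modseq": 4, "last_updated": 1354488686,
            "flags": ["\\Flagged"], "internaldate": 1268029091,
            "guid": "196922b6d822b618c665874fb523b9058a0adb56"}
    msg2 = {"uid": 2, "modseq": 5, "last_updated": 1354488687,
            "flags": [], "internaldate": 1354488686,
            "guid": "ec4a76ae5e5f772dee837494134c75069286623a"}
    crc = sync_crc([msg1, msg2])
    # A flag change on UID 2 only touches that record's contribution.
    msg2_seen = dict(msg2, flags=["\\Seen"], modseq=6)
    print(update_sync_crc(crc, msg2, msg2_seen) == sync_crc([msg1, msg2_seen]))
```

Because XOR is its own inverse and order-independent, both ends can keep the value current on every change and compare a single integer after each sync. The numbers this sketch produces will not match the SYNC_CRC values in the real wire traffic, since the string layout here is invented.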

We do mismatch detection with a GUID field (actually DIGEST.SHA1 on the RFC822 body).

Here's an example from the wire:

<1354488670<COMPRESS DEFLATE
>1354488670>OK DEFLATE active
<1354488670<SET OPTIONS %(CRC_VERSION 2)
>1354488670>OK success

Creating two mailboxes:

<1354488685<GET MAILBOXES (user.foo user.foo.subdir)
>1354488685>OK success
<1354488685<APPLY MAILBOX %(UNIQUEID fa8673d7-471e-4d5b-9c7f-f3ee64106447 MBOXNAME user.foo LAST_UID 0 HIGHESTMODSEQ 2 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0 POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1354488684 PARTITION default ACL "foo lrswipkxtecdan admin lrswipkxtecd " OPTIONS P SYNC_CRC 0 RECORD ())
>1354488685>OK success
<1354488685<APPLY MAILBOX %(UNIQUEID b2381db2-2e89-4a35-92d5-5ce55bf9fc4d MBOXNAME user.foo.subdir LAST_UID 0 HIGHESTMODSEQ 1 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0 POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1354488684 PARTITION default ACL "foo lrswipkxtecdan " OPTIONS P SYNC_CRC 0 RECORD ())
>1354488685>OK success

Appending two messages. First we query the remote end to see what's there now:

<1354488687<GET MAILBOXES (user.foo)
>1354488687>* MAILBOX %(UNIQUEID fa8673d7-471e-4d5b-9c7f-f3ee64106447 MBOXNAME
>user.foo LAST_UID 0 HIGHESTMODSEQ 2 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 0
>POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1354488684 PARTITION default
>ACL "foo lrswipkxtecdan admin lrswipkxtecd " OPTIONS P SYNC_CRC 0)
OK success

Then we try to reserve those messages (this is a waste of time really, it shouldn't bother trying, that's a bug):

<1354488687<APPLY RESERVE %(PARTITION default MBOXNAME (user.foo) GUID (196922b6d822b618c665874fb523b9058a0adb56 ec4a76ae5e5f772dee837494134c75069286623a))
>1354488687>* MISSING (196922b6d822b618c665874fb523b9058a0adb56
>ec4a76ae5e5f772dee837494134c75069286623a)
OK success

<1354488687<APPLY MESSAGE (%{default 196922b6d822b618c665874fb523b9058a0adb56 92}
From: test <t...@example.com>
To: test <t...@example.com>

Some stuff in the body...
.
%{default ec4a76ae5e5f772dee837494134c75069286623a 372}
Return-Path: <br...@brong.net>
Received: from local (slot2 [127.0.0.52])
    by test_slot2_4092 (Cyrus git2.5+0) with LMTPA;
    Sun, 02 Dec 2012 23:51:26 +0100
X-Sieve: CMU Sieve 2.4
From: test <t...@example.com>
To: test <t...@example.com>
Message-ID: <cmu-lmtpd-4251-1354488686-0@test_slot2_4092>
Date: Sun, 02 Dec 2012 23:51:26 +0100

Some stuff in the body...
)
>1354488687>OK success

And finally, now that the messages are spooled, we update the mailbox view.

<1354488687<APPLY MAILBOX %(UNIQUEID fa8673d7-471e-4d5b-9c7f-f3ee64106447 MBOXNAME user.foo LAST_UID 2 HIGHESTMODSEQ 5 RECENTUID 0 RECENTTIME 0 LAST_APPENDDATE 1354488687 POP3_LAST_LOGIN 0 POP3_SHOW_AFTER 0 UIDVALIDITY 1354488684 PARTITION default ACL "foo lrswipkxtecdan admin lrswipkxtecd hello lrswipkxtecd " OPTIONS P SYNC_CRC 31431471 RECORD (%(UID 1 MODSEQ 4 LAST_UPDATED 1354488686 FLAGS (\Flagged) INTERNALDATE 1268029091 SIZE 92 GUID 196922b6d822b618c665874fb523b9058a0adb56) %(UID 2 MODSEQ 5 LAST_UPDATED 1354488687 FLAGS () INTERNALDATE 1354488686 SIZE 372 GUID ec4a76ae5e5f772dee837494134c75069286623a)))
>1354488687>OK success

This is somewhat wasteful - it should be able to cache the remote state and speculatively calculate a diff against what it expects to be there and upload that as a single incremental dump. In the usual case that the other end hasn't changed, it could just apply that dump and return the successful result.

NOTE though: SYNC_CRC 31431471. The remote end calculates that after applying these changes, and returns "OK" because it matched.

You can also see plenty of the warts there - both the %() syntax and things like POP3_LAST_LOGIN and OPTIONS which are horribly vendor-specific.

Interesting bits to consider:
=============================

When backing up an entire user, you definitely want message de-duplication.
For FastMail, we delay EXPUNGE for a week, and also back up the EXPUNGEd messages so that our "restore from backup" feature can always find all messages EXPUNGEd in the last week, even if we lose a server. So this means there needs to be a way to cross-reference a message in another mailbox.

Cyrus uses SHA1 as the GUID, and uploads messages with that identifier first before applying the rest of the changes. For a standard I believe we don't want to use a hash that's already on its way out as the default, so we should look at an alternative to this.

There needs to be a way to read just the essential metadata from a backup file quickly (that is uidvalidity, lastuid, highestmodseq) to calculate which UIDs need their data included in the new backup, and which also need their message bodies or XREFs included - similar to the "GET MAILBOXES ()" query in the example above.

It also makes sense to have the format allow including an entire user's mailboxes rather than doing each mailbox individually, since the XREFs would otherwise be across backup files.

The format needs to both support every piece of data needed for every current extension, and be extensible enough that new extensions' data can be added. Things I can think of immediately are METADATA/ANNOTATION information, and what we in Cyrus call DELETEDMODSEQ - strictly, the MODSEQ of the last EXPUNGEd message for which you have forgotten the metadata. Without this, you can't efficiently reply to QRESYNC queries, because you need to tell the client about every gap in the UID sequence in case a message in there went away.

Thank you:
==========

If you've read this far, thank you!

My goal is a format which captures every piece of data which is required for any client connecting to the server to be unable to see that it's a different server than previously after a DUMP/RESTORE (assuming the server supports the same extensions of course).
Anything which can be re-parsed from the message RFC822 doesn't belong in this format, only fields which are mutable (like FLAGS), set externally (like INTERNALDATE), or necessary metadata about the past (like MODSEQ and friends).

The biggest question facing me up front - what does it look like on the disk/wire? The Cyrus protocol at the moment looks almost like IMAP, and parses almost like IMAP - with the added warts that it uses %() to designate a list with key/value pairs rather than a list of items, and it uses %{partition sha1 size} rather than {size+} to designate rfc822 messages. That is clearly bogus for a generally applicable protocol.

The list of fields to include is quite clear - the only real consideration is whether to support backups without MODSEQ information in them. They make incremental backups a lot harder, since you have to read all the UID records from the old backup and compare them to the current values to determine if anything has changed (like naive clients doing FETCH 1:*). I would like a backup format that can support ANY server though, and be built over regular IMAP by a standalone tool.

Regards,

Bron.

--
  Bron Gondwana
  br...@fastmail.fm