> In great news, the code running at FastMail is now fully rebased on top of
> 2.5. I'm really happy with the state of almost everything.

FYI: this is the current code running at FastMail:

https://github.com/brong/cyrus-imapd/tree/fastmail-2014

Top commit:

https://github.com/brong/cyrus-imapd/commit/2e730d25acfad53f510fd2d46928572c84d58409

I'm going to leave that there. With tons of rebases coming up in the early
new year to get master out, at least this is a point in time that I'm pretty
happy with :)

Bron.

> What's still to do is fixing the replication code. The same thing that has
> been an issue since forever. Maybe the best thing is to just revert to the
> 2.4 protocol, and ignore the new fields totally for the initial release.
> They will still replicate, just not be protected by the sync_crc.
>
> There are also fixes to pick back from the FastMail branch. For the past few
> weeks I've been focused on getting things ready for the carddav release, so
> not so much on having them upstream maintainable.
>
> I am really sorry to everyone about the state of unix hierarchy separator and
> alt namespace stuff. Well-meaning but misguided fixes have just made it
> worse. It's exactly the same problem that every web programmer deals with -
> you need to "entity encode" exactly once. I have the correct fix for this in
> progress... basically it's this:
>
> 1) on disk/in database format changes so that the separator is a control
>    character (less than space, so there's no need for improved_mboxlist_sort)
> 2) in-memory format is ALWAYS a 'struct mboxname_parts' (short name:
>    mbname_t). This format is all individual strings, with the mailbox name
>    being a strarray_t, so no separators encoded in it.
> 3) the external format is the only thing that depends on the configuration.
>
> Along with this are major changes to how LIST works (yes, again) - this time
> with a serious eye to passing all of imaptest.org's tests.
>
> Rob M and I sat down the other day and created a giant whiteboard full of
> things that we want to see in Cyrus for the future.
> We are planning to employ somebody to work full time on this:
>
> https://www.fastmail.com/about/jobs/2015-01-cyrus.html
>
> Here's a typed up version of the list:
>
> * Unix HS and Alt Namespace => make consistent (see above)
> * mailboxes.db format:
>   * U[]foo.bar[]Sub[]Folder (for user namespace)
>   * S[]shared[]folder (for shared namespace) - so that user NS isn't a
>     sub-part of shared NS, speeds up listing.
>   * domains as part of user: U[]foo....@domain.com[]Trash
>   * $ => version key for tracking contents of mailboxes.db - always read at
>     startup (we use the same trick in conversations.db)
> * FAST reverse ACL map:
>   * U:$userid => folders with ACLs
>   * G:$groupname => folders with ACLs
>   * combine those folders, eliminate common prefixes, search just those
>     prefixes.
>   - Makes LIST fast, even on big servers/giant murders.
> * Mailbox on-disk paths == folder uniqueid
>   * fast, atomic rename - including multiple folders
>   * fix delayed_delete to just keep old uniqueid in mailboxes.db => no
>     DELETED. prefix
>   * fast undelete of entire folders
>   * store current mailbox name inside cyrus.header for reconstruct
>   * only works now that we store uniqueid in mailboxes.db (DLIST format)
> * Sieve standards support => vacation time period, etc. Also check other
>   features for latest standard compatibility, e.g. imap4flags
> * per-message annotations: change format to be more like cyrus.cache: offset
>   based, MVCC updatable such that QRESYNC and QUOTA are reliable.
> * UNIFIED MURDER + sync:
>
> **** THIS IS THE BIG ONE ****
>
> I have dreamed of this forever. It's a giant job. Basically store multiple
> locations in mailboxes.db for a folder. This combines replication with
> murder, and sync_client needs a manager so that you can create arbitrary sync
> patterns.
>
> Sub parts:
> * sync_server in imapd (Ken's XFER-sync work ported from 2.4)
> * generic change-log system (sync_log, squatter log, etc from current
>   FastMail code, plus extras)
> * sync_client manager that reads.
>
> * central cleanup task:
>   * instead of running repack/cleanup/etc at mailbox_close, we log that it's
>     needed and let the current task continue.
>   * a background daemon tries (non-blocking lock) to pick up the exclusive
>     lock to do the repack, meaning that clients never pay the delay
>     themselves. Also fits with:
> * short-locks for unlink
>   * at the moment, we take an exclusive lock for the ENTIRE time that we're
>     unlinking deleted messages from a folder. That can be quite slow, because
>     unlink is slow on most filesystems. We need the exclusive lock to ensure
>     no other task still expects to be able to read the file... BUT, we only
>     need the exclusive lock for a moment to ensure nobody else held the lock
>     over this time. We can release it straight away and know that the files
>     which were seen with FLAG_UNLINKED during the lock can be safely deleted,
>     because nobody can remember them as existing any more.
>
> * sync-state cache
>   * right now, we always query the replica for the current mailbox state
>     before sending a SYNC APPLY. In the general case, the replica won't have
>     changed since the last sync. We could cache the remote state in a local
>     database, and send an optimistic apply. If the old state hasn't changed,
>     the apply could happen immediately. Along with optimistic reserve, we can
>     apply changes in a single round trip, instead of the current 3.
>   * change sync_client to do partial user sync rather than grouping mailboxes
>     across users - means a single lock for user-level database updates
>     (calendar sync-token, conversations, etc)
>
> * Conversations mark 2 - FastMail have plans to fix our conversations
>   implementation to be better, then push that upstream.
>   There's work underway to standardise THRID and MSGID the way that Gmail
>   do it, and our conversations would be compatible.
>
> * Search:
>   - get the existing Xapian stuff upstreamed.
>   - external provider support: e.g. elastic search.
>
> * Archive:
>   - FastMail supports archiving parts of the mailbox to a different disk.
>     It's how we keep the first week's email on SSD while storing older emails
>     on big slow SATA.
>   - Make this more general and allow storing old email to a central object
>     store, so indexes are replicated and emails are stored in a separate
>     replicated system.
>
> * Backups
>   - backup format based on replication protocol
>   - optional inline blobs for the rfc822 messages or index them separately
>
> * JMAP (http://jmap.io/) support directly in Cyrus
>
> * Sane Restart/Failover process.
>
> * Nginx authentication backend
>
> This is actually really awesome with the unified murder above. You could run
> an nginx non-blocking proxy on every frontend, which uses the mailboxes.db to
> find the correct backend for the user, then proxies their connection to the
> right server. This then means that you don't have tons of processes running
> on the frontends that are just proxying to another full-weight imapd, but you
> get the advantages of murder too - since it's unified, the backends have the
> full mailboxes.db and can connect through to other backends directly for
> shared folders which aren't on the same machine.
>
> I have ideas around backend failover and handover through nginx as well, but
> they are longer term dreams...
>
> So there's tons of work to go on with :)
>
> Bron.
>
> --
> Bron Gondwana
> br...@fastmail.fm

--
Bron Gondwana
br...@fastmail.fm
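
For anyone trying to picture the three name formats from the quoted plan
(on-disk key, in-memory parts, external string), here is a rough Python
sketch. It is an illustration only, not Cyrus code: MboxName and the four
helper functions are made-up names standing in for mbname_t and friends,
U+001F is an arbitrary pick for "a control character less than space", and
the U/S prefixes follow the proposed mailboxes.db layout. It also assumes no
folder component contains the external separator, which the real code would
have to deal with.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: any control character below space would do; 0x1F is arbitrary.
INTERNAL_SEP = "\x1f"


@dataclass
class MboxName:
    """Hypothetical stand-in for mbname_t: separate strings, no separators."""
    userid: Optional[str] = None                     # None => shared namespace
    boxes: List[str] = field(default_factory=list)   # folder path components


def to_internal(name: MboxName) -> str:
    """Build the on-disk key: U<sep>user<sep>... or S<sep>..."""
    head = ["U", name.userid] if name.userid is not None else ["S"]
    return INTERNAL_SEP.join(head + name.boxes)


def from_internal(key: str) -> MboxName:
    """Parse an on-disk key back into its parts."""
    parts = key.split(INTERNAL_SEP)
    if parts[0] == "U":
        return MboxName(userid=parts[1], boxes=parts[2:])
    return MboxName(userid=None, boxes=parts[1:])


def from_external(text: str, sep: str, userid: Optional[str] = None) -> MboxName:
    """Decode a client-visible name exactly once, at the edge."""
    return MboxName(userid=userid, boxes=text.split(sep))


def to_external(name: MboxName, sep: str) -> str:
    """Encode for the client; the only step that depends on configuration."""
    return sep.join(name.boxes)


if __name__ == "__main__":
    name = from_external("Sub/Folder", "/", userid="foo.bar")
    print(repr(to_internal(name)))
    print(to_external(name, "."))
```

The point of the exercise: the separator only exists at the two edges, so a
dot inside a userid or folder name never gets confused with hierarchy, and
because the on-disk separator sorts below every printable character, a plain
byte-wise sort keeps a folder's children directly after it (assuming names
themselves contain no control characters).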