On Fri, Jul 15, 2005 at 02:02:27PM -0400, Geo Carncross wrote:

> Won't work. As much as it seems like this would be a good idea (and
> believe me: about half a dozen people on this list have had it, so
> it certainly is a good idea. better still, don't believe me, check
> the archive yourself :) )

Thanks.  I've now gone through a bunch of archived messages looking to
understand this.

Suggestion in short: why not keep track of which messages have been
seen by each server, and if a server senses a potential issue
(ie. after network problems are fixed, or whatever) correct it then?
[More detail later in the message.]

As I understand it, the requirements are like so:


(R0) Dropping email is not acceptable.

(R1) Follow RFC: UIDs must be 32-bit values.

(R2) Follow RFC: UIDs must monotonically increase.  In particular, a
     replication that results in messages with UIDs that precede a UID
     value reported to a user is bad, and must be corrected.

(R3) Each server in a replicating cluster should be read-write,
     ie. multimaster updates should be allowed.

(R4) Multimaster should work gracefully in failure scenario where
     client and internet can reach each server, but servers cannot
     reach each other.

Assumptions:

(A0) Duplicating email (ie. causing the user to see it a second time)
     isn't so bad if it's infrequent.

(A1) Users start new IMAP session relatively infrequently -- a new
     connection is not established within seconds of a previous
     connection.

(A2) Users should have an affinity to a particular server, and should
     only switch/be switched to another server in the event of a
     failure.

(A3) The most likely mode of failure is that a server is unreachable
     by a user.  This is the scenario that should be most engineered
     to not duplicate mail.

(A4) Loss of connectivity between two or more mail servers while
     both/all servers are still visible to the same users is a rare
     occurrence.  Mail server should meet minimal requirements (ie. not
     drop mail) but some duplicate mail in this case is acceptable.
     This can be mitigated by giving users an affinity for a server.


Suggestion in detail:


(S0) Each server in a multimaster cluster is assigned a unique
     server_id.  If multimaster isn't necessary, server_id for all
     servers is 0.

(S1) Locally generate UIDs in a way that is globally unique
     (ie. splitting message sequence count using a local_sequence *
     num_servers + server_id type scheme, or some similar method.)

(S2) Each server keeps a replicated table "high_saved" of the last
     locally-generated UID it's saved.  Index by mailbox and
     server_id.

(S3) Each server keeps a replicated table "high_reported" of the last
     UID it's reported to the client.  Index by mailbox and server_id.

(S4) Each server keeps a replicated table, "process_message_UID", that
     is basically a message to each other server to make sure the UID
     is OK.  Index by mailbox, remote server_id, UID.

(S5) Each time an email arrives at a server (via SMTP or IMAP, not via
     replication), the server generates a new UID using scheme from
     part (S1) that is greater than any value currently in high_saved,
     for any server.  Then, for each server_id other than itself, it
     creates a row in process_message_UID.  Then, update high_saved
     with the new UID.

(S6) Periodically, as a maintenance thread, each server checks
     process_message_UID for any message sent to its server_id.

     LOOP foreach message: if the message UID is lower than the last
     reported UID known for this server and mailbox, change the UID of
     the email to a new UID as per (S1) and (S5), and delete message
     from process_message_UID.  If greater than or equal, delete
     message from process_message_UID without taking an action.

(S7) When the user client connects and wants the last UID, first
     perform step (S6).

     When done processing all messages in the process_message_UID
     queue for that server_id, report new high UID to user and update
     high_reported.
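
To make steps (S1) and (S5)-(S7) concrete, here is a rough Python
sketch.  It is only an illustration under assumptions: the tables are
plain in-memory dicts standing in for one server's local view of the
replicated tables, replication is simulated by hand, and every name in
it (new_db, next_uid, deliver, process_pending, report_high_uid) is
made up for this example.

```python
def new_db(num_servers):
    """One server's local view of the replicated tables (hypothetical layout)."""
    return {
        "num_servers": num_servers,
        "high_saved": {},     # (S2): (mailbox, server_id) -> last UID saved
        "high_reported": {},  # (S3): (mailbox, server_id) -> last UID reported
        "pending": set(),     # (S4): (mailbox, target server_id, uid) rows
        "messages": {},       # (mailbox, uid) -> message body
    }

def next_uid(server_id, num_servers, floor):
    """(S1) Smallest UID above floor that falls in server_id's slice."""
    local_sequence = (floor - server_id) // num_servers + 1
    return local_sequence * num_servers + server_id

def _fresh_uid(db, mailbox, server_id, at_least=0):
    """(S5) Mint a UID above every known high_saved value and notify peers."""
    saved = [uid for (mb, _), uid in db["high_saved"].items() if mb == mailbox]
    uid = next_uid(server_id, db["num_servers"], max(saved + [at_least]))
    for other in range(db["num_servers"]):
        if other != server_id:
            db["pending"].add((mailbox, other, uid))
    db["high_saved"][(mailbox, server_id)] = uid
    return uid

def deliver(db, mailbox, server_id, body):
    """(S5) A message arrives via SMTP or IMAP (not via replication)."""
    uid = _fresh_uid(db, mailbox, server_id)
    db["messages"][(mailbox, uid)] = body
    return uid

def process_pending(db, mailbox, server_id):
    """(S6) Renumber peer messages whose UID is below what we already reported."""
    reported = db["high_reported"].get((mailbox, server_id), 0)
    mine = sorted(r for r in db["pending"] if r[0] == mailbox and r[1] == server_id)
    for row in mine:
        uid = row[2]
        if uid < reported and (mailbox, uid) in db["messages"]:
            new_uid = _fresh_uid(db, mailbox, server_id, at_least=reported)
            db["messages"][(mailbox, new_uid)] = db["messages"].pop((mailbox, uid))
        db["pending"].discard(row)

def report_high_uid(db, mailbox, server_id):
    """(S7) Run (S6) first, then report the high UID and record it."""
    process_pending(db, mailbox, server_id)
    high = max((uid for (mb, uid) in db["messages"] if mb == mailbox), default=0)
    db["high_reported"][(mailbox, server_id)] = high
    return high

# A race as seen from server B (server_id 0 here); A is server_id 1.
b = new_db(num_servers=2)
first = deliver(b, "INBOX", 0, "arrived at B")   # B mints UID 2
seen = report_high_uid(b, "INBOX", 0)            # client is told 2
# Replication now brings in A's message, which A numbered 1 on its own.
b["messages"][("INBOX", 1)] = "arrived at A"
b["high_saved"][("INBOX", 1)] = 1
b["pending"].add(("INBOX", 0, 1))
after = report_high_uid(b, "INBOX", 0)           # A's message is renumbered to 4
```

With num_servers set to 1, next_uid reduces to floor + 1 and the
pending set stays empty, so the single-server case does no extra work.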

Examples/scenarios/analysis:

(E1) Single server, or multiple servers with a single master.  Step
     (S1) degenerates into a simple sequence.  Step (S5) does the
     same.  Steps (S6) and (S7) are basically skipped, since there are
     no other servers to exchange messages with, so the loops are
     empty.  So the server does no additional heavy lifting.

(E2) Multimaster load sharing, communication OK between servers: all
     messages assigned UIDs uniquely.  In general, UIDs will increase,
     but under some race conditions, a server will perceive a UID to
     step back due to replication.  If no client actually asked about
     UIDs during the race condition, no action is taken.  But suppose
     a user times things "just right": message with UID N arrives on
     server A, message with UID N+1 arrives on server B, the user
     queries B and gets N+1, and only then is A's message replicated
     to B, along with the news that A has a message with UID N for B.
     B should auto-sense the problem (the next time the user queries
     B, or the next time B does its maintenance check) and B should
     update the UID to something beyond the current known max.  So, if
     the second client query is to B, B corrects the UID before
     answering.  If the second client query is to A, the server will
     configuration is such that clients prefer their last server or a
     certain server, the second client query is more likely to go to
     server B, which is better.

     Note #1: User will sometimes seem to have a duplicate email.
     Since duplicate email is more acceptable than lost email, this
     should be acceptable in most environments.

     Note #2: if clients restart sessions very often in this scenario,
     it's possible to have thrashing.  But under normal conditions,
     ie. where new sessions are relatively rare, this should be a
     relatively rare occurrence.

(E3) Multimaster, connection breaks, user can only reach one server:
     until the connection breaks, communication is the same as in
     scenario (E2).  Once the connection breaks, each server is
     generating UIDs locally without being aware that the other server
     is assigning them as well.  If the user can only connect to one
     server (ie. user is at a WAN site, WAN site has local server "A",
     WAN connection is down) the user can continue to send and receive
     mail using the local server.  Server "A" will update
     high_reported appropriately.  Remote server "B" may continue to
     receive mail for the user, but high_reported will not be updated.
     When connectivity is restored, so long as user continues (for the
     short term) to use the same server, no duplicate email should
     result.

(E4) Multimaster, connection between servers breaks, user can reach
     both: until the connection breaks, communication is the same as
     in scenario (E2).  If the user communicates with both servers,
     each server will independently increase reported UID.  When
     connection is reestablished, one or both servers will reassign
     UIDs to the other's email, resulting in apparently duplicate
     email.  Gotta break some eggs.  Can be mitigated if user has
     affinity for last mail server.

(E5) When connectivity is broken between servers and then is
     reestablished, there will be a while when the server is both
     catching up and receiving new email and IMAP connections.  There
     is potential here for duplicate email.  Can be mitigated if user
     has affinity for last mail server.

(E6) The process_message_UID table will be a choke point if you have a
     lot of servers.  This scheme is good for high availability, bad
     for scalability.
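
The partition scenarios (E3) and (E4) lean on (S1): servers that
cannot see each other must still never hand out the same UID.  A small
Python check of the example formula from (S1), plus how the 32-bit
limit from (R1) gets divided between servers.  The function names are
invented for this sketch, and it assumes UID 0 is not usable since
IMAP UIDs are nonzero.

```python
UID_MAX = 2**32 - 1  # (R1): UIDs are 32-bit values

def uid_for(local_sequence, num_servers, server_id):
    """The example formula from (S1)."""
    return local_sequence * num_servers + server_id

def uids_available(num_servers, server_id):
    """How many nonzero UIDs one server can mint before passing UID_MAX."""
    top = (UID_MAX - server_id) // num_servers  # largest usable local_sequence
    first = 1 if server_id == 0 else 0          # skip UID 0
    return top - first + 1

# Two servers cut off from each other, each counting on its own.
a = {uid_for(seq, 2, 0) for seq in range(1, 6)}  # 2, 4, 6, 8, 10
b = {uid_for(seq, 2, 1) for seq in range(0, 5)}  # 1, 3, 5, 7, 9
no_collision = a.isdisjoint(b)
```

Each server only gets its own slice of the UID space, and every
renumbering in (S6) uses up another value, so a large cluster runs
through the 32-bit range faster than a single server would.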

OK, I probably spent way too long thinking this through and working
scenarios.  Did I miss anything?

- Morty
