As http://wiki.dovecot.org/NFS describes, the main problem with NFS has always 
been caching. One NFS client changes two files, but another NFS client 
sees only one of the changes, which Dovecot then assumes is caused by 
corruption.

The recommended solution has always been to direct a given user to only a 
single server at a time. The user doesn't have to be permanently assigned 
there, but as long as a server has some of the user's files cached, it should 
be the only server accessing the user's mailbox. Recently I was thinking about 
a way to make this possible with an SQL database: 
http://dovecot.org/list/dovecot/2010-January/046112.html

The company here in Italy didn't really like that idea, so I thought about 
making it more transparent and simpler to manage. The result is a new 
"director" service, which does basically the same thing, except without an SQL 
database. The idea is that your load balancer can redirect connections to one 
or more Dovecot proxies, which then figure out internally where the user should 
go. So the proxies act kind of like a secondary load balancer layer.

When a connection from a newly seen user arrives, it gets assigned to a mail 
server according to a function:

  host = vhosts[ md5(username) mod vhosts_count ]

This way all of the proxies assign the same user to the same host without 
having to talk to each other. The vhosts[] is basically an array of hosts, 
except each host is initially listed there 100 times (vhost count=100). This 
vhost count can then be increased or decreased as necessary to change the 
host's load, probably automatically in the future.
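
As a rough illustration (not the actual director code), the assignment 
function above could be sketched in Python like this. The names build_vhosts 
and pick_host, the example IPs, the sorted host order and the choice to read 
the MD5 digest as a big-endian integer are all assumptions made for the 
example:

```python
import hashlib


def build_vhosts(vhost_counts):
    """Expand {host: vhost_count} into a flat list, each host repeated."""
    vhosts = []
    # Sorted so that every proxy builds the array in the same order
    # (an assumption; any order works as long as all proxies agree).
    for host in sorted(vhost_counts):
        vhosts.extend([host] * vhost_counts[host])
    return vhosts


def pick_host(username, vhosts):
    """host = vhosts[ md5(username) mod vhosts_count ]"""
    digest = hashlib.md5(username.encode("utf-8")).digest()
    # Assumption: treat the 16-byte digest as one big-endian integer.
    index = int.from_bytes(digest, "big") % len(vhosts)
    return vhosts[index]


# Hypothetical hosts: the second one gets half the vhosts, so roughly
# half as many users hash to it.
vhosts = build_vhosts({"10.0.0.1": 100, "10.0.0.2": 50})
host = pick_host("alice@example.com", vhosts)
```

Because the result depends only on the username and the vhosts array, two 
proxies with the same array pick the same host without exchanging anything.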

The problem is then of course that if (v)hosts are added or removed, the above 
function will return a different host than was previously used for the same 
user. That's why there is also an in-memory database that keeps track of 
username -> (hostname, timestamp) mappings. Every new connection from the user 
refreshes the timestamp. Existing connections also refresh the timestamp every 
n minutes. Once all connections are gone, the timestamp expires and the user is 
removed from the database.
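
A minimal sketch of such a database, assuming a plain dict and a fixed timeout. 
The class name UserDirectory, its methods and the injectable clock are 
hypothetical and only there to show the idea:

```python
import time


class UserDirectory:
    """In-memory username -> (hostname, timestamp) map with expiry."""

    def __init__(self, timeout_secs, clock=time.monotonic):
        self.timeout_secs = timeout_secs
        self.clock = clock  # injectable so the expiry logic can be tested
        self.users = {}

    def lookup(self, username, pick_host):
        """Return the user's host, assigning one if the user is unknown
        or expired. Every call refreshes the timestamp."""
        now = self.clock()
        entry = self.users.get(username)
        if entry is not None and now - entry[1] < self.timeout_secs:
            host = entry[0]  # keep the old host even if vhosts changed
        else:
            host = pick_host(username)
        self.users[username] = (host, now)
        return host

    def expire(self):
        """Drop users whose timestamp is too old; return their names."""
        now = self.clock()
        expired = [user for user, (_, stamp) in self.users.items()
                   if now - stamp >= self.timeout_secs]
        for user in expired:
            del self.users[user]
        return expired
```

The point of the lookup order is that a known user keeps the previously 
assigned host, and the hash function is consulted only for new or expired 
users.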

The final problem then is how multiple proxies synchronize their state. The 
proxies connect to each other, forming a connection ring. For example with 4 
proxies the connections would go like A -> B -> C -> D -> A. Each time a user 
is added/refreshed, a notification is sent in both directions in the ring (e.g. 
B sends to A and C), which in turn forward it until it reaches a server that 
has already seen it. This way if a proxy dies (or just hangs for a few 
seconds), the other proxies still get the changes without waiting for it to 
time out. Host changes are replicated in the same way.
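
To show why sending in both directions helps, here is a toy simulation of the 
forwarding rule. It is only a sketch: the function propagate and the "dead" 
set are made up for the example, there is no real networking, and a dead proxy 
is simply modelled as one that never receives or forwards anything:

```python
def propagate(ring, origin, dead=frozenset()):
    """Return the set of proxies that learn an update sent by `origin`.

    `ring` lists the proxies in connection order; the last one connects
    back to the first.
    """
    count = len(ring)
    start = ring.index(origin)
    seen = {origin}
    for step in (+1, -1):  # notify both neighbours
        i = (start + step) % count
        # Each proxy forwards in the same direction until the update
        # reaches one that has already seen it (or one that is dead).
        while ring[i] not in seen and ring[i] not in dead:
            seen.add(ring[i])
            i = (i + step) % count
    return seen


ring = ["A", "B", "C", "D"]
everyone = propagate(ring, "B")              # {"A", "B", "C", "D"}
without_c = propagate(ring, "B", dead={"C"})  # {"A", "B", "D"}
```

In the second call the update cannot pass through C, but D still receives it 
via A from the other direction.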

It's possible that two connections from a user arrive at different proxies 
while (v)hosts are being added/removed. It's also possible that only one of the 
proxies has seen the host change. So the proxies could redirect users to 
different servers during that time. This can be prevented by doing a ring-wide 
sync, during which all proxies delay assigning hostnames to new users. This 
delay shouldn't be too bad because a) syncs should happen rarely, b) they 
should be over quickly, c) users already in the database can still be 
redirected during the sync.
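
The per-proxy behaviour during a sync could look roughly like the following. 
This is a guess at the shape of the logic, not the real implementation; the 
class SyncingDirector and its method names are invented for the example, and 
how the ring agrees that a sync has started or finished is left out:

```python
class SyncingDirector:
    """One proxy's view: known users are served, new ones wait for sync."""

    def __init__(self, known_users=None):
        self.known_users = dict(known_users or {})  # username -> hostname
        self.syncing = False
        self.pending = []  # new users that arrived during the sync

    def begin_sync(self):
        self.syncing = True

    def handle_login(self, username, pick_host):
        """Return a host, or None if the login has to wait for the sync."""
        if username in self.known_users:
            return self.known_users[username]  # point c): no delay
        if self.syncing:
            self.pending.append(username)
            return None
        host = pick_host(username)
        self.known_users[username] = host
        return host

    def finish_sync(self, pick_host):
        """End the sync and assign hosts to the users that were waiting."""
        self.syncing = False
        waiting, self.pending = self.pending, []
        return {user: self.handle_login(user, pick_host)
                for user in waiting}
```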

The main complexity here comes from how to handle proxy server failures in 
different situations. Those are less interesting to describe and I haven't yet 
implemented all of them, so let's just assume that in the future it all works 
perfectly. :) I was also thinking about writing a test program to simulate 
director failures to make sure it all works.

Finally, there are the doveadm commands that can be used to:

1) List the director status:
# doveadm director status
mail server ip  vhosts  users
11.22.3.44      100     1312
12.33.4.55      50      1424

2) Add a new mail server (defaults are in dovecot.conf):
# doveadm director add 1.2.3.4

3) Change a mail server's vhost count to alter its connection count (also works 
when adding a server):
# doveadm director add 1.2.3.4 50

4) Remove a mail server completely (because it's down):
# doveadm director remove 1.2.3.4

If you want to slowly move users away from a specific server, you can set its 
vhost count to 0 and wait for its user count to drop to zero. If the server is 
still working when "doveadm director remove" is called, new connections from 
the users on that server go to other servers while the old ones are still being 
handled.
