Hi Timo,

On 02.11.17 at 10:34, Timo Sirainen wrote:
> On 30 Oct 2017, at 11.05, Ralf Becker <[email protected]> wrote:
>> It has now happened twice that replication created folders and mails
>> in the wrong mailbox :(
>>
>> Here's the architecture we use:
>> - 2 Dovecot (2.2.32) backends in two different datacenters replicating
>>   via a VPN connection
>> - Dovecot directors in both datacenters talk to both backends with
>>   vhost_count of 100 vs 1 for local vs remote backend
>> - backends use proxy dict via a unix domain socket and socat to talk
>>   via tcp to a dict on a different server (kubernetes cluster)
>> - backends have a local sqlite userdb for iteration (also containing
>>   home directories, as iteration alone is not possible)
>> - serving around 7000 mailboxes in roughly 200 different domains
>>
>> Everything works as expected until the dict is not reachable, e.g. due
>> to a server failure or a planned reboot of a node of the kubernetes
>> cluster. In that situation it can happen that some requests are not
>> answered, even with Kubernetes running multiple instances of the dict.
>> I can only speculate what happens then: it seems the connection failure
>> to the remote dict is not correctly handled and leads to a situation in
>> which the last mailbox/home directory is used for the replication :(
> It sounds to me like a userdb lookup changes the username during a dict
> failure. Although I can't really think of how that could happen.

Me neither. Users are in multiple MariaDB databases on a Galera cluster.
We have no problems or unexpected changes there. The dict runs as
multiple instances, but that does not guarantee that no single request
fails.

> The only thing that comes to my mind is auth_cache, but in that case I'd
> expect the same problem to happen even when there aren't dict errors.
>
> For testing you could see if it's reproducible with:
>
> - get random username
> - do doveadm user <user>
> - verify that the result contains the same input user
>
> Then do that in a loop rapidly and restart your test kubernetes once in a
> while.

OK, I'll give that a try. It would be a lot easier than the whole
replication setup.

Ralf

--
Ralf Becker
EGroupware GmbH [www.egroupware.org]
Commercial register: HRB Kaiserslautern 3587
Managing directors: Birgit and Ralf Becker
Leibnizstr. 17, 67663 Kaiserslautern, Germany
Phone +49 631 31657-0
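
The test loop Timo describes above could be sketched roughly as follows. This is only a minimal sketch under assumptions: the list of usernames is supplied by the caller, the helper names (`doveadm_user`, `lookup_is_consistent`, `run_checks`) are made up for illustration, and the check is a plain substring match, since the exact output format of `doveadm user` is not given in this thread.

```python
import random
import subprocess
import time


def doveadm_user(user):
    """Run `doveadm user <user>` and return stdout and stderr as one string."""
    proc = subprocess.run(
        ["doveadm", "user", user],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout


def lookup_is_consistent(user, output):
    """True if the lookup output mentions the user that was asked for.

    Assumption: the username appears literally somewhere in the output.
    """
    return user in output


def run_checks(users, rounds, lookup=doveadm_user, delay=0.0):
    """Look up a random user `rounds` times and collect suspicious results."""
    suspicious = []
    for _ in range(rounds):
        user = random.choice(users)
        output = lookup(user)
        if not lookup_is_consistent(user, output):
            # Keep the raw output so a failed lookup can be told apart
            # from a lookup that returned a different user's data.
            suspicious.append((user, output))
        time.sleep(delay)
    return suspicious
```

Calling `run_checks(usernames, 10000)` while restarting dict nodes would return the `(user, output)` pairs whose output does not mention the requested user. Lookups that simply fail end up in that list too, so the collected output still has to be inspected by hand.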
