Hi Timo,

On 02.11.17 at 10:34, Timo Sirainen wrote:
> On 30 Oct 2017, at 11.05, Ralf Becker <[email protected]> wrote:
>> It happened now twice that replication created folders and mails in the
>> wrong mailbox :(
>>
>> Here's the architecture we use:
>> - 2 Dovecot (2.2.32) backends in two different datacenters replicating
>> via a VPN connection
>> - Dovecot directors in both datacenters talk to both backends with a
>> vhost_count of 100 vs. 1 for the local vs. remote backend
>> - backends use a proxy dict via a unix domain socket and socat to talk
>> via TCP to a dict on a different server (kubernetes cluster)
>> - backends have a local sqlite userdb for iteration (also containing
>> home directories, as iteration alone is not possible)
>> - serving around 7000 mailboxes in roughly 200 different domains
>>
>> Everything works as expected until the dict is unreachable, e.g. due to
>> a server failure or a planned reboot of a node of the Kubernetes
>> cluster. In that situation some requests may go unanswered, even with
>> Kubernetes running multiple instances of the dict.
>> I can only speculate what happens then: it seems the connection failure
>> to the remote dict is not handled correctly and leads to a situation in
>> which the last mailbox/home directory is used for the replication :(
> It sounds to me like a userdb lookup changes the username during a dict 
> failure. Although I can't really think of how that could happen. 

Me neither.

Users are in multiple MariaDB databases on a Galera cluster. We have no
problems or unexpected changes there.

The dict is running multiple times, but that does not guarantee that no
single request will fail.

> The only thing that comes to my mind is auth_cache, but in that case I'd 
> expect the same problem to happen even when there aren't dict errors.
>
> For testing you could see if it's reproducible with:
>
>  - get random username
>  - do doveadm user <user>
>  - verify that the result contains the same input user
>
> Then do that in a loop rapidly and restart your test kubernetes once in a 
> while.
Ok, I'll give that a try. It would be a lot easier than the whole
replication setup.
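
Something like this rough sketch could drive the test (the `users.txt`
file, the iteration count, and the output format are my placeholders,
not anything from the actual setup):

```shell
#!/bin/sh
# Pick a random user, run the userdb lookup, and check that the answer
# mentions the same user that was asked for. Restart the kubernetes
# nodes in another terminal while this loops.
USERS_FILE="${USERS_FILE:-users.txt}"     # one username per line
LOOKUP_CMD="${LOOKUP_CMD:-doveadm user}"  # lookup command to exercise

check_user() {
  user="$1"
  out=$($LOOKUP_CMD "$user" 2>&1)
  case "$out" in
    *"$user"*) echo "ok $user" ;;            # result names the same user
    *)         echo "MISMATCH $user: $out" ;;
  esac
}

if [ -r "$USERS_FILE" ]; then
  i=0
  while [ "$i" -lt 1000 ]; do
    check_user "$(shuf -n 1 "$USERS_FILE")"
    i=$((i + 1))
  done
fi
```

Grepping the output for MISMATCH lines while kubernetes nodes reboot
should show whether the lookup ever answers for the wrong user.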

Ralf

-- 
Ralf Becker
EGroupware GmbH [www.egroupware.org]
Handelsregister HRB Kaiserslautern 3587
Geschäftsführer Birgit und Ralf Becker
Leibnizstr. 17, 67663 Kaiserslautern, Germany
Telefon +49 631 31657-0

