Hello,

I have two mail servers and am also experiencing sporadic replication errors over tcps, similar to Reuben. Each server is running Dovecot 2.3.11.3 (502c39af9) on Debian 10.6.

*Log entries from MX1*
Nov 18 00:39:26 mx1 dovecot: dsync-local([email protected])<Ow3zAjWxtF+TDgAAPHKnuQ>: Error: dsync(mx2.example.com): I/O has stalled, no activity for 600 seconds (last sent=mailbox, last recv=mailbox_state)
Nov 18 00:39:26 mx1 dovecot: dsync-local([email protected])<Ow3zAjWxtF+TDgAAPHKnuQ>: Error: Timeout during state=sync_mails (send=mailbox recv=mailbox)
Nov 18 06:39:32 mx1 dovecot: dsync-local([email protected])<6bScGpwFtV+vEQAAPHKnuQ>: Error: dsync(mx2.example.com): I/O has stalled, no activity for 600 seconds (last sent=mailbox, last recv=mailbox_state)
Nov 18 06:39:32 mx1 dovecot: dsync-local([email protected])<6bScGpwFtV+vEQAAPHKnuQ>: Error: Timeout during state=sync_mails (send=mailbox recv=mailbox)
*End*

*Log entries from MX2*
Nov 18 00:29:55 mx2 dovecot: dsync-local([email protected])<fKK3JzWxtF9zAgAA5XpYKg>: Error: Couldn't lock /var/vmail/[email protected]/.dovecot-sync.lock: fcntl(/var/vmail/[email protected]/.dovecot-sync.lock, write-lock, F_SETLKW) locking failed: Timed out after 30 seconds (WRITE lock held by pid 628)
Nov 18 00:34:56 mx2 dovecot: dsync-local([email protected])<9IKaB2KytF92AgAA5XpYKg>: Error: Couldn't lock /var/vmail/[email protected]/.dovecot-sync.lock: fcntl(/var/vmail/[email protected]/.dovecot-sync.lock, write-lock, F_SETLKW) locking failed: Timed out after 30 seconds (WRITE lock held by pid 628)
Nov 18 00:39:26 mx2 dovecot: doveadm: Error: dsync(mx1.example.com): I/O has stalled, no activity for 600 seconds (last sent=mail_change (EOL), last recv=mailbox)
Nov 18 06:39:32 mx2 dovecot: doveadm: Error: dsync(mx1.example.com): I/O has stalled, no activity for 600 seconds (last sent=mail_change (EOL), last recv=mailbox)
*End*

I have configured "replication_full_sync_interval = 1 hours", which explains why the sync errors, when they occur, show up at the same minute past the hour.

I've tested replication over tcps using either IPv6 or IPv4 -- this did not appear to make a difference.

Changing replication to occur over tcp solves the issue (with "ssl = yes" commented out, as well).
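
For reference, the non-SSL variant looks roughly like the following on MX1. This is a sketch reconstructed from the settings in the attached config, not a verbatim copy of the file; MX2 mirrors it with mx1.example.com as the replica target.

```
plugin {
  # tcp: instead of tcps:, so dsync connects without TLS
  mail_replica = tcp:mx2.example.com:12345
}
service doveadm {
  inet_listener {
    port = 12345
    # ssl = yes
  }
}
```

Note that this sends replication traffic unencrypted between the two servers, so it is a workaround rather than a fix.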

IMAP clients are primarily connecting to MX1 using SSL, which works well (SSL connections to MX2 also work). These are very low traffic machines at the moment (just 1 user as I continue testing).

I've attached the output of "dovecot -n" from each server.

Are there known bugs with replication using SSL? I'd appreciate any guidance.

Thank you,
AP

# 2.3.11.3 (502c39af9): /etc/dovecot/dovecot.conf
# OS: Linux 4.19.0-12-amd64 x86_64 Debian 10.6
# Hostname: mx1.example.com
doveadm_password = # hidden, use -P to show it
doveadm_port = 12345
mail_location = maildir:~/Maildir
mail_plugins = " notify replication"
namespace inbox {
  inbox = yes
  location =
  mailbox Archive {
    special_use = \Archive
  }
  mailbox "Deleted Messages" {
    special_use = \Trash
  }
  mailbox Drafts {
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    special_use = \Trash
  }
  prefix =
}
passdb {
  args = scheme=sha512-crypt /usr/local/etc/creds
  driver = passwd-file
}
plugin {
  mail_replica = tcps:mx2.example.com:12345
}
protocols = " imap"
replication_full_sync_interval = 1 hours
service aggregator {
  fifo_listener replication-notify-fifo {
    user = vmail
  }
  unix_listener replication-notify {
    user = vmail
  }
}
service doveadm {
  inet_listener {
    port = 12345
    ssl = yes
  }
}
service replicator {
  process_min_avail = 1
  unix_listener replicator-doveadm {
    mode = 0600
    user = vmail
  }
}
ssl_cert = </etc/letsencrypt/live/mx1.example.com/fullchain.pem
ssl_client_ca_dir = /etc/ssl/certs
ssl_key = # hidden, use -P to show it
userdb {
  args = username_format=%u /usr/local/etc/creds
  default_fields = uid=vmail gid=vmail home=/var/vmail/%u
  driver = passwd-file
}

# 2.3.11.3 (502c39af9): /etc/dovecot/dovecot.conf
# OS: Linux 4.19.0-12-amd64 x86_64 Debian 10.6
# Hostname: mx2.example.com
doveadm_password = # hidden, use -P to show it
doveadm_port = 12345
mail_location = maildir:~/Maildir
mail_plugins = " notify replication"
namespace inbox {
  inbox = yes
  location =
  mailbox Archive {
    special_use = \Archive
  }
  mailbox "Deleted Messages" {
    special_use = \Trash
  }
  mailbox Drafts {
    special_use = \Drafts
  }
  mailbox Junk {
    special_use = \Junk
  }
  mailbox Sent {
    special_use = \Sent
  }
  mailbox "Sent Messages" {
    special_use = \Sent
  }
  mailbox Trash {
    special_use = \Trash
  }
  prefix =
}
passdb {
  args = scheme=sha512-crypt /usr/local/etc/creds
  driver = passwd-file
}
plugin {
  mail_replica = tcps:mx1.example.com:12345
}
protocols = " imap"
replication_full_sync_interval = 1 hours
service aggregator {
  fifo_listener replication-notify-fifo {
    user = vmail
  }
  unix_listener replication-notify {
    user = vmail
  }
}
service doveadm {
  inet_listener {
    port = 12345
    ssl = yes
  }
}
service replicator {
  process_min_avail = 1
  unix_listener replicator-doveadm {
    mode = 0600
    user = vmail
  }
}
ssl_cert = </etc/letsencrypt/live/mx2.example.com/fullchain.pem
ssl_client_ca_dir = /etc/ssl/certs
ssl_key = # hidden, use -P to show it
userdb {
  args = username_format=%u /usr/local/etc/creds
  default_fields = uid=vmail gid=vmail home=/var/vmail/%u
  driver = passwd-file
}
