Thank you, Howard! I'll give that a try if the keepalive option Quanah
mentioned does not fix the issue.
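
A minimal sketch of raising the log level at runtime, assuming slapd uses the dynamic cn=config backend (the olcSyncrepl attributes quoted below suggest it does) and that the change is made there rather than through cn=monitor. The level keywords chosen (stats, sync, conns) and the file name loglevel.ldif are illustrative assumptions, not something specified in this thread:

```ldif
# loglevel.ldif (hypothetical file name)
# Replace the current log level; takes effect without restarting slapd.
dn: cn=config
changetype: modify
replace: olcLogLevel
olcLogLevel: stats sync conns
```

It could be applied on each server with something like `ldapmodify -Y EXTERNAL -H ldapi:/// -f loglevel.ldif`, assuming root access over the local ldapi socket is permitted. Restoring the previous olcLogLevel value afterwards keeps the logs from growing unnecessarily.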

Mircea
--
Mircea Baciu | Senior Unix Systems Administrator
Simmons University | 300 The Fenway | Boston, MA 02115 | 617-521-2194


On Mon, Sep 20, 2021 at 12:02 PM Howard Chu <[email protected]> wrote:

> Mircea Baciu wrote:
> > Hi,
> >
> > I have an issue where replication on a consumer starts failing and keeps
> > failing until OpenLDAP is restarted.
> >
> > My setup consists of a pair of on-prem MirrorMode replicated providers
> > (only one is active at a given time, using a virtual IP managed by
> > Keepalived) and one off-site (AWS) consumer. The providers use a
> > dedicated port (LDAPS on 1636) for their own replication, as well as for
> > the consumer to connect to them, so the consumer has access to both
> > servers, regardless of where the providers' virtual IP is residing.
> >
> > All the connections happen over LDAPS, and the syncrepl configs have the
> > tls_reqcert=allow option.
> >
> > The providers are always in sync and I'm able to make either one the
> > "active" one with ease. The consumer does the initial sync and stays in
> > sync for a while, but I often (almost daily) find it out of sync. I see
> > error messages on both the consumer and provider side:
>
> Sounds like an issue in the TLS layer. You should increase the debug level
> on both provider and consumer to see if there are any TLS-specific error
> messages being generated. If you have cn=monitor configured you can set the
> debuglevel using ldapmodify, so no need to restart the servers for it to
> take effect. That'll let you see the problem as it's occurring.
> >
> > On the consumer (every minute):
> > Sep 20 08:19:31 <consumer> slapd[1440]: slap_client_connect: URI=ldaps://<provider1>:1636/ DN="uid=replication,ou=sysaccounts,dc=example,dc=com" ldap_sasl_bind_s failed (-1)
> > Sep 20 08:19:31 <consumer> slapd[1440]: do_syncrepl: rid=001 rc -1 retrying
> > Sep 20 08:19:31 <consumer> slapd[1440]: slap_client_connect: URI=ldaps://<provider2>:1636/ DN="uid=replication,ou=sysaccounts,dc=example,dc=com" ldap_sasl_bind_s failed (-1)
> > Sep 20 08:19:31 <consumer> slapd[1440]: do_syncrepl: rid=002 rc -1 retrying
> > Sep 20 08:20:31 <consumer> slapd[1440]: slap_client_connect: URI=ldaps://<provider1>:1636/ DN="uid=replication,ou=sysaccounts,dc=example,dc=com" ldap_sasl_bind_s failed (-1)
> > Sep 20 08:20:31 <consumer> slapd[1440]: do_syncrepl: rid=001 rc -1 retrying
> > Sep 20 08:20:31 <consumer> slapd[1440]: slap_client_connect: URI=ldaps://<provider2>:1636/ DN="uid=replication,ou=sysaccounts,dc=example,dc=com" ldap_sasl_bind_s failed (-1)
> > Sep 20 08:20:31 <consumer> slapd[1440]: do_syncrepl: rid=002 rc -1 retrying
> >
> > On the provider (every minute):
> > Sep 20 08:19:31 <provider1> slapd[1057]: conn=11242 fd=14 ACCEPT from IP=<consumer IP>:45438 (IP=<provider1 IP>:1636)
> > Sep 20 08:19:31 <provider1> slapd[1057]: conn=11242 fd=14 TLS established tls_ssf=256 ssf=256
> > Sep 20 08:19:31 <provider1> slapd[1057]: conn=11242 fd=14 closed (connection lost)
> > Sep 20 08:20:31 <provider1> slapd[1057]: conn=11243 fd=14 ACCEPT from IP=<consumer IP>:45458 (IP=<provider1 IP>:1636)
> > Sep 20 08:20:31 <provider1> slapd[1057]: conn=11243 fd=14 TLS established tls_ssf=256 ssf=256
> > Sep 20 08:20:31 <provider1> slapd[1057]: conn=11243 fd=14 closed (connection lost)
> >
> > Sep 20 08:19:31 <provider2> slapd[1051]: conn=215893 fd=18 ACCEPT from IP=<consumer IP>:41706 (IP=<provider2 IP>:1636)
> > Sep 20 08:19:31 <provider2> slapd[1051]: conn=215893 fd=18 TLS established tls_ssf=256 ssf=256
> > Sep 20 08:19:31 <provider2> slapd[1051]: conn=215893 fd=18 closed (connection lost)
> > Sep 20 08:20:31 <provider2> slapd[1051]: conn=215898 fd=18 ACCEPT from IP=<consumer IP>:41726 (IP=<provider2 IP>:1636)
> > Sep 20 08:20:31 <provider2> slapd[1051]: conn=215898 fd=18 TLS established tls_ssf=256 ssf=256
> > Sep 20 08:20:31 <provider2> slapd[1051]: conn=215898 fd=18 closed (connection lost)
> >
> > There must be something wrong on the consumer side since when the issue
> > starts, the consumer is not able to connect to either provider.
> >
> > Once I restart the consumer, it quickly resyncs and works just fine, for
> > a while.
> >
> > The providers are OpenLDAP 2.4.44 (openldap-2.4.44-24.el7_9.x86_64),
> > running on RHEL 7.
> > The consumer is OpenLDAP 2.4.44 (openldap-2.4.44-24.el7_9.x86_64),
> > running on CentOS 7.
> >
> > The consumer syncrepl config is:
> > olcSyncrepl: {0}rid=001
> >   provider=ldaps://<provider1>:1636/
> >   searchbase="dc=example,dc=com"
> >   type=refreshAndPersist
> >   retry="60 +"
> >   timeout=1
> >   bindmethod=simple
> >   binddn="uid=replication,ou=SysAccounts,dc=example,dc=com"
> >   credentials=<credentials>
> >   tls_reqcert=allow
> > olcSyncrepl: {1}rid=002
> >   provider=ldaps://<provider2>:1636/
> >   searchbase="dc=example,dc=com"
> >   type=refreshAndPersist
> >   retry="60 +"
> >   timeout=1
> >   bindmethod=simple
> >   binddn="uid=replication,ou=SysAccounts,dc=example,dc=com"
> >   credentials=<credentials>
> >   tls_reqcert=allow
> >
> > The "uid=replication,ou=SysAccounts,dc=example,dc=com" DN has full
> > read-only permissions for the entire "dc=example,dc=com" tree.
> >
> > Any idea on what might be my issue here?
> >
> > Thank you,
> > Mircea
> > --
> > Mircea Baciu | Senior Unix Systems Administrator
> > Simmons University | 300 The Fenway | Boston, MA 02115 | 617-521-2194
>
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.           http://www.symas.com
>   Director, Highland Sun     http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
>
