Hi

I have a small environment which, until recently, had two IPA servers,
both running CentOS 7.4.1708.

Version info:

id1:
Name        : ipa-server
Version     : 4.5.0
Release     : 21.el7.centos.2.2
Kernel: 3.10.0-693.5.2.el7.x86_64
389-ds-base is at version 1.3.6.1

id5:
Name        : ipa-server
Version     : 4.5.0
Release     : 21.el7.centos.2.2
Kernel: 3.10.0-693.5.2.el7.x86_64
389-ds-base is at version 1.3.6.1

I recently had an issue with high IO/load, and noticed that the
following file:
/var/lib/dirsrv/slapd-PROD-MYDOMAIN-COM/cldb/<long-filename>.db
was huge (about 5 GB) for a very small 2-master environment.  This is
on the master.  My understanding is that the entries in this file,
which have timestamps from months ago, exist because of failed
replication.  I don't understand how to clear them without breaking
things.

Second issue; not sure if related:

I've since lost the replica (id2) but I've prepared a new machine
(id5) to be a new replica of id1.  I've cleaned the RUVs and deleted
the replication agreements but when I join the new machine to the
existing one using `ipa-replica-install` then I get the following on
the replica:

################
Starting replication, please wait until this has completed.
Update in progress, 10 seconds elapsed
[ldap://id1.prod.mydomain.com:389] reports: Update failed! Status:
[-11 connection error: Unknown connection error (-11) - Total update
aborted]

  [error] RuntimeError: Failed to start replication
Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

ipa.ipapython.install.cli.install_tool(CompatServerReplicaInstall):
ERROR    Failed to start replication
ipa.ipapython.install.cli.install_tool(CompatServerReplicaInstall):
ERROR    The ipa-replica-install command failed. See
/var/log/ipareplica-install.log for more information
[root@id5 ~]# ipa-replica-manage re-initialize --from id1.prod.mydomain.com
Re-run /usr/sbin/ipa-replica-manage with --verbose option to get more
information
Unexpected error: cannot connect to 'ldaps://id5.prod.mydomain.com:636':
################

and the following on the master:

################
[14/Nov/2017:10:05:28.671905981 +0000] - INFO - NSMMReplicationPlugin
- repl5_tot_run - Beginning total update of replica
"agmt="cn=meToid5.prod.mydomain.com" (id5:389)".
[14/Nov/2017:10:05:38.031033860 +0000] - ERR - NSMMReplicationPlugin -
repl5_tot_log_operation_failure - agmt="cn=meToid5.prod.mydomain.com"
(id5:389): Received error -1 (Can't contact LDAP server):  for total
update operation
[14/Nov/2017:10:05:38.032272148 +0000] - ERR - NSMMReplicationPlugin -
release_replica - agmt="cn=meToid5.prod.mydomain.com" (id5:389):
Unable to send endReplication extended operation (Can't contact LDAP
server)
[14/Nov/2017:10:05:38.095893236 +0000] - ERR - NSMMReplicationPlugin -
repl5_tot_run - Total update failed for replica
"agmt="cn=meToid5.prod.mydomain.com" (id5:389)", error (-11)
[14/Nov/2017:10:05:38.113388624 +0000] - INFO - NSMMReplicationPlugin
- bind_and_check_pwp - agmt="cn=meToid5.prod.mydomain.com" (id5:389):
Replication bind with GSSAPI auth resumed
[14/Nov/2017:10:05:38.425682940 +0000] - WARN - NSMMReplicationPlugin
- repl5_inc_run - agmt="cn=meToid5.prod.mydomain.com" (id5:389): The
remote replica has a different database generation ID than the local
database.  You may have to reinitialize the remote replica, or the
local replica.
################

I've checked the firewalls on both machines, and gone as far as to
flush all the iptables rules to get it to work.  No luck.
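
For reference, the firewall check can be repeated independently of IPA
with a plain TCP probe. The following is a minimal sketch, not part of
any IPA tooling: the helper name port_open is made up for illustration,
and 389/636 are simply the ports that appear in the errors above. A
successful connect only shows that the port is reachable; it says
nothing about whether TLS or the LDAP bind would succeed.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only; it tests reachability,
    not LDAP or TLS behaviour.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Run from id1 it could be called as port_open("id5.prod.mydomain.com", 636),
and from id5 with id1's hostname, for each of ports 389 and 636.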

I'm also getting hundreds of copies of the last line ("different
database generation ID"), but my understanding is that this is only
logged because the replica is yet to be set up.
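
To put a number on "hundreds", the error log can be tallied by severity
and reporting function. This is a rough sketch that assumes each entry
sits on a single line in the log file (the excerpts above are wrapped by
the mail client) and follows the "[timestamp] - LEVEL - plugin -
function - message" layout seen there. The names LINE_RE and tally are
made up for illustration.

```python
import re
from collections import Counter

# Matches the layout seen in the excerpts above:
# [timestamp] - LEVEL - plugin - function - message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] - (?P<level>[A-Z]+) - (?P<plugin>\S+)"
    r" - (?P<func>\S+) - (?P<msg>.*)$"
)


def tally(lines):
    """Count log entries per (level, function); non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            counts[(m.group("level"), m.group("func"))] += 1
    return counts
```

It could be fed an open file object for the instance's errors log
(assumed here to live under /var/log/dirsrv/slapd-PROD-MYDOMAIN-COM/),
and the count for ("WARN", "repl5_inc_run") would then correspond to the
generation ID warnings.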

Would anyone please be able to provide some guidance?  I've been at
this for a few days now!

Thanks!
Mike
_______________________________________________
FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org
To unsubscribe send an email to freeipa-users-le...@lists.fedorahosted.org
