On 11/14/2017 11:40 AM, Mike Johnson via FreeIPA-users wrote:
Hi

I've got a small environment which had until recently 2 IPA servers.
Both CentOS 7.4.1708

Version info:

id1:
Name        : ipa-server
Version     : 4.5.0
Release     : 21.el7.centos.2.2
Kernel: 3.10.0-693.5.2.el7.x86_64
389-ds-base is at version 1.3.6.1

id5:
Name        : ipa-server
Version     : 4.5.0
Release     : 21.el7.centos.2.2
Kernel: 3.10.0-693.5.2.el7.x86_64
389-ds-base is at version 1.3.6.1

I recently had an issue with high IO/load, and noted that the following file:
/var/lib/dirsrv/slapd-PROD-MYDOMAIN-COM/cldb/<long-filename>.db
was huge (5GB-ish) in a very small 2-master environment.  This is on
the master.  My understanding is that the entries in this file, which
have timestamps from months ago, exist because of failed replication.
I don't understand how to clear this without breaking things.
It looks like changelog trimming is not enabled. If you enable trimming now, it will reduce the content of the changelog but not necessarily the file size; it will, however, prevent the file from growing further. If you stop the server and remove the file, it will be recreated. The risk is that changes still required to update another replica will then be missing, and replication will ask you to reinitialize the other server.
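
To make the trimming suggestion concrete, here is a minimal sketch that only builds the LDIF for it; it does not contact any server. It assumes the changelog configuration entry on 389-ds-base 1.3.x is cn=changelog5,cn=config and that trimming is driven by the nsslapd-changelogmaxage attribute. The helper name and the 7d value are illustrative choices, not a recommendation for your environment, so please check the entry on id1 before applying anything.

```python
import re

# Assumed location of the changelog settings on 389-ds-base 1.3.x.
CHANGELOG_DN = "cn=changelog5,cn=config"

# Age is an integer followed by a unit: s, m, h, d or w.
_AGE_RE = re.compile(r"^\d+[smhdw]$")


def changelog_trim_ldif(max_age: str = "7d") -> str:
    """Return LDIF that sets a maximum age for changelog entries.

    The default of 7d is an illustrative value only.
    """
    if not _AGE_RE.match(max_age):
        raise ValueError(f"unexpected age format: {max_age!r}")
    return "\n".join([
        f"dn: {CHANGELOG_DN}",
        "changetype: modify",
        "replace: nsslapd-changelogmaxage",
        f"nsslapd-changelogmaxage: {max_age}",
        "",
    ])


if __name__ == "__main__":
    # Print the LDIF so it can be reviewed before it is applied.
    print(changelog_trim_ldif("7d"), end="")
```

The printed LDIF could then be reviewed and fed to ldapmodify while bound as Directory Manager. As said above, this trims old entries but will not necessarily shrink the existing file.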

Now, the second problem should be unrelated. It looks like the total init tries to connect to port 636 and fails, and the normal replication session then fails because the init didn't happen. Could you verify that id5 is listening on 636, and check whether there are any errors in its error log?
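
One quick way to check the listening question from id1 is a plain TCP connect. The sketch below is just that: it shows whether something accepts connections on the port, not whether the TLS handshake or the LDAP bind would succeed. The function names and the 3 second timeout are my own choices.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def report(host: str, ports=(389, 636)) -> list:
    """Return one status line per port for the given host."""
    lines = []
    for port in ports:
        state = "open" if is_listening(host, port) else "unreachable"
        lines.append(f"{host}:{port} {state}")
    return lines
```

Running print("\n".join(report("id5.prod.mydomain.com"))) on id1 would show both ports. If 389 is open and 636 is not, the directory server on id5 is probably not listening on LDAPS, and its error log is the next place to look.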

Second issue; not sure if related:

I've since lost the replica (id2) but I've prepared a new machine
(id5) to be a new replica of id1.  I've cleaned the RUVs and deleted
the replication agreements but when I join the new machine to the
existing one using `ipa-replica-install` then I get the following on
the replica:

################
Starting replication, please wait until this has completed.
Update in progress, 10 seconds elapsed
[ldap://id1.prod.mydomain.com:389] reports: Update failed! Status:
[-11 connection error: Unknown connection error (-11) - Total update
aborted]

   [error] RuntimeError: Failed to start replication
Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

ipa.ipapython.install.cli.install_tool(CompatServerReplicaInstall):
ERROR    Failed to start replication
ipa.ipapython.install.cli.install_tool(CompatServerReplicaInstall):
ERROR    The ipa-replica-install command failed. See
/var/log/ipareplica-install.log for more information
[root@id5 ~]# ipa-replica-manage re-initialize --from id1.prod.mydomain.com
Re-run /usr/sbin/ipa-replica-manage with --verbose option to get more
information
Unexpected error: cannot connect to 'ldaps://id5.prod.mydomain.com:636':
################

and the following on the master:

################
[14/Nov/2017:10:05:28.671905981 +0000] - INFO - NSMMReplicationPlugin
- repl5_tot_run - Beginning total update of replica
"agmt="cn=meToid5.prod.mydomain.com" (id5:389)".
[14/Nov/2017:10:05:38.031033860 +0000] - ERR - NSMMReplicationPlugin -
repl5_tot_log_operation_failure - agmt="cn=meToid5.prod.mydomain.com"
(id5:389): Received error -1 (Can't contact LDAP server):  for total
update operation
[14/Nov/2017:10:05:38.032272148 +0000] - ERR - NSMMReplicationPlugin -
release_replica - agmt="cn=meToid5.prod.mydomain.com" (id5:389):
Unable to send endReplication extended operation (Can't contact LDAP
server)
[14/Nov/2017:10:05:38.095893236 +0000] - ERR - NSMMReplicationPlugin -
repl5_tot_run - Total update failed for replica
"agmt="cn=meToid5.prod.mydomain.com" (id5:389)", error (-11)
[14/Nov/2017:10:05:38.113388624 +0000] - INFO - NSMMReplicationPlugin
- bind_and_check_pwp - agmt="cn=meToid5.prod.mydomain.com" (id5:389):
Replication bind with GSSAPI auth resumed
[14/Nov/2017:10:05:38.425682940 +0000] - WARN - NSMMReplicationPlugin
- repl5_inc_run - agmt="cn=meToid5.prod.mydomain.com" (id5:389): The
remote replica has a different database generation ID than the local
database.  You may have to reinitialize the remote replica, or the
local replica.
################

I've checked the firewalls on both machines, and gone as far as to
flush all the iptables rules to get it to work.  No luck.

I'm also getting hundreds of the last line "different database
generation ID" but my understanding is that this is only logged
because the replica is yet to be set up.

Would anyone please be able to provide some guidance?  I've been at
this for a few days now!

Thanks!
Mike
_______________________________________________
FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org
To unsubscribe send an email to freeipa-users-le...@lists.fedorahosted.org

--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander