[Freeipa-users] Re: FreeIPA v4.5.0 install lost topology suffixes

Ludwig Krispenz via FreeIPA-users Fri, 06 Apr 2018 00:38:37 -0700


On 04/05/2018 11:28 PM, Gavin Williams via FreeIPA-users wrote:

Petr
Yeh, I was unable to see the suffixes and replication agreements via the WebUI.
However searching using ldapsearch, they were still present. So I tracked the issue down to my named user account not having enough permissions. Logged in as ‘admin’ user and was able to see all the details.
So that just leaves the issue with the fact that replication broke in the first place. Looking back through the slapd error log, I came across this:

The errors below do not indicate that replication is broken. A replication session failed and was retried; you can see that the errors are 10 seconds apart and refer to different replication connections, so replication was probably working in between and has probably resumed again.

The underlying problem is that there is concurrent access to the changelog: incoming connections write to it while outgoing replication connections read it. The access is protected by locks at the db (BDB) level, and in some situations there can be deadlocks. The db layer has a mechanism to abort one of the threads and let the other continue. The aborted thread logs an error and has to be retried, either immediately or after returning an error to the client; that is what you are seeing.

Which thread is aborted is determined by the configured deadlock policy; the default tends to abort the writing one. If these errors occur frequently and cause problems, it is worth changing this policy. In the entry:


dn: cn=config,cn=ldbm database,cn=plugins,cn=config

change the attribute: nsslapd-db-deadlock-policy

to
nsslapd-db-deadlock-policy: 6

This will abort the thread with the fewest write locks, so it will abort the outgoing replication connection, which should have less impact.
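
The policy change described above could be applied with ldapmodify using an LDIF along these lines. This is only a sketch and was not part of the original message: the changetype/replace lines are standard LDIF modify syntax, the DN and the value 6 are the ones given above, and the file name and the bind identity in the usage note are assumptions.

```ldif
# deadlock-policy.ldif (hypothetical file name)
# Sets the BDB deadlock policy to 6, the value suggested above.
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-db-deadlock-policy
nsslapd-db-deadlock-policy: 6
```

Assuming a bind as Directory Manager, it could be loaded with something like: ldapmodify -x -D "cn=Directory Manager" -W -f deadlock-policy.ldif. Depending on the server version, the directory server may need a restart before the new policy takes effect.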

[28/Mar/2018:17:27:04.558588967 +0100] - ERR - NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn - retry (49) the transaction (csn=5abbc252002800040000) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[28/Mar/2018:17:27:04.575793325 +0100] - ERR - NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn - Failed to write entry with csn (5abbc252002800040000); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[28/Mar/2018:17:27:04.578285790 +0100] - ERR - NSMMReplicationPlugin - write_changelog_and_ruv - Can't add a change for ipaUniqueID=6dc4846c-27a3-11e8-a0a5-fa163e82604c,cn=sudorules,cn=sudo,dc=weareact,dc=net (uniqid: 64da9801-27a311e8-8bfb8904-640ff48c, optype: 8) to changelog csn 5abbc252002800040000
[28/Mar/2018:17:27:04.595240157 +0100] - ERR - NSMMReplicationPlugin - process_postop - Failed to apply update (5abbc252002800040000) error (1).  Aborting replication session(conn=453585 op=18)
[28/Mar/2018:17:27:14.160079067 +0100] - ERR - NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn - retry (49) the transaction (csn=5abbc252002800040000) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[28/Mar/2018:17:27:14.161481168 +0100] - ERR - NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn - Failed to write entry with csn (5abbc252002800040000); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[28/Mar/2018:17:27:14.162533841 +0100] - ERR - NSMMReplicationPlugin - write_changelog_and_ruv - Can't add a change for ipaUniqueID=6dc4846c-27a3-11e8-a0a5-fa163e82604c,cn=sudorules,cn=sudo,dc=weareact,dc=net (uniqid: 64da9801-27a311e8-8bfb8904-640ff48c, optype: 8) to changelog csn 5abbc252002800040000
[28/Mar/2018:17:27:14.177194703 +0100] - ERR - NSMMReplicationPlugin - process_postop - Failed to apply update (5abbc252002800040000) error (1).  Aborting replication session(conn=453594 op=6)
Any pointers on identifying possible cause?

Cheers
Gav

On 5 Apr 2018, at 18:24, Petr Vobornik <pvobo...@redhat.com> wrote:
On Wed, Apr 4, 2018 at 4:31 PM, Gavin Williams via FreeIPA-users
<freeipa-users@lists.fedorahosted.org> wrote:
Afternoon all
I’ve got a slightly strange one with one of our FreeIPA clusters, whereby the topology suffixes appear to have disappeared.
How does this manifest? Are they not visible in the Web UI, the CLI?
From what I can see, this is causing replication issues between the hosts, which is causing us issues with bootstrapping new clients against FreeIPA.
I’m not aware of any config changes that have happened on the FreeIPA hosts that could have caused this issue, so am a bit stumped atm.
Is someone able to advise next steps on how to investigate the cause and correct the configuration?
For anything regarding replication, a good start is to check directory
server error and access logs on both sides.

https://www.freeipa.org/page/Files_to_be_attached_to_bug_report#Directory_server_failed
https://www.freeipa.org/page/Troubleshooting#Directory_Server_issues

Next step could be to check for replication conflicts:

https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/html/administration_guide/managing_replication-solving_common_replication_conflicts
Regards
Gavin
--
Petr Vobornik




_______________________________________________
FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org
To unsubscribe send an email to freeipa-users-le...@lists.fedorahosted.org


--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric Shander
