Thanks Ludwig for the suggestion, and thanks to Maciej for the confirmation from 
his end. This issue has been happening for us for several weeks, so I don’t think 
it is a transient problem. 

What is the best way to sanitize the logs without removing useful info before 
sending them your way? Will the files mentioned on 
"https://www.freeipa.org/page/Files_to_be_attached_to_bug_report -> Directory 
server failed" be sufficient? 
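
To make the question concrete, here is a rough sketch of what I have in mind for the sanitizing step. It is only an illustration, not an existing FreeIPA or 389-ds tool: the function names, the "example.com" domain and the salt are placeholders I made up. It replaces fully qualified host names and IPv4 addresses with stable tokens, so the same host always maps to the same token, and leaves timestamps and CSNs alone.

```python
import hashlib
import re

# Matches dotted-quad IPv4 addresses (no range check on each octet).
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def build_fqdn_re(domains):
    """Build a regex matching any host name that ends in one of `domains`."""
    suffixes = "|".join(re.escape(d) for d in domains)
    # Labels deliberately exclude "-" so that a prefix such as
    # "cloneAgreement1-" in an agreement name is kept. The cost is that
    # hyphenated host names are only partly masked.
    return re.compile(
        r"\b(?:[A-Za-z0-9]+\.)*(?:" + suffixes + r")\b", re.IGNORECASE
    )


def pseudonym(kind, value, salt):
    """Map a value to a short, stable token such as 'host-1a2b3c4d'."""
    digest = hashlib.sha256((salt + value.lower()).encode("utf-8")).hexdigest()
    return f"{kind}-{digest[:8]}"


def sanitize_line(line, fqdn_re, salt):
    """Mask host names first, then IPv4 addresses, in a single log line."""
    line = fqdn_re.sub(lambda m: pseudonym("host", m.group(0), salt), line)
    return IPV4_RE.sub(lambda m: pseudonym("ip", m.group(0), salt), line)


# Hypothetical sample line, modelled on the errors log format.
sample = (
    'agmt="cn=cloneAgreement1-inf02.dev.example.com-pki-tomcat" '
    "(inf02:389): CSN 576b34e8000a050f0000 not found"
)
print(sanitize_line(sample, build_fqdn_re(["example.com"]), salt="change-me"))
```

Short host names such as "inf02" are not touched by this, and any four-part dotted number (a version string, for example) would be treated as an IP address, so the output would still need a manual review.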

I’ve also run the ipa_consistency_check script, and the output shows that 
something is indeed wrong with the sync:
“””
FreeIPA servers:    inf01    inf01    inf02    inf02    STATE
=============================================================
Active Users        15       15       15       15       OK
Stage Users         0        0        0        0        OK
Preserved Users     3        3        3        3        OK
User Groups         9        9        9        9        OK
Hosts               45       45       45       46       FAIL
Host Groups         7        7        7        7        OK
HBAC Rules          6        6        6        6        OK
SUDO Rules          7        7        7        7        OK
DNS Zones           33       33       33       33       OK
LDAP Conflicts      NO       NO       NO       NO       OK
Ghost Replicas      2        2        2        2        FAIL
Anonymous BIND      YES      YES      YES      YES      OK
Replication Status  inf01.prod 0  inf01.dev 0   inf01.dev 0   inf01.dev 0
                    inf02.dev 0   inf02.dev 0   inf01.prod 0  inf01.prod 0
                    inf02.prod 0  inf02.prod 0  inf02.prod 0  inf02.dev 0
=============================================================
“””
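
To put the missing CSN from my original log in context, I also tried splitting it into its fields. This is a small sketch that assumes the usual 389-ds CSN layout of 20 hex digits: 8 for the timestamp in seconds since the epoch, 4 for the sequence number, 4 for the replica ID and 4 for the sub-sequence number. The helper name and the tuple type are my own, and please correct me if the layout assumption is wrong.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: datetime   # time of the change, in UTC
    seqnum: int           # sequence number within that second
    replica_id: int       # ID of the replica that originated the change
    subseqnum: int        # sub-sequence number


def parse_csn(csn: str) -> CSN:
    """Split a 20-hex-digit CSN string into its four fields.

    Assumes the layout: 8 digits timestamp, 4 seqnum, 4 replica ID,
    4 subseqnum. int(..., 16) raises ValueError on non-hex input.
    """
    if len(csn) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(csn)}")
    seconds = int(csn[0:8], 16)
    return CSN(
        timestamp=datetime.fromtimestamp(seconds, tz=timezone.utc),
        seqnum=int(csn[8:12], 16),
        replica_id=int(csn[12:16], 16),
        subseqnum=int(csn[16:20], 16),
    )


csn = parse_csn("576b34e8000a050f0000")
print(csn.timestamp.isoformat(), csn.seqnum, csn.replica_id, csn.subseqnum)
```

If that layout is right, the CSN 576b34e8000a050f0000 carries a timestamp from June 2016 and replica ID 1295, so it refers to a change made long before the upgrade.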

Thanks,
Goran

> On May 15, 2017, at 6:35 AM, Ludwig Krispenz <lkris...@redhat.com> wrote:
> 
> The messages you see could be transient messages, and if replication is 
> working then this seems to be the case. If not, we would need more data to 
> investigate: deployment info, replicaIDs of all servers, ruvs, logs, ...
> 
> Here is some background info: there are some scenarios where a csn could not 
> be found in the changelog, e.g. if updates were applied on the supplier during a 
> total init, they could be part of the data and database ruv, but not in the 
> changelog of the initialized replica.
> ds did try to use an alternative csn in cases where it could not be found, 
> but this had the risk of missing updates, so we decided to change it and make 
> this missing csn a non-fatal error: back off and retry. If another supplier 
> has updated the replica in between, the starting csn could have 
> changed and be found. So if the reported missing csns change and replication 
> continues, everything is ok, although I think the messages should stop at some 
> point.
> 
> There is a configuration parameter for a replication agreement to trigger the 
> previous behaviour of picking an alternative csn: 
> nsds5ReplicaIgnoreMissingChange
> with potential values "once", "always".
> 
> where "once" just tries to kickstart replication by using another csn and 
> "always" changes the default behaviour
> 
> 
> On 05/11/2017 06:53 PM, Goran Marik wrote:
>> Hi,
>> 
>> After an upgrade to CentOS 7.3.1611 with "yum update", we started seeing the 
>> following messages in the logs:
>> “””
>> May  9 21:58:28 inf01 ns-slapd[4323]: [09/May/2017:21:58:28.519724479 +0000] 
>> NSMMReplicationPlugin - changelog program - 
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389): CSN 
>> 576b34e8000a050f0000 not found, we aren't as up to date, or we purged
>> May  9 21:58:28 inf01 ns-slapd[4323]: [09/May/2017:21:58:28.550459233 +0000] 
>> NSMMReplicationPlugin - 
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389): Data 
>> required to update replica has been purged from the changelog. The replica 
>> must be reinitialized.
>> May  9 21:58:32 inf01 ns-slapd[4323]: [09/May/2017:21:58:32.588245476 +0000] 
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389) - 
>> Can't locate CSN 576b34e8000a050f0000 in the changelog (DB rc=-30988). If 
>> replication stops, the consumer may need to be reinitialized.
>> May  9 21:58:32 inf01 ns-slapd[4323]: [09/May/2017:21:58:32.611400689 +0000] 
>> NSMMReplicationPlugin - changelog program - 
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389): CSN 
>> 576b34e8000a050f0000 not found, we aren't as up to date, or we purged
>> May  9 21:58:32 inf01 ns-slapd[4323]: [09/May/2017:21:58:32.642226385 +0000] 
>> NSMMReplicationPlugin - 
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389): Data 
>> required to update replica has been purged from the changelog. The replica 
>> must be reinitialized.
>> “””
>> 
>> The log messages are pretty frequent, every few seconds, and report a few 
>> different CSN numbers that cannot be located. 
>> 
>> This happens only on one replica out of 4. We’ve tried "ipa-replica-manage 
>> re-initialize --from" and "ipa-csreplica-manage re-initialize --from" several 
>> times, but while both commands report success, the log messages continue to 
>> appear. The server was rebooted and "systemctl restart ipa" was done a few 
>> times as well. 
>> 
>> The replica seems to be working fine despite the errors, but I’m worried 
>> that the logs indicate an underlying problem we are not fully detecting. I 
>> would like to understand better what is triggering this behaviour and how to 
>> fix it, and whether someone else has seen these messages after a recent upgrade. 
>> 
>> The software versions are 389-ds-base-1.3.5.10-20.el7_3.x86_64 and 
>> ipa-server-4.4.0-14.el7.centos.7.x86_64
>> 
>> Thanks,
>> Goran
>> 
>> --
>> Goran Marik
>> Senior Systems Developer
>> 
>> ecobee
>> 250 University Ave, Suite 400
>> Toronto, ON M5H 3E5
>> 
>> 
>> 
>> 
> 
> -- 
> Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn, 
> Commercial register: Amtsgericht Muenchen, HRB 153243,
> Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, 
> Eric Shander
> 
> -- 
> Manage your subscription for the Freeipa-users mailing list:
> https://www.redhat.com/mailman/listinfo/freeipa-users
> Go to http://freeipa.org for more info on the project
