Thanks Ludwig for the suggestion and thanks to Maciej for the confirmation from
his end. This issue has been happening for us for several weeks, so I don’t think
this is a transient problem.
What is the best way to sanitize the logs without removing useful info before
sending them your way? Will the files mentioned on
"https://www.freeipa.org/page/Files_to_be_attached_to_bug_report -> Directory
server failed" be sufficient?
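For reference, here is a minimal sketch of the kind of sanitizing I have in mind. It assumes that only our domain suffix needs masking and that CSNs, timestamps and error codes should stay intact. The function name, the mapping and the placeholder domain are all hypothetical, just for illustration:

```python
import re

def sanitize_line(line, domain_map):
    """Replace each known domain suffix with a placeholder.

    domain_map is a hypothetical {real_suffix: placeholder} mapping.
    CSNs, timestamps, ports and DB return codes are left untouched,
    so the log should stay usable for replication debugging.
    """
    # Replace longer suffixes first so "dev.example.com" wins over "example.com".
    for real in sorted(domain_map, key=len, reverse=True):
        line = re.sub(re.escape(real), domain_map[real], line, flags=re.IGNORECASE)
    return line

sample = ('agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" '
          "(inf02:389): CSN 576b34e8000a050f not found")
print(sanitize_line(sample, {"ecobee.com": "example.test"}))
```

The same placeholder is always used for the same suffix, so the agreement names still line up across log files from different servers.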
I’ve also run the ipa_consistency_check script, and the output shows that
something is indeed wrong with the sync:
“””
FreeIPA servers:    inf01         inf01         inf02         inf02         STATE
=================================================================================
Active Users        15            15            15            15            OK
Stage Users         0             0             0             0             OK
Preserved Users     3             3             3             3             OK
User Groups         9             9             9             9             OK
Hosts               45            45            45            46            FAIL
Host Groups         7             7             7             7             OK
HBAC Rules          6             6             6             6             OK
SUDO Rules          7             7             7             7             OK
DNS Zones           33            33            33            33            OK
LDAP Conflicts      NO            NO            NO            NO            OK
Ghost Replicas      2             2             2             2             FAIL
Anonymous BIND      YES           YES           YES           YES           OK
Replication Status  inf01.prod 0  inf01.dev 0   inf01.dev 0   inf01.dev 0
                    inf02.dev 0   inf02.dev 0   inf01.prod 0  inf01.prod 0
                    inf02.prod 0  inf02.prod 0  inf02.prod 0  inf02.dev 0
=================================================================================
“””
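To make the Hosts mismatch easier to spot, here is a small sketch that flags which column disagrees with the others. It is not part of ipa_consistency_check. The function name and the column labels are hypothetical, and "differs from the majority" only tells us where the counts diverge, not which server holds the correct data:

```python
from collections import Counter

def diverging_columns(counts):
    """Return the labels whose value differs from the most common value.

    counts is a {column_label: value} mapping for one row of the report.
    The majority value is only a reference point; the odd one out may
    actually be the server that is up to date.
    """
    if not counts:
        return []
    majority, _ = Counter(counts.values()).most_common(1)[0]
    return sorted(label for label, value in counts.items() if value != majority)

# Hosts row from the report above; labels are placeholders for the four columns.
hosts = {"column_1": 45, "column_2": 45, "column_3": 45, "column_4": 46}
print(diverging_columns(hosts))
```

For the Hosts row this points at the fourth column, which is the only one reporting 46 entries.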
Thanks,
Goran
> On May 15, 2017, at 6:35 AM, Ludwig Krispenz <lkris...@redhat.com> wrote:
>
> The messages you see could be transient messages, and if replication is
> working then this seems to be the case. If not, we would need more data to
> investigate: deployment info, replicaIDs of all servers, ruvs, logs.
>
> Here is some background info: there are some scenarios where a csn could not
> be found in the changelog, e.g. if updates were applied on the supplier during a
> total init, they could be part of the data and database ruv, but not in the
> changelog of the initialized replica.
> ds used to try an alternative csn in cases where it could not be found,
> but this had the risk of missing updates, so we decided to change it and make
> this missing csn a non-fatal error: back off and retry. If another supplier
> has updated the replica in between, the starting csn could have
> changed and be found. So if the reported missing csns change and replication
> continues, everything is ok, although I think the messages should stop at some
> point.
>
> There is a configuration parameter for a replication agreement to trigger the
> previous behaviour of picking an alternative csn:
> nsds5ReplicaIgnoreMissingChange
> with potential values "once", "always".
>
> where "once" just tries to kickstart replication by using another csn and
> "always" changes the default behaviour.
>
>
> On 05/11/2017 06:53 PM, Goran Marik wrote:
>> Hi,
>>
>> After an upgrade to CentOS 7.3.1611 with "yum update", we started seeing the
>> following messages in the logs:
>> “””
>> May 9 21:58:28 inf01 ns-slapd[4323]: [09/May/2017:21:58:28.519724479 +]
>> NSMMReplicationPlugin - changelog program -
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389): CSN
>> 576b34e8000a050f not found, we aren't as up to date, or we purged
>> May 9 21:58:28 inf01 ns-slapd[4323]: [09/May/2017:21:58:28.550459233 +]
>> NSMMReplicationPlugin -
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389): Data
>> required to update replica has been purged from the changelog. The replica
>> must be reinitialized.
>> May 9 21:58:32 inf01 ns-slapd[4323]: [09/May/2017:21:58:32.588245476 +]
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389) -
>> Can't locate CSN 576b34e8000a050f in the changelog (DB rc=-30988). If
>> replication stops, the consumer may need to be reinitialized.
>> May 9 21:58:32 inf01 ns-slapd[4323]: [09/May/2017:21:58:32.611400689 +]
>> NSMMReplicationPlugin - changelog program -
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389): CSN
>> 576b34e8000a050f not found, we aren't as up to date, or we purged
>> May 9 21:58:32 inf01 ns-slapd[4323]: [09/May/2017:21:58:32.642226385 +]
>> NSMMReplicationPlugin -
>> agmt="cn=cloneAgreement1-inf02.dev.ecobee.com-pki-tomcat" (inf02:389): Data
>> required to update replica has be