Ludwig, > you still only grep the replication connection, but before being replicated > the entry has to be added by some client connection, can you get all > references to the entry ? > the log snippet you provide shows also csns with tag=103, which indicate a > MOD, are these MODs for the added entries ? or other mods ?
..... I can't believe I did that! Ok, so the logs have been rotated (I didn't think to adjust logrotate..), so there aren't any logs to peruse for the case I've presented so far. However, I was able to reproduce the errors by "bulk" deleting 39 DNS entries, and only the MASTER reported "replica_generate_next_csn" entries. Given the size of the logs, I think it would be pointless to do any kind of sanitization. I'll go ahead and gzip them for you and email you off-list. I've labeled them as MASTER and REPLICA. John DeSantis 2016-08-19 9:18 GMT-04:00 Ludwig Krispenz <[email protected]>: > > On 08/18/2016 05:28 PM, John Desantis wrote: >> >> Ludwig, >> >>> unfortunately this is not enough to determine what is going on. The >>> intersting generated/used csn is only logged in the >>> corresponding RESULT message and these are only the replication >>> connections, >>> it would be necessary to see the >>> original ADD operation, was it added once or twice by a client ? >>> you could pick one entry eg server-6-3-sp and grep for all references in >>> the >>> access logs of both servers (maybe there are mods as well) and then >>> get also get the RESULT line for the ops found >> >> Here are the updated log snippets looking for ADD and RESULT: > > you still only grep the replication connection, but before being replicated > the entry has to be added by some client connection, can you get all > references to the entry ? > the log snippet you provide shows also csns with tag=103, which indicate a > MOD, are these MODs for the added entries ? or other mods ? > >> >> PROD:11:20:13-root@REPLICA:/var/log/dirsrv/slapd-DOM-DOM-DOM >> # grep -E '17/Aug/2016:13:50:4.*conn=602.*(RESULT|ADD)' access.2016081* >> access.20160817-124811:[17/Aug/2016:13:50:41 -0400] conn=602 op=4139 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-124811:[17/Aug/2016:13:50:41 -0400] conn=602 op=4140 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-124811:[17/Aug/2016:13:50:42 -0400] conn=602 op=4141 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-124811:[17/Aug/2016:13:50:42 -0400] conn=602 op=4142 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-124811:[17/Aug/2016:13:50:42 -0400] conn=602 op=4143 >> ADD >> dn="idnsname=server-6-3-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >> access.20160817-124811:[17/Aug/2016:13:50:42 -0400] conn=602 op=4143 >> RESULT err=0 tag=105 nentries=0 etime=0 csn=57b4a4bb000300040000 >> access.20160817-124811:[17/Aug/2016:13:50:44 -0400] conn=602 op=4144 >> RESULT err=0 tag=103 nentries=0 etime=0 csn=57b4a4bb000400040000 >> access.20160817-124811:[17/Aug/2016:13:50:44 -0400] conn=602 op=4145 >> RESULT err=0 tag=103 nentries=0 etime=0 >> access.20160817-124811:[17/Aug/2016:13:50:47 -0400] conn=602 op=4146 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-124811:[17/Aug/2016:13:50:47 -0400] conn=602 op=4147 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-124811:[17/Aug/2016:13:50:47 -0400] conn=602 op=4148 >> ADD >> dn="idnsname=server-6-4-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >> access.20160817-124811:[17/Aug/2016:13:50:47 -0400] conn=602 op=4148 >> RESULT err=0 tag=105 nentries=0 etime=0 csn=57b4a4bb000800040000 >> access.20160817-124811:[17/Aug/2016:13:50:49 -0400] conn=602 op=4149 >> RESULT err=0 tag=103 nentries=0 etime=0 csn=57b4a4bc000100040000 >> access.20160817-124811:[17/Aug/2016:13:50:49 -0400] conn=602 op=4150 >> RESULT err=0 tag=103 nentries=0 etime=0 >> access.20160817-124811:[17/Aug/2016:13:50:49 -0400] conn=602 op=4151 >> ADD >> dn="idnsname=server-6-5-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >> access.20160817-124811:[17/Aug/2016:13:50:49 -0400] conn=602 op=4151 >> RESULT err=0 tag=105 nentries=0 etime=0 csn=57b4a4c1000500040000 >> access.20160817-124811:[17/Aug/2016:13:50:49 -0400] conn=602 op=4152 >> RESULT err=0 tag=103 nentries=0 etime=0 csn=57b4a4c1000600040000 >> >> PROD:11:19:54-root@MASTER:/var/log/dirsrv/slapd-DOM-DOM-DOM >> # grep -E '17/Aug/2016:13:50:4.*conn=1395.*(RESULT|ADD)' access.2016081* >> access.20160817-111940:[17/Aug/2016:13:50:41 -0400] conn=1395 op=4148 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-111940:[17/Aug/2016:13:50:41 -0400] conn=1395 op=4149 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-111940:[17/Aug/2016:13:50:41 -0400] conn=1395 op=4150 >> RESULT err=0 tag=103 nentries=0 etime=0 csn=57b4a4b9000500160000 >> access.20160817-111940:[17/Aug/2016:13:50:43 -0400] conn=1395 op=4151 >> ADD >> dn="idnsname=server-6-3-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >> access.20160817-111940:[17/Aug/2016:13:50:43 -0400] conn=1395 op=4151 >> RESULT err=0 tag=105 nentries=0 etime=0 >> access.20160817-111940:[17/Aug/2016:13:50:44 -0400] conn=1395 op=4152 >> RESULT err=0 tag=103 nentries=0 etime=1 csn=57b4a4bc000000160000 >> access.20160817-111940:[17/Aug/2016:13:50:46 -0400] conn=1395 op=4153 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-111940:[17/Aug/2016:13:50:47 -0400] conn=1395 op=4154 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-111940:[17/Aug/2016:13:50:47 -0400] conn=1395 op=4155 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-111940:[17/Aug/2016:13:50:47 -0400] conn=1395 op=4156 >> RESULT err=0 tag=120 nentries=0 etime=0 >> access.20160817-111940:[17/Aug/2016:13:50:48 -0400] conn=1395 op=4157 >> RESULT err=0 tag=103 nentries=0 etime=1 csn=57b4a4c1000100160000 >> access.20160817-111940:[17/Aug/2016:13:50:49 -0400] conn=1395 op=4158 >> ADD >> dn="idnsname=server-6-5-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >> access.20160817-111940:[17/Aug/2016:13:50:49 -0400] conn=1395 op=4158 >> RESULT err=0 tag=105 nentries=0 etime=0 >> access.20160817-111940:[17/Aug/2016:13:50:49 -0400] conn=1395 op=4159 >> RESULT err=0 tag=103 nentries=0 etime=0 >> access.20160817-111940:[17/Aug/2016:13:50:49 -0400] conn=1395 op=4160 >> RESULT err=0 tag=103 nentries=0 etime=0 csn=57b4a4c3000000160000 >> >> I'm positive that I was the only one performing DNS updates during >> this time, and I was only using 1 console. >> >> Thanks, >> John DeSantis >> >> >> 2016-08-18 10:09 GMT-04:00 Ludwig Krispenz <[email protected]>: >>> >>> On 08/18/2016 03:15 PM, John Desantis wrote: >>>> >>>> Ludwig, >>>> >>>> Thank you for your response! >>>> >>>>> This is a normal scenario, but you could check if the simultaneous >>>>> updates >>>>> on 4 and 16 are intentional. >>>> >>>> In regards to the simultaneous updates, the only items I have noted so >>>> far >>>> are: >>>> >>>> * The time sync between the master (4) and replica (16) was off by >>>> about 1-2 seconds, with the latter being ahead; >>> >>> yes, this happens, but the replication protocol tries to handle this, in >>> a >>> replication session the supplier and consumer >>> exchange their ruvs and if the time differs the csn state generator is >>> updated with a local or remote offset so that the >>> generated time is always based on the most advanced clock - on all >>> servers. >>> And even if you adjust the system time, the csn >>> time will never go back. >>>> >>>> * There are continual log entries referencing >>>> "replication-multimaster-extop" and "Netscape Replication End Session" >>>> in the dirsrv "access" logs, and during one of the manifestations of >>>> "replica_generate_next_csn", I found this: >>>> >>>> PROD:08:46:08-root@REPLICA:/var/log/dirsrv/slapd-DOM-DOM-DOM >>>> # grep -E '17/Aug/2016:13:50:4.*conn=602.*ADD' access.2016081* >>>> access.20160817-124811:[17/Aug/2016:13:50:42 -0400] conn=602 op=4143 >>>> ADD >>>> >>>> dn="idnsname=server-6-3-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >>>> access.20160817-124811:[17/Aug/2016:13:50:47 -0400] conn=602 op=4148 >>>> ADD >>>> >>>> dn="idnsname=server-6-4-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >>>> access.20160817-124811:[17/Aug/2016:13:50:49 -0400] conn=602 op=4151 >>>> ADD >>>> >>>> dn="idnsname=server-6-5-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >>>> >>>> PROD:08:47:44-root@MASTER:/var/log/dirsrv/slapd-DOM-DOM-DOM >>>> # grep -E '17/Aug/2016:13:50:4.*conn=1395.*ADD' access.2016081* >>>> access.20160817-111940:[17/Aug/2016:13:50:43 -0400] conn=1395 op=4151 >>>> ADD >>>> >>>> dn="idnsname=server-6-3-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >>>> access.20160817-111940:[17/Aug/2016:13:50:49 -0400] conn=1395 op=4158 >>>> ADD >>>> >>>> dn="idnsname=server-6-5-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom" >>>> >>>> It looks like the entries for server-6-3-sp and 6-5-sp were referenced >>>> twice. Do you think that the time being off by 1-2 seconds between >>>> the master and replica could be the issue? The connection 602 is the >>>> replication between the replica and master, and the connection 1395 is >>>> the replication between the master and replica. >>> >>> unfortunately this is not enough to determine what is going on. The >>> intersting generated/used csn is only logged in the >>> corresponding RESULT message and these are only the replication >>> connections, >>> it would be necessary to see the >>> original ADD operation, was it added once or twice by a client ? >>> you could pick one entry eg server-6-3-sp and grep for all references in >>> the >>> access logs of both servers (maybe there are mods as well) and then >>> get also get the RESULT line for the ops found >>>> >>>> >>>> Since I know these operations were performed using the console via a >>>> for loop 'ipa dnsrecord-add dom.dom.dom server-6-$i-sp >>>> --a-rec=10.250.12.$i' on one of our login nodes, do you think that >>>> specifying an _srv_ record in the DOMAIN configuration with the >>>> address of the master server, e.g.: ipa_server = _srv_, >>>> MASTER.dom.dom.dom could be the issue (coupled with the time syncing)? >>>> >>>> I know that these questions are probably leaning more towards the >>>> 389ds team, so feel free to pass me over to them if need be. >>> >>> I think I can address the ds related questions, but I don't know about >>> console and dns to assess if the behaviour is normal >>> >>>> Again, thank you very much for responding! >>>> >>>> John DeSantis >>>> >>>> 2016-08-18 4:14 GMT-04:00 Ludwig Krispenz <[email protected]>: >>>>> >>>>> On 08/17/2016 08:54 PM, John Desantis wrote: >>>>>> >>>>>> Hello all, >>>>>> >>>>>> We've been re-using old host names and IP addresses for a new >>>>>> deployment of nodes, and recently I've been seeing the messages pasted >>>>>> below in the slapd-DC.DC.DC "error" log on our nodes. >>>>>> >>>>>> [17/Aug/2016:10:30:30 -0400] - replica_generate_next_csn: >>>>>> opcsn=57b475cd001200040000 <= basecsn=57b475cf000000160000, adjusted >>>>>> opcsn=57b475cf000100040000 >>>>>> [17/Aug/2016:11:09:44 -0400] - replica_generate_next_csn: >>>>>> opcsn=57b47f00000200040000 <= basecsn=57b47f00000200160000, adjusted >>>>>> opcsn=57b47f00000300040000 >>>>>> [17/Aug/2016:11:09:44 -0400] - replica_generate_next_csn: >>>>>> opcsn=57b47f00000400040000 <= basecsn=57b47f00000400160000, adjusted >>>>>> opcsn=57b47f00000500040000 >>>>>> [17/Aug/2016:11:10:33 -0400] - replica_generate_next_csn: >>>>>> opcsn=57b47f2f001000040000 <= basecsn=57b47f30000200160000, adjusted >>>>>> opcsn=57b47f30000300040000 >>>>>> [17/Aug/2016:13:50:45 -0400] - replica_generate_next_csn: >>>>>> opcsn=57b4a4bb000900040000 <= basecsn=57b4a4bc000000160000, adjusted >>>>>> opcsn=57b4a4bc000100040000 >>>>>> [17/Aug/2016:13:52:54 -0400] - replica_generate_next_csn: >>>>>> opcsn=57b4a53e000a00040000 <= basecsn=57b4a53f000000160000, adjusted >>>>>> opcsn=57b4a53f000100040000 >>>>>> [17/Aug/2016:13:53:15 -0400] - replica_generate_next_csn: >>>>>> opcsn=57b4a552000700040000 <= basecsn=57b4a553000000160000, adjusted >>>>>> opcsn=57b4a553000100040000 >>>>>> [17/Aug/2016:13:53:32 -0400] - replica_generate_next_csn: >>>>>> opcsn=57b4a562000900040000 <= basecsn=57b4a564000000160000, adjusted >>>>>> opcsn=57b4a564000100040000 >>>>> >>>>> Each modification (add/del/mod) gets a csn assignged used in >>>>> replication >>>>> update resolution. And each assigned csn has to newer than an existing >>>>> one. >>>>> The messages you see is from code that double checks that the entry >>>>> doesn't >>>>> have already a lareg csn - and adjusts it. >>>>> The logs indicate that entries are more or less concurrently updated on >>>>> replica 4 and 16, and the updates from16 are received while processing >>>>> the >>>>> updates on 4. >>>>> This is a normal scenario, but you could check if the simultaneous >>>>> updates >>>>> on 4 and 16 are intentional. >>>>> >>>>>> They seem to only occur when updating DNS entries, whether on the >>>>>> console or via the GUI (tail -f'ing the log). >>>>>> >>>>>> A search in this mailing-list returns nothing, but a message is found >>>>>> on the 389-ds list [1]; it seems to suggest that the messages aren't >>>>>> fatal and are purely informational, yet if they are occurring >>>>>> constantly that there could be a problem with the replication >>>>>> algorithm and/or deployment. >>>>>> >>>>>> We're using ipa-server 3.0.0-47 and 389-ds 1.2.11.15-60. Nothing has >>>>>> changed on the deployment side of things, and I don't recall seeing >>>>>> this message before. >>>>>> >>>>>> I'm wondering if it's safe to disregard these messages due to the >>>>>> re-use of the entries, or if something else should be looked into. >>>>>> >>>>>> Thank you, >>>>>> John DeSantis >>>>>> >>>>>> [1] https://fedorahosted.org/389/ticket/47959 >>>>>> >>>>> -- >>>>> Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn, >>>>> Commercial register: Amtsgericht Muenchen, HRB 153243, >>>>> Managing Directors: Charles Cachera, Michael Cunningham, Michael >>>>> O'Neill, >>>>> Eric Shander >>>>> >>>>> -- >>>>> Manage your subscription for the Freeipa-users mailing list: >>>>> https://www.redhat.com/mailman/listinfo/freeipa-users >>>>> Go to http://freeipa.org for more info on the project >>> >>> >>> -- >>> Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn, >>> Commercial register: Amtsgericht Muenchen, HRB 153243, >>> Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, >>> Eric Shander >>> >>> -- >>> Manage your subscription for the Freeipa-users mailing list: >>> https://www.redhat.com/mailman/listinfo/freeipa-users >>> Go to http://freeipa.org for more info on the project > > > -- > Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn, > Commercial register: Amtsgericht Muenchen, HRB 153243, > Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, > Eric Shander > > -- > Manage your subscription for the Freeipa-users mailing list: > https://www.redhat.com/mailman/listinfo/freeipa-users > Go to http://freeipa.org for more info on the project -- Manage your subscription for the Freeipa-users mailing list: https://www.redhat.com/mailman/listinfo/freeipa-users Go to http://freeipa.org for more info on the project
