[email protected] wrote:
> --On Wednesday, May 16, 2012 10:27 PM +0000 [email protected] wrote:
>
>> Full_Name: Quanah Gibson-Mount
>> Version: 2.4.31
>> OS: Linux 2.6
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (75.108.184.39)
>
> We can see that the script turning it into a master ran here:
>
> Thu May 17 16:05:46 2012 *** Running as zimbra user:
> /opt/zimbra/libexec/zmldapenable-mmr -s 2 -m
> ldap://zre-ldap002.eng.vmware.com:389/
>
> so 16:05:46
>
> In the accesslog, we see:
>
> dn: cn=accesslog
> objectClass: auditContainer
> cn: accesslog
> structuralObjectClass: auditContainer
> contextCSN: 20120517225152.913667Z#000000#000#000000
> contextCSN: 20120517230823.615364Z#000000#001#000000
> contextCSN: 20120517230546.409118Z#000000#002#000000
>
> dn: reqStart=20120517230546.000019Z,cn=accesslog
> objectClass: auditAdd
> structuralObjectClass: auditAdd
> reqStart: 20120517230546.000019Z
> reqEnd: 20120517230546.000020Z
> reqType: add
> reqSession: 100
> reqAuthzID: cn=config
> reqDN: cn=zimbra
> reqResult: 0
> reqMod: objectClass:+ organizationalRole
> reqMod: description:+ Zimbra Systems Application Data
> reqMod: cn:+ zimbra
> reqMod: structuralObjectClass:+ organizationalRole
> reqMod: entryUUID:+ 40f78bea-34be-1031-8a5d-e1466f667e19
> reqMod: creatorsName:+ cn=config
> reqMod: createTimestamp:+ 20120517224907Z
> reqMod: entryCSN:+ 20120517224907.221672Z#000000#000#000000
> reqMod: modifiersName:+ cn=config
> reqMod: modifyTimestamp:+ 20120517224907Z
> reqEntryUUID: 40f78bea-34be-1031-8a5d-e1466f667e19
> entryUUID: 948929e2-34c0-1031-9a14-c93bd10ff0f2
> creatorsName: cn=config
> createTimestamp: 20120517224907Z
> entryCSN: 20120517224907.221672Z#000000#000#000000
> modifiersName: cn=config
> modifyTimestamp: 20120517224907Z
>
> so it is tracking "000" as a third master? This seems to be why the
> original server (which was 000 before being promoted to 001) replicates
> these entries back to itself.

The loop is caused by the patch to ITS#6872, which considers a consumer out of date whenever the number of CSNs in its sync request doesn't match the number known to the provider.

The data here is basically invalid: server1 has entries generated using SID=0 but it has no contextCSN value with SID=0. It only sent SID=1 and SID=2 in its sync request. Server2, which just updated from server1, has a contextCSN for SID=0 in addition to 1 and 2 (and that's all correct). Server1 should have always had a contextCSN value for SID=0 but doesn't.

This problem would not occur if server1 were first converted from standalone into a single master, i.e., load syncprov on it, let it scan the DB and generate the first SID=0 contextCSN, before turning it into an MMR node.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
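
To make the mismatch described above concrete, here is a minimal Python sketch of the count comparison. It is a toy model, not slapd's code: the function names are hypothetical, and the CSN values are borrowed from the quoted accesslog purely as illustrative inputs. It assumes the usual four-field CSN layout (timestamp#count#SID#mod) with the SID written in hex.

```python
def csn_sid(csn: str) -> int:
    """Return the server ID (third '#'-separated field, hex) of a CSN."""
    fields = csn.split("#")
    if len(fields) != 4:
        raise ValueError(f"malformed CSN: {csn!r}")
    return int(fields[2], 16)


def consumer_looks_stale(provider_csns, consumer_csns) -> bool:
    """Toy version of the check described above: a differing number of
    CSNs is enough to treat the consumer as out of date."""
    return len(consumer_csns) != len(provider_csns)


def missing_sids(provider_csns, consumer_csns):
    """SIDs the provider knows about that the consumer did not send."""
    known = {csn_sid(c) for c in provider_csns}
    sent = {csn_sid(c) for c in consumer_csns}
    return sorted(known - sent)


# Illustrative values only: the provider (server2 in this scenario) holds
# contextCSNs for SID 0, 1 and 2; the consumer (server1) sends only 1 and 2.
provider = [
    "20120517225152.913667Z#000000#000#000000",
    "20120517230823.615364Z#000000#001#000000",
    "20120517230546.409118Z#000000#002#000000",
]
consumer = provider[1:]

print(consumer_looks_stale(provider, consumer))
print(missing_sids(provider, consumer))
```

With these inputs the consumer never sends a CSN for SID 0, so the counts can never match and it keeps being treated as out of date, which is consistent with the loop reported here.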
