https://bugs.openldap.org/show_bug.cgi?id=9282
--- Comment #8 from Ondřej Kuzník <[email protected]> ---
On Thu, Jul 02, 2020 at 01:19:40PM +0000, [email protected] wrote:
> --- Comment #7 from Howard Chu <[email protected]> ---
> (In reply to Ondřej Kuzník from comment #6)
>> Thanks for the reproducer script.
>>
>> This is due to
>> https://git.openldap.org/openldap/openldap/-/blob/master/servers/slapd/syncrepl.c#L1638
>> causing A to skip the present cull.
>>
>> Based on the git history, this was introduced to deal with ITS#5470 but that
>> seems wrong, if the number of SIDs in the cookie differs from what we
>> requested then either:
>> - a SID disappeared from the set we received, which sounds like what
>>   ITS#5470 is about? But slapd doesn't really allow this at the moment (as it
>>   will say consumer is newer than provider) so that shouldn't happen
>
> A SID can't disappear. They tend to stay in the contextCSN forever. (This is
> actually another problem, nodes that are converted from single-provider to
> multi-provider generally still have a SID 0 CSN, which is always ancient
> relative to the active SIDs. Routines that check for oldest CSN to still exist
> in the DB lead to wasteful checks because of that. Right now all you can do is
> use manage privs and delete the obsolete CSN.)

Yeah, and it would not be so wasteful if we could query the database for
the oldest/newest entry with a given SID in entryCSN. Removing a SID
from the set is always going to be a manual operation unless we can
coordinate with all provider and consumer nodes somehow.

>> - a SID is added to the set by the provider, like here. This could be due to
>>   a delete (like here) and that delete has to be replicated - that is the
>>   point of running syncrepl_del_nonpresent
>
> Yes, the problem that was being addressed is that if the local node knows
> about more SIDs than the remote node, then the incoming present list from
> the remote node can't be trusted. Doing a del_nonpresent could delete a lot
> of entries that the remote node doesn't know about, but exist legitimately
> on the local node.

The scenario I describe here is if we start a search with a cookie
containing only SIDs {1, 2} but finish the present phase by receiving a
cookie with SIDs {1, 2, 3}. Accepting that cookie implies we have to
process the (implied) deletes too or we have desynced.

If, in the meantime, we added entries with a SID of 4, those are not
part of the original cookie and should not be deleted, that's for sure.
I think we do the right thing already or are close to doing so.

> I think a proper fix would require a change in the syncrepl protocol
> sequencing. E.g., two nodes should refresh from each other with all of their
> new Adds/Modifies first, and once those changes have been settled, then they
> can perform a present cross-check. This would also require saving some
> intermediate cookie state in case the full sequence gets interrupted.
>
> Or, put in another way, there needs to be a separately tracked
> contextDeleteCSN.

That's ITS#8125 work, I should get back to that eventually.
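
To make the {1, 2} -> {1, 2, 3} scenario concrete, here is a rough Python
sketch of a cull that spares entries the received cookie does not cover.
It is only an illustration of the reasoning above, not what syncrepl.c
does: the Entry type, entries_to_cull() and the integer CSNs are all
made up for the example, and treating "entryCSN newer than the cookie's
CSN for that SID" as out of scope is my assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    uuid: str   # stands in for entryUUID
    sid: int    # SID taken from the entry's entryCSN
    csn: int    # simplified, comparable stand-in for the entryCSN timestamp


def entries_to_cull(local, present_uuids, received_cookie):
    """Return UUIDs of local entries a present-phase cull would delete.

    received_cookie maps SID -> newest CSN covered by the cookie that
    ended the present phase (hypothetical representation).
    """
    doomed = []
    for entry in local:
        if entry.uuid in present_uuids:
            continue  # the provider listed it as present
        covered = received_cookie.get(entry.sid)
        if covered is None or entry.csn > covered:
            continue  # outside what the cookie describes (assumption): keep
        doomed.append(entry.uuid)
    return doomed


local = [
    Entry("a", sid=1, csn=10),
    Entry("b", sid=2, csn=12),  # deleted on the provider by a SID 3 write
    Entry("c", sid=4, csn=20),  # added locally while the refresh was running
]
received_cookie = {1: 10, 2: 12, 3: 15}  # SID 3 showed up during the refresh
present = {"a"}

print(entries_to_cull(local, present, received_cookie))
```

With these inputs only "b" is culled: the delete implied by the new SID 3
is processed, while the SID 4 entry survives because no cookie we have
seen says anything about SID 4.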
