https://bugs.openldap.org/show_bug.cgi?id=9282

--- Comment #8 from Ondřej Kuzník <[email protected]> ---
On Thu, Jul 02, 2020 at 01:19:40PM +0000, [email protected] wrote:
> --- Comment #7 from Howard Chu <[email protected]> ---
> (In reply to Ondřej Kuzník from comment #6)
>> Thanks for the reproducer script.
>> 
>> This is due to
>> https://git.openldap.org/openldap/openldap/-/blob/master/servers/slapd/syncrepl.c#L1638
>> causing A to skip the present cull.
>> 
>> Based on the git history, this was introduced to deal with ITS#5470, but that
>> seems wrong. If the number of SIDs in the cookie differs from what we
>> requested, then either:
>> - a SID disappeared from the set we received, which sounds like what
>> ITS#5470 is about? But slapd doesn't really allow this at the moment (it
>> will say consumer is newer than provider), so that shouldn't happen
> 
> A SID can't disappear. They tend to stay in the contextCSN forever. (This is
> actually another problem, nodes that are converted from single-provider to
> multi-provider generally still have a SID 0 CSN, which is always ancient
> relative to the active SIDs. Routines that check for oldest CSN to still exist
> in the DB lead to wasteful checks because of that. Right now all you can do is
> use manage privs and delete the obsolete CSN.)

Yeah, and it would not be so wasteful if we could query the database for
the oldest/newest entry with a given SID in entryCSN. Removing a SID
from the set is always going to be a manual operation unless we can
coordinate with all provider and consumer nodes somehow.
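
To make the per-SID query concrete, here is a minimal sketch in Python, not slapd code. It assumes entryCSN values in the usual timestamp#count#sid#mod form with hexadecimal counters, and that we simply have a list of them in memory; csn_range_by_sid and parse_csn are hypothetical helper names, and no backend currently offers such a query.

```python
def parse_csn(csn):
    """Split a CSN string into (timestamp, change count, SID, mod number)."""
    timestamp, count, sid, mod = csn.split("#")
    return timestamp, int(count, 16), int(sid, 16), int(mod, 16)


def csn_range_by_sid(entry_csns):
    """Return {sid: (oldest_csn, newest_csn)} for the given entryCSN values.

    Hypothetical helper: it scans every value, which is exactly the cost
    an indexed per-SID lookup in the database would avoid.
    """
    ranges = {}
    for csn in entry_csns:
        timestamp, count, sid, mod = parse_csn(csn)
        # Within one SID, order by time, then change count, then mod number.
        key = (timestamp, count, mod)
        if sid not in ranges:
            ranges[sid] = ((key, csn), (key, csn))
            continue
        oldest, newest = ranges[sid]
        if key < oldest[0]:
            oldest = (key, csn)
        if key > newest[0]:
            newest = (key, csn)
        ranges[sid] = (oldest, newest)
    return {sid: (lo[1], hi[1]) for sid, (lo, hi) in ranges.items()}
```

With something like this available per SID, a stale SID 0 CSN would only cost one lookup instead of a scan.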

>> - a SID is added to the set by the provider, like here. This could be due to
>> a delete (like here) and that delete has to be replicated - that is the
>> point of running syncrepl_del_nonpresent
> 
> Yes, the problem that was being addressed is that if the local node knows
> about more SIDs than the remote node, then the incoming present list from the
> remote node can't be trusted. Doing a del_nonpresent could delete a lot of
> entries that the remote node doesn't know about, but exist legitimately on the
> local node.

The scenario I describe here is one where we start a search with a cookie
containing only SIDs {1, 2} but finish the present phase by receiving a
cookie with SIDs {1, 2, 3}. Accepting that cookie implies we have to
process the (implied) deletes too, or we have desynced.

If, in the meantime, we added entries with a SID of 4, those are not
part of the original cookie and should not be deleted, that's for sure.
I think we do the right thing already or are close to doing so.
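
As a rough illustration of that reasoning (a sketch in Python, not what syncrepl_del_nonpresent actually does), the cull could be limited to entries whose entryCSN SID is covered by the cookie we received. The function name and the dict-of-UUIDs input are assumptions made for the example, and it looks at SIDs only, whereas a real check would also have to compare the CSN values themselves.

```python
def nonpresent_candidates(received_sids, local_entries, present_uuids):
    """Pick entries that the present phase implies were deleted.

    received_sids: SIDs in the cookie that ended the present phase
    local_entries: {entry UUID: SID taken from that entry's entryCSN}
    present_uuids: UUIDs the provider reported as present
    All three are hypothetical inputs for illustration.
    """
    to_delete = set()
    for uuid, sid in local_entries.items():
        if uuid in present_uuids:
            continue  # provider still has it
        if sid not in received_sids:
            continue  # provider never claimed knowledge of this SID, keep it
        to_delete.add(uuid)
    return to_delete


# Requested with {1, 2}, present phase ended with a cookie for {1, 2, 3}.
local = {"a": 1, "b": 2, "c": 3, "d": 4}
deleted = nonpresent_candidates({1, 2, 3}, local, present_uuids={"a"})
```

Here "b" and "c" are culled, while "d" (written locally under SID 4) survives because the received cookie says nothing about SID 4.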

> I think a proper fix would require a change in the syncrepl protocol
> sequencing. E.g., two nodes should refresh from each other with all of their
> new Adds/Modifies first, and once those changes have been settled, then they
> can perform a present cross-check. This would also require saving some
> intermediate cookie state in case the full sequence gets interrupted.
> 
> Or, put in another way, there needs to be a separately tracked
> contextDeleteCSN.

That's ITS#8125 work, I should get back to that eventually.
