Rein Tollevik wrote:
> I've been trying to figure out why syncrepl used on a backend that is
> subordinate to a glue database with the syncprov overlay should save the
> contextCSN in the suffix of the glue database rather than the suffix of
> the backend where syncrepl is used. But all I come up with are reasons
> why this should not be the case. So, unless anyone can enlighten me as
> to what I'm missing, I suggest that this be changed.
>
> The problem with the current design is that it makes it impossible to
> reliably replicate more than one subordinate db from the same remote
> server, as there are now race conditions where one of the subordinate
> backends could save an updated contextCSN value that is picked up by the
> other before it has finished its synchronization. An example of a
> configuration where more than one subordinate db replicated from the
> same server might be necessary is the central master described in my
> previous posting in
> http://www.openldap.org/lists/openldap-devel/200806/msg00041.html
>
> My idea as to how this race condition could be verified was to add
> enough entries to one of the backends (while the consumer was stopped)
> to make it possible to restart the consumer after the first backend had
> saved the updated contextCSN but before the second had finished its
> synchronization. But I was able to produce it by simply adding or
> deleting an entry in one of the backends before starting the consumer.
> Far too often, the backend without any changes was able to pick up and
> save the updated contextCSN from the producer before syncrepl on the
> second backend fetched its initial value. I.e., it started with an
> updated contextCSN and didn't receive the changes that had taken place
> on the producer. If syncrepl stored the values in the suffix of their
> own database then they wouldn't interfere with each other like this.

OK.

> There is a similar problem in syncprov, as it must use the lowest
> contextCSN value (with a given sid) saved by the syncrepl backends
> configured within the subtree where syncprov is used. But to do that it
> also needs to distinguish the contextCSN values of each syncrepl
> backend, which it can't do when they all save them in the glue suffix.
> This also implies that syncprov must ignore contextCSN updates from
> syncrepl until all syncrepl backends have saved a value, and that
> syncprov on the provider must send newCookie sync info messages when it
> updates its contextCSN value when the changed entry isn't being
> replicated to a consumer. I.e., as outlined in the message referred to
> above.

Then (at least) at server startup time syncprov must retrieve the
contextCSNs from all of its subordinate DBs. Perhaps a subtree search
with filter "(contextCSN=*)" would suffice; this would of course require
setting a presence index on this attribute to perform reasonably. (Or we
can add a glue function to return a list of the subordinate suffixes or
DBs...)

By the way, please use "subordinate database" and "superior database"
when discussing these things; "glue database" is too ambiguous.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
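
For illustration only, the startup-time step discussed above (collect the
contextCSN values stored by each subordinate database, then keep the
lowest value per SID) might be sketched roughly as follows. This is not
slapd code: the function names, the dictionary layout, and the example
suffixes are hypothetical, and the sketch assumes CSN strings of the form
"timestamp#count#sid#mod" with fixed-width fields, so that plain string
comparison orders them.

```python
from typing import Dict, List


def csn_sid(csn: str) -> str:
    """Return the SID field of a CSN, assuming 'timestamp#count#sid#mod'."""
    parts = csn.split("#")
    if len(parts) != 4:
        raise ValueError(f"unexpected CSN format: {csn!r}")
    return parts[2]


def all_saved(per_db: Dict[str, List[str]]) -> bool:
    """True once every subordinate database has saved at least one contextCSN."""
    return all(len(csns) > 0 for csns in per_db.values())


def lowest_csn_per_sid(per_db: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each SID to the lowest contextCSN found in any subordinate database.

    per_db maps a subordinate suffix (hypothetical layout) to the contextCSN
    values read from that suffix, e.g. via a "(contextCSN=*)" search.
    """
    lowest: Dict[str, str] = {}
    for csns in per_db.values():
        for csn in csns:
            sid = csn_sid(csn)
            # Fixed-width fields are assumed, so string order matches CSN order.
            if sid not in lowest or csn < lowest[sid]:
                lowest[sid] = csn
    return lowest


# Hypothetical example: two subordinate databases replicated from SID 001.
example = {
    "ou=a,dc=example,dc=com": ["20080623101520.000000Z#000000#001#000000"],
    "ou=b,dc=example,dc=com": ["20080623101000.000000Z#000000#001#000000"],
}
if all_saved(example):
    provider_csns = lowest_csn_per_sid(example)
```

Keeping the values keyed by subordinate suffix is what makes the
per-database distinction possible; with a single shared value in the
superior suffix there is nothing to take the minimum over.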