[389-devel] Re: Replication after full online init

Ludwig Krispenz Wed, 06 Jul 2016 08:39:19 -0700

Hi Noriko,

I have a test scenario for modrdns that are not handled correctly during total init:

- have a database with n entries, n large enough that total init takes long enough to be able to apply an update while it is running

- add two entries
-- n+1: cn=child,$SUFFIX
-- n+2: cn=parent,$SUFFIX
both have the same parentid and n+2 will be replayed after n+1
- start total update
- do a modrdn
dn: cn=child,$SUFFIX
changetype: modrdn
newrdn: cn=child
newsuperior: cn=parent,$SUFFIX

now cn=child,cn=parent,$SUFFIX will be sent before its parent
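
The scenario above can be sketched as a toy simulation. This is not 389-ds code: the entry IDs, the dict layout, and the names make_entries, build_send_order and total_init are assumptions made up for illustration. It assumes the send order is fixed (by parent ID, then entry ID) before any entry is sent, and that a modrdn arrives while the init is running.

```python
# Toy model only: the entry IDs, dict layout, and function names are
# illustrative assumptions, not 389-ds internals.

def make_entries(n):
    """Suffix (ID 1), n filler entries, then child (n+2) and parent (n+3)."""
    entries = {1: {"dn": "$SUFFIX", "parentid": 0}}
    for eid in range(2, n + 2):
        entries[eid] = {"dn": f"cn=user{eid},$SUFFIX", "parentid": 1}
    child, parent = n + 2, n + 3
    entries[child] = {"dn": "cn=child,$SUFFIX", "parentid": 1}
    entries[parent] = {"dn": "cn=parent,$SUFFIX", "parentid": 1}
    return entries, child, parent

def build_send_order(entries):
    """Fix the send order up front: by parent ID, then by entry ID."""
    return sorted(entries, key=lambda eid: (entries[eid]["parentid"], eid))

def total_init(entries, order, modrdn=None):
    """Send entries in the precomputed order; optionally apply a modrdn mid-run.

    Returns the IDs of entries that were sent before their parent.
    """
    received = {0}  # 0 stands for the virtual root above the suffix
    orphans = []
    for sent, eid in enumerate(order):
        if modrdn is not None and sent == modrdn["after_sent"]:
            # The supplier still accepts changes while the init is running.
            child, new_parent = modrdn["child"], modrdn["new_parent"]
            entries[child]["parentid"] = new_parent
            rdn = entries[child]["dn"].split(",", 1)[0]
            entries[child]["dn"] = rdn + "," + entries[new_parent]["dn"]
        if entries[eid]["parentid"] not in received:
            orphans.append(eid)
        received.add(eid)
    return orphans

entries, child, parent = make_entries(4)
order = build_send_order(entries)
orphans = total_init(entries, order,
                     {"after_sent": 2, "child": child, "new_parent": parent})
print([entries[e]["dn"] for e in orphans])
# prints ['cn=child,cn=parent,$SUFFIX']
```

Without the modrdn argument the same run reports no orphans, which is why the problem only shows up when the update lands inside the init window.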

I do not know how we can fix this. I think it is a corner case, but we should keep it in mind.


Ludwig

On 06/30/2016 11:53 PM, Noriko Hosoi wrote:

On 06/30/2016 12:45 AM, Ludwig Krispenz wrote:
Hi William,
the reason that after a total init the consumer does not have the latest state of the supplier RUV and is receiving updates based on the RUV at start of the total init is independent of the modrdn problem. When a supplier is performing a total init it is still accepting changes, the total init can take a while and there are scenarios where an entry which is already sent is updated before total init finishes. We cannot lose these changes.
OK... Then, RUV needs to be created at the time when the supplier starts online init?
The test case would be something like this?
1. run online init on the supplier.
2. do some operation like move entries against the supplier while the online init is still running on the consumer.
3. do some operation which depends upon the previous operation done in the step 2.
4. check the consumer is healthy or not.
Isn't it a timestamp issue, i.e. from which operation changes should be replayed after the total update? Regardless of how 48755 is fixed, unless the step 2 operation(s) are replayed after the online init is done, the consumer could get broken/inconsistent?
Thanks,
--noriko
Therefore the update resolution / entry state resolution on the consumer side has to handle this: ignore changes already applied and apply new changes. And it handles it; if there are bugs they have to be fixed.
Now, I am no longer sure if the fix for 48755 correctly handles all modrdns received after the id list was prepared; the parentid might change while the total init is in progress.
This brings up my original suggestion to handle the modrdn problems also on the consumer side.
Ludwig
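
The "ignore changes already applied and apply new changes" rule can be sketched as a CSN comparison. This is a deliberately simplified model, not the actual update resolution code: the Entry class, the per-attribute CSN map, the function name apply_replayed_mod, and the CSN modeled as a plain (timestamp, seqnum, replica id) tuple are all assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    # Hypothetical consumer-side entry: values plus the CSN of the last
    # change applied to each attribute.
    attrs: dict = field(default_factory=dict)
    attr_csn: dict = field(default_factory=dict)

def apply_replayed_mod(entry, attr, value, csn):
    """Apply a replace-style change only if it is newer than the entry state.

    Returns True if the change was applied, False if it was ignored.
    """
    last = entry.attr_csn.get(attr)
    if last is not None and csn <= last:
        # Already reflected in the entry as sent by the total init.
        return False
    entry.attrs[attr] = value
    entry.attr_csn[attr] = csn
    return True

# The entry arrived via total init already carrying the change with CSN (100, 0, 1).
e = Entry(attrs={"description": "v1"}, attr_csn={"description": (100, 0, 1)})

replayed_old = apply_replayed_mod(e, "description", "v1", (100, 0, 1))
replayed_new = apply_replayed_mod(e, "description", "v2", (101, 0, 1))
```

In this model the first replay is a no-op and the second one is applied, so replaying from the RUV taken at the start of the init is safe as long as every operation type is resolved this way.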

On 06/30/2016 02:34 AM, William Brown wrote:
Hi,

Now that https://fedorahosted.org/389/ticket/48755 is merged, I would
like to discuss the way we handle CSN with relation to this master. As
I'm not an expert on this topic, I want to get the input of everyone
about this.

Following this document:
http://www.port389.org/docs/389ds/design/changelog-processing-in-repl-state-sending-updates.html


As I understand it, after a full online init, the replica that consumed
the changes does not set its CSN to match the CSN of the master that
sent the changes.

As a result, after the online init, this causes a number of changes to
be replicated from the sending master to the consumer. These are ignored
by the URP, and we continue.

However, in a number of cases these are *not* ignored, and have caused
us some bugs in replication in the past. We also have some failing
changes that are skipped, which could in certain circumstances lead to
inconsistency in replicas. We have gone to a lot of effort to be able to
skip changes, to handle the case above.

The reason is that if there was a modrdn performed, and the entry ID
of the entry that was moved was less than the new parent ID, this *had*
to be skipped, so that after the online init the modrdn change was
replayed and applied to the consumer.

Since 48755 which sorts based on the parent ID, this seems to no longer
be an issue. So we don't need to have the master replay its changelog
out to the consumer, because the consumer is now a literal clone of the
data.
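
As a rough illustration of the ordering problem described above (a sketch with made-up entry IDs, not the actual change from 48755): when entries are sent in entry ID order, an entry moved under a parent with a higher ID arrives first; ordering by parent ID avoids that in this simple one-level case. The names tree and sent_before_parent are hypothetical.

```python
# entry ID -> parent ID (0 stands for the virtual root); hypothetical data
tree = {1: 0, 2: 1, 3: 1, 4: 1}

# A modrdn performed before the init moved entry 2 under entry 4,
# so the moved entry's ID is lower than its new parent's ID.
tree[2] = 4

def sent_before_parent(order, tree):
    """Return the entry IDs that would reach the consumer before their parent."""
    seen = {0}
    early = []
    for eid in order:
        if tree[eid] not in seen:
            early.append(eid)
        seen.add(eid)
    return early

by_id = sorted(tree)                                  # [1, 2, 3, 4]
by_parent = sorted(tree, key=lambda e: (tree[e], e))  # [1, 3, 4, 2]

early_by_id = sent_before_parent(by_id, tree)
early_by_parent = sent_before_parent(by_parent, tree)
```

Here early_by_id contains entry 2 while early_by_parent is empty. A plain sort on parent ID is not sufficient for arbitrarily deep moves, so this should be read only as an illustration of the idea.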

So, is there a reason for us to leave the CSN of the consumer low to
allow this replay to occur? Or can we alter the behaviour of the
consumer to set its CSN to the CSN of the sending master, so that we
don't need to replay these changes?




--
389-devel mailing list
[email protected]
https://lists.fedoraproject.org/admin/lists/[email protected]
--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric Shander

