Chris Mikkelson wrote:
> On Thu, Jan 13, 2011 at 12:30:32PM -0800, Howard Chu wrote:
>> [email protected] wrote:
>>> Other than the patch I suggested (using the list head's
>>> CSN value directly), I can think of two other approaches:
>>>
>>>     1) When adding an entry to the sessionlog, check
>>>     if sl_mincsn is empty. If it is, update sl_mincsn
>>>     to the new entry's CSN.
>>>
>>>     2) When initializing the sessionlog, set sl_mincsn
>>>     to the maximum contextCSN value of the underlying
>>>     database.
>>>
>>> #2 seems ideal from an efficiency standpoint, although it
>>> differs from the algorithm the current code appears to be
>>> intended to implement.
>>
>> We should have gone with #2. The problem scenario:
>
> The current code (in HEAD) covers the vast majority of
> cases, but still has a hole which can come up quite often
> in MMR + refreshOnly scenarios. The scenario:
>
>    1: provider 1, provider 2, and consumer all in sync. SID 1 CSN
>       is a few minutes old, SID 2 is a few hours old. Both sessionlogs
>       are empty with sl_mincsn set to the newest (provider 1) CSN value
>    2: stop consumer
>    3: write on provider 2
>    4: start consumer
>
> The provider will see that the consumer is out of date
> wrt the SID 2 CSN and that the old value of the SID 2 CSN
> predates sl_mincsn, so it will skip the session log.
>
> I think the following approach will cover this case:
>
>       * sl_mincsn holds a CSN value for each SID

Yeah, looks like we were headed there anyway.

>       * in syncprov_op_search, mincsn is compared to
>       the sl_mincsn value with the same SID

The mincsn variable is no longer useful as-is. We need to compare the vector 
of CSNs from the cookie with the vector of CSNs in the provider. If any CSN in 
the consumer is older than the matching sl_mincsn then the log must be skipped.

At this point there are a bunch of redundant O(N^2) walkthrus of the CSNs 
anyway, so getting rid of mincsn would be a good simplification. Since there 
may be many SIDs it would be better to sort both vectors and then walk thru 
them sequentially.

>       * when expiring log entries in in syncprov_add_slog,
>       the sl_mincsn value with the same SID as the expired
>       entry's CSN is set to the maximum of the two values.

Yes.

> If there is no matching SID in sl_mincsn, then:
>
>       * in syncprov_op_search, this implies that the
>       mincsn's SID has been added to the provider's
>       configuration since startup or that the first
>       change recorded by that SID occurred since startup,
>       and thus the sessionlog contains sufficient
>       information
>
>       * in syncprov_add_slog, sl_mincsn will need to
>       be expanded to hold the new SID's CSN
>
> Thoughts?

Sounds right.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/


Reply via email to