On Mon, Jan 19, 2009 at 1:15 PM, Martin Alderson
<[email protected]> wrote:
> Hi Emmanuel
>
>> Now, how do we manage replication between server A and server B
>> (whatever the number of real servers present in A and B)? Simple: as
>> each operation within A or B is done on a globally connected system,
>> with each operation having its unique timestamp (i.e., two operations have
>> two different timestamps), all the modifications done globally are
>> ordered. It's then just a matter of re-ordering two lists of ordered
>> operations on A and B, and applying them from the oldest operation to
>> the newest one. Let's see an example:
>
> This approach is good as it is easy to explain / understand but I have two 
> (related) problems with it:
>
> 1. Availability.  When replication (specifically the roll back, re-apply bit) 
> is taking place you must prevent new local modifications being applied to 
> ensure consistency.  This means new modification attempts must either be 
> rejected or frozen until the replication is completed.

Absolutely :) I drew a small sequence diagram this morning on the
train, in which I added a start point and an end point. Between those
two points you can't apply any incoming requests: they are queued
until the replication is done.
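
To make the queueing idea concrete, here is a minimal single-threaded
sketch in Python. The names (ReplicationGate, begin_replication,
end_replication, submit) are hypothetical and only illustrate the
start and end points; a real server would also need locking.

```python
from collections import deque


class ReplicationGate:
    """Holds back incoming requests while a replication window is open.

    Hypothetical sketch: single-threaded, no locking.
    """

    def __init__(self, apply):
        self._apply = apply          # callback that applies one request
        self._pending = deque()      # requests received during replication
        self._replicating = False

    def begin_replication(self):
        # Start point: from now on, incoming requests are queued.
        self._replicating = True

    def submit(self, request):
        if self._replicating:
            self._pending.append(request)
        else:
            self._apply(request)

    def end_replication(self):
        # End point: drain the queue in arrival order, then resume.
        self._replicating = False
        while self._pending:
            self._apply(self._pending.popleft())
```

Rejecting the requests instead of queueing them would only change the
first branch of submit().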


> 2. Efficiency.  Rolling back changes before re-applying might be slower than 
> just applying with a check.  An example of this being especially bad is with 
> two servers (A and B) which become disconnected.  Server A has just one 
> modification applied early on.  Server B is heavily used and has hundreds of 
> thousands of modifications (e.g. a bulk change to all users).  Now when 
> server A and B reconnect server B will have to roll back all those changes 
> just to apply a potentially completely unrelated minor change from server A.

Yes, this can be a big issue. In fact, rolling back means applying the
revert operations on both servers. But at least it guarantees a
consistent result.
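
As a rough illustration of the roll back / re-apply step on one
server, here is a minimal Python sketch. It assumes a flat key/value
store, unique integer timestamps, and that each logged operation
remembers the value it overwrote; Op and rollback_and_replay are
hypothetical names, not existing code.

```python
from dataclasses import dataclass
from heapq import merge
from typing import Any


@dataclass(frozen=True)
class Op:
    ts: int            # globally unique timestamp
    key: str           # entry the operation writes
    value: Any         # new value
    prev: Any = None   # value before the write (None = entry was absent)


def rollback_and_replay(state, local_log, remote_ops):
    """Merge remote operations into a server's state in timestamp order.

    Both lists must already be sorted by timestamp. Returns the new log.
    """
    if not remote_ops:
        return list(local_log)
    cutoff = remote_ops[0].ts
    kept = [op for op in local_log if op.ts < cutoff]
    undone = [op for op in local_log if op.ts > cutoff]

    # Roll back every local operation newer than the oldest remote one.
    for op in reversed(undone):
        if op.prev is None:
            state.pop(op.key, None)
        else:
            state[op.key] = op.prev

    # Re-apply local and remote operations, oldest first.
    new_log = kept
    for op in merge(undone, remote_ops, key=lambda o: o.ts):
        applied = Op(op.ts, op.key, op.value, state.get(op.key))
        state[op.key] = applied.value
        new_log.append(applied)
    return new_log
```

The cost Martin points out is visible here: the size of the undone
list depends on how old the remote operation is, not on whether it
touches the same entries.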

I would like to have something which works 100% first, then try to
think about improvements.

>
> Note that both of these become much bigger issues if we want to support 
> periodic (as opposed to immediate) replication or incremental backup.

I guess that immediate replication, as soon as something changes on
any server, is the best solution.

>
> It also becomes a bigger problem the more replicas you add - changes may have 
> to be rolled back and re-applied multiple times.

True. Hopefully, the servers won't all be disconnected from each other
every now and then.

> Using the current system (along with a fix for DIRSERVER-894) the example 
> above would require very little overhead for server B - the minor change from 
> server A would be sent over and then either applied or discarded based on the 
> CSNs at the time of the modification.  The existing modifications don't need 
> to be rolled back / re-applied.
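
If I read this correctly, the apply-or-discard check could be sketched
as below. This is only an illustration in Python, not the current
replication code: the CSN is modelled as a hypothetical
(timestamp, replica_id) tuple, and apply_if_newer is an invented name.

```python
def apply_if_newer(entries, csns, key, value, csn):
    """Apply a replicated write only if its CSN is newer than the stored one.

    entries: key -> current value
    csns:    key -> CSN of the last write applied to that key
    csn:     (timestamp, replica_id) tuple, compared lexicographically
    Returns True if the write was applied, False if it was discarded.
    """
    stored = csns.get(key)
    if stored is None or csn > stored:
        entries[key] = value
        csns[key] = csn
        return True
    return False  # an equal or newer change is already in place: discard
```

Nothing is rolled back: the cost is one comparison per incoming
modification, whatever the size of the local history.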

The main issue with this technique is analyzing, in retrospect, what
impact a modification would have on the current server given all the
later modifications: there are many ways in which local modifications
made afterwards could turn out to be invalid.
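
One hypothetical case, sketched in Python with invented names and
integer CSNs: server A deletes a parent entry at CSN 2, while server B,
still disconnected, adds a child under it at CSN 5. A check that only
looks at the CSN of the entry being deleted lets the delete through
and leaves the child orphaned.

```python
def parent_of(dn):
    # "cn=bob,ou=users" -> "ou=users"; a top-level entry has no parent.
    _, _, parent = dn.partition(",")
    return parent or None


def apply_remote_delete(entries, dn, csn):
    """Per-entry CSN check only: delete if the remote CSN is newer."""
    if dn in entries and csn > entries[dn]:
        del entries[dn]
        return True
    return False


def orphans(entries):
    # Entries whose parent is no longer present.
    return [dn for dn in entries
            if parent_of(dn) and parent_of(dn) not in entries]


# State on server B: dn -> CSN of the last change to that entry.
entries = {"ou=users": 1}
entries["cn=bob,ou=users"] = 5                 # B adds a child at CSN 5
apply_remote_delete(entries, "ou=users", 2)    # A deleted the parent at CSN 2
print(orphans(entries))
```

The delete is valid on its own, but the later local add depended on
the entry it removes.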

Any idea on how to deal with that ?

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
