On Mon, Jan 19, 2009 at 1:15 PM, Martin Alderson <[email protected]> wrote:
> Hi Emmanuel
>
>> Now, how do we manage replication between server A and server B
>> (whatever the number of real servers present in A and B) ? Simple : as
>> each operation within A or B are done on a globally connected system,
>> with each operation having its unique timestamp (ie, two operations have
>> two different timestamps), all the modifications done globally are
>> ordered. It's just then a matter of re-ordering two lists of ordered
>> operations on A and B, and to apply them from the oldest operation to
>> the newest one. Let's see an example :
>
> This approach is good as it is easy to explain / understand but I have two
> (related) problems with it:
>
> 1. Availability. When replication (specifically the roll back, re-apply bit)
> is taking place you must prevent new local modifications being applied to
> ensure consistency. This means new modification attempts must either be
> rejected or frozen until the replication is completed.

Absolutely :) I drew a small sequence diagram this morning on the train, in
which I added a starting point and an ending point between which you can't
apply any incoming requests: they are queued until the replication is done.

> 2. Efficiency. Rolling back changes before re-applying might be slower than
> just applying with a check. An example of this being especially bad is with
> two servers (A and B) which become disconnected. Server A has just one
> modification applied early on. Server B is heavily used and has hundreds of
> thousands of modifications (e.g. a bulk change to all users). Now when
> server A and B reconnect server B will have to roll back all those changes
> just to apply a potentially completely unrelated minor change from server A.

Yes, this can be a big issue. In fact, rolling back means applying the revert
operations on both servers. But at least this gives a guarantee. I would like
to have something which works 100% first, then try to think about
improvements.

>
> Note that both of these become much bigger issues if we want to support
> periodic (as opposed to immediate) replication or incremental backup.

I guess that immediate replication, whenever something changes on any server,
is the best solution.

>
> It also becomes a bigger problem the more replicas you add - changes may have
> to be rolled back and re-applied multiple times.

True. Hopefully, all the servers won't be disconnected from each other every
now and then.

> Using the current system (along with a fix for DIRSERVER-894) the example
> above would require very little overhead for server B - the minor change from
> server A would be sent over and then either applied or discarded based on the
> CSN's at the time of the modification. The existing modifications don't need
> to be rolled back / re-applied.

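As a rough illustration of the two strategies being compared in this thread
(replaying a merged, time-ordered log versus applying a remote change with a
CSN check), here is a minimal Python sketch. Everything in it is hypothetical:
the `Op` record, the integer `csn` field, and both function names are
assumptions made for the example and are not ApacheDS code. It also only
models independent "replace attribute value" operations with unique CSNs,
which is the easy case.

```python
from dataclasses import dataclass
from heapq import merge


@dataclass(frozen=True, order=True)
class Op:
    """Hypothetical 'replace attribute' operation; csn is assumed unique."""
    csn: int      # stand-in for a globally ordered timestamp / CSN
    attr: str
    value: str


def replay_merged(log_a, log_b):
    """Roll-back approach: start from the common state (empty here) and
    re-apply both already-sorted logs, oldest operation first."""
    state = {}
    for op in merge(log_a, log_b):      # ordered by csn (the first field)
        state[op.attr] = op.value
    return state


def apply_with_csn_check(state, last_csn, incoming):
    """Check approach: apply an incoming op only if it is newer than the
    last change recorded locally for the same attribute; otherwise discard."""
    for op in incoming:
        if op.csn > last_csn.get(op.attr, -1):
            state[op.attr] = op.value
            last_csn[op.attr] = op.csn
    return state


# Server A made one early change; server B made later ones while disconnected.
log_a = [Op(1, "mail", "a@example.com")]
log_b = [Op(2, "cn", "Bob"), Op(3, "mail", "b@example.com")]

replayed = replay_merged(log_a, log_b)

# Server B's local state and per-attribute CSNs after its own changes.
state_b = {"cn": "Bob", "mail": "b@example.com"}
csn_b = {"cn": 2, "mail": 3}
checked = apply_with_csn_check(state_b, csn_b, log_a)
```

In this toy case both approaches end in the same state, and the check approach
touches only the single incoming operation instead of replaying server B's
whole log. The sketch says nothing about operations that depend on one another
(for example a delete followed by a modify of the same entry), which is where
the comparison gets harder.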
The main issue with this technique is analyzing, in retrospect, what impact a
modification would have on the current server given all the later
modifications: it is quite possible that local modifications done afterwards
become invalid. Any idea on how to deal with that?

--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
