Hi guys,
we are working with Kiran to test replication thoroughly. So far, it's
going pretty well.
We have fixed a couple of potential race conditions, and the initial
update now seems to be pretty much covered:
* one producer is loaded with 1000 entries
- a consumer is correctly getting those 1000 entries
- a consumer gets the 1000 entries, then one entry is added on the
producer, and it gets replicated into the consumer
- a consumer gets around 200 entries before being brutally stopped,
then restarted, and gets back the 1000 entries
- 4 consumers are started at the same time, and are all correctly updated
What is still missing is a test where the provider is brutally stopped
in the middle of the replication, then restarted, to check that the
consumer correctly reconnects and gets the missing entries.
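
To make the stop/restart scenarios concrete, here is a small in-memory
sketch of what such a test checks. It is not ApacheDS code: Provider,
Consumer, the integer cookie and the "online" flag are hypothetical
stand-ins for the real server, the replication consumer and its sync
cookie, and the sketch assumes the consumer resumes from the last
change it applied.

```python
class Provider:
    """Toy provider: an append-only change log of DNs (hypothetical)."""

    def __init__(self):
        self.log = []        # DNs, in the order they were added
        self.online = True   # False simulates a stopped provider

    def add(self, dn):
        self.log.append(dn)

    def changes_since(self, cookie):
        if not self.online:
            raise ConnectionError("provider is down")
        return self.log[cookie:]


class Consumer:
    """Toy consumer: remembers how many changes it has applied."""

    def __init__(self):
        self.entries = set()
        self.cookie = 0      # number of changes already applied

    def sync(self, provider, limit=None):
        """Pull pending changes; `limit` simulates being stopped midway."""
        pending = provider.changes_since(self.cookie)
        for dn in pending[:limit]:
            self.entries.add(dn)
            self.cookie += 1


provider = Provider()
for i in range(1000):
    provider.add(f"cn=user{i},ou=users,ou=system")

consumer = Consumer()
consumer.sync(provider, limit=200)   # consumer stopped after 200 entries
assert len(consumer.entries) == 200

provider.online = False              # provider brutally stopped
try:
    consumer.sync(provider)
except ConnectionError:
    pass                             # a real consumer would retry later
provider.online = True               # provider restarted

consumer.sync(provider)              # resumes from the stored cookie
assert consumer.entries == set(provider.log)
```

The real test would of course run against actual servers; the point
here is only the final assertion, that the consumer ends up with
exactly the provider's entries after either side has been interrupted.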
The next steps will be to see what's going on with multiple operations
(delete/move/rename/modify/moveAndRename) when having multiple
consumers. Kiran has already added the tests for those operations, and
they work.
One thing that could be done later would be to build a scenario where we
load a server with some random data, and apply some random operations on
it (add/delete/move/rename/modify/moveAndRename), with more than one
consumer connected to it, and after some time (say, one or two days),
check that the consumers and the provider are all in sync. That is
something we should run on a separate platform, because it would be
total overkill if run as part of the regular tests. I'll check with
Apache infra whether we can use some of the servers they have.
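
As a rough illustration of that long-running scenario, here is a toy
simulation in Python. Everything in it is an assumption made for the
sketch: the provider and the consumers are plain dicts, replication is
a replay of an operation log, and only add/delete/modify/rename are
modelled (move and moveAndRename are left out for brevity). The names
random_op, apply_op and run are hypothetical.

```python
import random


def random_op(rng, entries, next_id):
    """Pick one random operation that is valid for the current state."""
    if not entries or rng.random() < 0.4:
        return ("add", f"cn=e{next_id}", rng.randint(0, 99))
    dn = rng.choice(sorted(entries))
    kind = rng.choice(["delete", "modify", "rename"])
    if kind == "delete":
        return ("delete", dn, None)
    if kind == "modify":
        return ("modify", dn, rng.randint(0, 99))
    return ("rename", dn, f"cn=e{next_id}")   # new DN is always unused


def apply_op(entries, op):
    """Apply one logged operation to a dict mapping DN to value."""
    kind, dn, arg = op
    if kind in ("add", "modify"):
        entries[dn] = arg
    elif kind == "delete":
        del entries[dn]
    elif kind == "rename":
        entries[arg] = entries.pop(dn)


def run(seed, n_ops, n_consumers):
    rng = random.Random(seed)
    provider, log = {}, []
    consumers = [{} for _ in range(n_consumers)]
    positions = [0] * n_consumers     # how far each consumer has replayed
    for next_id in range(n_ops):
        op = random_op(rng, provider, next_id)
        apply_op(provider, op)
        log.append(op)
        # One randomly chosen consumer catches up after each operation,
        # so consumers lag behind the provider by different amounts.
        c = rng.randrange(n_consumers)
        for pending in log[positions[c]:]:
            apply_op(consumers[c], pending)
        positions[c] = len(log)
    # Final catch-up for everybody before comparing.
    for c in range(n_consumers):
        for pending in log[positions[c]:]:
            apply_op(consumers[c], pending)
    return provider, consumers


provider, consumers = run(seed=42, n_ops=2000, n_consumers=4)
assert all(c == provider for c in consumers)
```

Using a fixed seed keeps a failing run reproducible, which would
matter a lot for a test that otherwise takes a day or two to fail.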
After that, the next step will be to implement the MMR (multi-master
replication) mode. Right now, we only have MSR (master-slave
replication). MMR means the producer will also be a consumer, which
also implies implementing the conflict resolution engine.
Thanks!
--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com