Hi all,

I am currently looking into some of the replication issues, specifically DIRSERVER-894 ("Older concurrent changes are never replicated"), DIRSERVER-1097 ("Only send net changes during replication") and DIRSERVER-1101 ("New replicas may never receive some recent modifications").

I think these issues will require changing the replication data format. Currently the replication logs are stored in a single database table with time, replica ID, sequence number and operation columns. The first three together form the CSN (change sequence number) and the last holds a serialised operation object.

DIRSERVER-894 needs a way to work out the CSN at the point a specific attribute was last modified. DIRSERVER-1097 needs a way to find previous log entries based on entryUUID, modification type and attribute ID. We are also planning on moving the replication data to the DIT. Given all this I am thinking of removing the serialised operation blob and replacing it with extra table(s) for each operation type storing the operation's data across multiple columns. This will allow us to efficiently query the replication logs based on the operation data.
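
To make the idea concrete, here is a rough in-memory sketch of the kind of queries I have in mind. It is not the existing replication code: the class, record and enum names are all made up for illustration, the CSN ordering (time, then replica ID, then sequence number) is my assumption, and it needs Java 16 or later for records.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical in-memory stand-in for an "exploded" replication log. */
final class ReplicationLogSketch {

    /** CSN built from the three existing columns; the ordering is an assumption. */
    record Csn(long time, String replicaId, int operationSequence) implements Comparable<Csn> {
        @Override
        public int compareTo(Csn o) {
            int c = Long.compare(time, o.time);
            if (c != 0) return c;
            c = replicaId.compareTo(o.replicaId);
            if (c != 0) return c;
            return Integer.compare(operationSequence, o.operationSequence);
        }
    }

    /** Illustrative modification types; the real set may differ. */
    enum ModType { ADD_ATTRIBUTE, REPLACE_ATTRIBUTE, REMOVE_ATTRIBUTE }

    /** One log row with the operation data spread over queryable fields. */
    record LogEntry(Csn csn, String entryUuid, ModType modType, String attributeId) {}

    private final List<LogEntry> entries = new ArrayList<>();

    void append(LogEntry entry) {
        entries.add(entry);
    }

    /** DIRSERVER-894: CSN of the latest logged change to one attribute of one entry. */
    Optional<Csn> lastModified(String entryUuid, String attributeId) {
        return entries.stream()
            .filter(e -> e.entryUuid().equals(entryUuid))
            .filter(e -> e.attributeId().equalsIgnoreCase(attributeId))
            .map(LogEntry::csn)
            .max(Comparator.naturalOrder());
    }

    /** DIRSERVER-1097: logged operations matching entry, modification type and attribute. */
    List<LogEntry> find(String entryUuid, ModType modType, String attributeId) {
        return entries.stream()
            .filter(e -> e.entryUuid().equals(entryUuid))
            .filter(e -> e.modType() == modType)
            .filter(e -> e.attributeId().equalsIgnoreCase(attributeId))
            .collect(Collectors.toList());
    }
}
```

With a serialised blob, both lookups would mean deserialising every row; with the fields split out they become ordinary indexed queries.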

Perhaps this would be a good time to make the jump to storing the replication data in the DIT. The DIT seems well suited to storing the operations in an "exploded" format. I am thinking of something like the following:

ou=logs/
  cn=<csn>/
    objectClass: ... (indicates operation type)
    time: ...
    replicaID: ...
    operationSequence: ...
    entryUUID: ...
    attributeID: <attributeName> (for attribute modifications)
    cn=attributes/
      <attributeName>: <attributeValues>
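
As a sketch of how a lookup against this layout might be phrased, the helper below builds a search filter over the entryUUID and attributeID attributes shown above. The class and method names are hypothetical, the attribute names are just the ones from my proposed layout, and the escaping covers the special characters listed in RFC 4515. Note that it can only match attributes held on the cn=<csn> entry itself, not anything stored in the cn=attributes child.

```java
/** Hypothetical filter builder for the proposed ou=logs layout. */
final class LogFilterSketch {

    /** Escapes the characters that RFC 4515 requires to be escaped in filter values. */
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char ch : value.toCharArray()) {
            switch (ch) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break;
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(ch);
            }
        }
        return sb.toString();
    }

    /** Filter matching logged modifications of one attribute of one entry. */
    static String modificationsOf(String entryUuid, String attributeId) {
        return "(&(entryUUID=" + escape(entryUuid) + ")"
             + "(attributeID=" + escape(attributeId) + "))";
    }
}
```

For example, modificationsOf("abc", "mail") yields (&(entryUUID=abc)(attributeID=mail)), to be used with ou=logs as the search base. The results would still have to be ordered by CSN on the client side unless the server can sort them.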

My biggest concern with this is the inflexibility of LDAP searches. Do we have a sort control in ApacheDS? Also, if the attributes for the operation are held in a child entry, how can we find an operation in the logs based on those attributes?

At the same time I am thinking about a couple of things in the replication system that don't seem to be necessary.

Firstly, once DIRSERVER-894 is fixed, I don't think we will need the entryCSN attribute. I believe that it is only used to check whether an operation should be applied to an entry or not (i.e. is it a new modification), but this is broken and we need to check the CSN per attribute by using the logs instead.
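
To illustrate the difference, here is a small sketch comparing the two checks. It reflects my reading of the problem rather than the actual code: the class and method names are invented, and the CSN is reduced to a single long to keep the example short.

```java
import java.util.Map;

/** Hypothetical comparison of an entry-level and an attribute-level CSN check. */
final class ApplyCheckSketch {

    /** Entry-level check: a single CSN for the whole entry, as I understand entryCSN is used. */
    static boolean applyByEntryCsn(long entryCsn, long incomingCsn) {
        return incomingCsn > entryCsn;
    }

    /** Attribute-level check: compare only against the last change to the same attribute. */
    static boolean applyByAttributeCsn(Map<String, Long> lastCsnByAttribute,
                                       String attributeId, long incomingCsn) {
        Long last = lastCsnByAttribute.get(attributeId);
        return last == null || incomingCsn > last;
    }
}
```

Suppose an entry last had mail changed at CSN 200 and then receives a change to telephoneNumber made concurrently elsewhere at CSN 100. The entry-level check rejects it because 100 is older than 200, whereas the attribute-level check accepts it because telephoneNumber has no later change recorded.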

Secondly, I don't really see the point of "tombstoning" entries (marking them as deleted instead of really deleting them). The only time I can see it having any kind of effect is when a replica receives a modification for an entry it thinks has been deleted - then it will resurrect it. This seems like a very bad idea to me. I would expect this to be a fatal replication error as something has gone seriously wrong.

Sorry for the long email... if anyone's managed to read this far any comments would be much appreciated.

Thanks,

Martin
