Martin Alderson wrote:
Thanks for the responses, all.
Apologies for the delay in getting back to you - having a family problem
at the moment so have very little spare time.
I thought having the replication logs stored in LDAP sounded nice - for
new replicas we have to send all replicatable entries but after that the
log LDAP entries can be sent instead. It would be pretty much the same
code logic and it just seemed to solve all the problems with a large
amount of code re-use. I was worried about possible performance hits
though and it sounds like you (Alex) don't want to store the logs in
LDAP for the same reason.
That all makes sense. That's the rationale behind the OpenLDAP accesslog
overlay, which implements our logging schema.
My main reasons for suggesting storing the logs in LDAP are:
1. So we can have optional attributes in each log entry. This is needed
when we "explode" the current message blob so it can be queried
efficiently. With JDBM I guess we would have to specify a new table for
each type of message.
The logging schema format we use is already efficiently searchable.
2. To reduce the code complexity. We would have virtually the same code
for sending whole entries as sending the logs and we would have less
code for dealing with the data storage in general.
3. To reduce the current tight coupling with the backend database. By
using LDAP as the abstraction layer we could leverage ApacheDS' existing
mechanism for specifying the data store.
4. To allow an easy way to view the logs.
5. It seems to be the most natural fit. Since we need to store (part
of) an LDAP entry in the logs, why not store it in LDAP?
I'll take another stab at explaining that: we already have code to store
LDAP entries in a database, so why would we want to write that again?
Yes to all of the above.
>> The biggest concern I have for this is the inflexibility of LDAP
>> searches. Do we have a sort control in ApacheDS?
> What types of searches do you envision performing, for which LDAP
> is too inflexible? OpenLDAP's syncrepl can be pretty much entirely
> mapped onto plain search operations. We gain a lot of versatility
> by keeping things generic.
We need to search for log entries beyond a certain CSN and have the
results ordered based on CSN. I guess if the results are always
returned in creation date order then it might not be an issue (I'm not
yet sure what ApacheDS does or what the LDAP standard says).
As Kurt used to remind me, a CSN is a Change Sequence Number, but it is not a
Commit Sequence Number. The order in which you see CSNs isn't necessarily the
order in which those changes were committed in the DB. As such, the syncrepl
protocol assumes that the changes it receives are in random order. That means
we can't update our replication state upon every entry received. So
incremental refreshes can only update the replica state once the refresh has
completed, at which point we record the greatest CSN we saw. (This also means
that if a refresh is interrupted, it starts over from the exact same starting
point on the next retry. The syncrepl consumer does comparisons to discard any
received updates that it has seen before.)
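
A rough Python sketch of that refresh behaviour, for illustration only. The class name, the (dn, csn) tuples, and the short string CSNs are made up here and are not syncrepl's actual data structures; the only assumption is that CSNs compare in order.

```python
from typing import Dict, Iterable, Tuple


class RefreshConsumer:
    """Toy consumer: accepts changes in any order, commits state only at the end."""

    def __init__(self, committed_csn: str = "") -> None:
        # Replica state; a real server would keep this in durable storage.
        self.committed_csn = committed_csn
        # dn -> CSN of the last change applied to that entry.
        self.entry_csn: Dict[str, str] = {}

    def refresh(self, changes: Iterable[Tuple[str, str]]) -> int:
        """Apply (dn, csn) changes; return how many were actually applied."""
        greatest = self.committed_csn
        applied = 0
        for dn, csn in changes:
            # Track the greatest CSN seen, whether or not the change is applied.
            greatest = max(greatest, csn)
            # Discard any update this replica has already seen for the entry.
            if csn <= self.entry_csn.get(dn, ""):
                continue
            self.entry_csn[dn] = csn
            applied += 1
        # Reached only if the whole refresh completed without an exception,
        # so an interrupted refresh leaves committed_csn untouched.
        self.committed_csn = greatest
        return applied
```

If the change stream fails partway through, committed_csn keeps its old value, the retry starts from the same point, and the per-entry comparison drops whatever was already applied.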
Currently we also find the current CSN vector by just getting the most
recent log - we do this by performing a search with inverted sort by CSN
and 1 result maximum.
Hm, we only search for that at startup time; at runtime it's always maintained
in memory.
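
As a hedged illustration of "look it up once at startup, then keep it in memory": the sketch below builds a per-replica CSN vector from a single pass over the log and updates it as new changes arrive. The CsnVector name, the (replica_id, csn) pairs, and the string CSNs are hypothetical and do not reflect either server's real API.

```python
from typing import Dict, Iterable, Tuple


class CsnVector:
    """Greatest CSN seen per replica, kept in memory after one startup scan."""

    def __init__(self, log: Iterable[Tuple[str, str]]) -> None:
        self.latest: Dict[str, str] = {}
        # One-off startup pass; stands in for the sorted search with a
        # size limit of 1 described above.
        for replica_id, csn in log:
            self.record(replica_id, csn)

    def record(self, replica_id: str, csn: str) -> None:
        # Called for every new change at runtime; no further log search needed.
        if csn > self.latest.get(replica_id, ""):
            self.latest[replica_id] = csn
```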
Also, if we have the attributes in a child entry of the
actual log entry as I suggested we would need to specify a parent-child
relationship in the search.
That sounds like a painful model to implement.
>> Also our MMR support is still immature, we don't yet do value-level
>> conflict resolution.
> Yeah, we have yet to consider that.
We will have this once I have fixed
https://issues.apache.org/jira/browse/DIRSERVER-894.
> The trick to get from basic single-master to basic (entry-level
> only) multi-master is just to store multiple contextCSNs - one for
> each peer master, and ignore entry updates that are older than an
> entry's current entryCSN. The other requirement here is that you
> have reliable, tightly synced clocks, otherwise the conflict
> resolution policy falls apart.
That's exactly how our replication module works at the moment except we
just send the changes rather than the whole entry. I am currently
looking at improving the way we store the logs so we can efficiently do
attribute value level conflict resolution. I suspect that I will end up
with something very similar to delta-syncrepl. I will try and dig out
some information on that from the OpenLDAP mailing list.
Yeah, the stuff Emmanuel pointed me at indicated that your approach is already
very similar to syncrepl.
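
For reference, the entry-level multi-master trick quoted above can be sketched roughly as follows. This is an illustrative toy rather than OpenLDAP or ApacheDS code: the class, the method, the server IDs, and the timestamp-style CSN strings are all assumptions. It also shows why the clocks matter, since the "older than" test is a plain comparison of CSNs generated on different masters.

```python
from typing import Dict, Iterable, Tuple

Change = Tuple[str, str, Dict[str, str]]  # (dn, csn, attributes)


class MultiMasterReplica:
    """Entry-level conflict resolution: the change with the newest CSN wins."""

    def __init__(self) -> None:
        # One contextCSN per peer master, keyed by a hypothetical server ID.
        self.context_csn: Dict[str, str] = {}
        # dn -> (entryCSN, attributes)
        self.entries: Dict[str, Tuple[str, Dict[str, str]]] = {}

    def apply_batch(self, sid: str, changes: Iterable[Change]) -> int:
        """Apply a completed batch from master `sid`; return how many entries changed."""
        greatest = self.context_csn.get(sid, "")
        applied = 0
        for dn, csn, attrs in changes:
            greatest = max(greatest, csn)
            current = self.entries.get(dn)
            # Ignore updates older than (or equal to) the entry's current entryCSN.
            if current is not None and csn <= current[0]:
                continue
            self.entries[dn] = (csn, dict(attrs))
            applied += 1
        # Record this master's contextCSN only after the whole batch is processed.
        if greatest:
            self.context_csn[sid] = greatest
        return applied
```

With this model, a late-arriving change from one master never overwrites a newer change to the same entry from another master, provided the two masters' timestamps are comparable.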
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/