On 06/02/2014 04:08 PM, Rob Crittenden wrote:
Simo Sorce wrote:
First of all, very good summary, thanks a lot!
Replies in line.

On Mon, 2014-06-02 at 10:46 +0200, Ludwig Krispenz wrote:
Ticket 4302 is a request for an enhancement: Move replication topology
to the shared tree

There has been some discussion in comments in the ticket, but I'd like
to open the discussion to a wider audience to get an agreement on what
should be implemented, before writing a design spec.

The implementation requires a new IPA plugin for 389 DS and possibly
an enhancement of the 389 replication plugin (which depends on some
decisions below). In the following I will use the terms “topology
plugin” for the new plugin and “replication plugin” for the existing 389
multimaster replication plugin.

Let's start with the requirements: What should be achieved by this RFE?

In my opinion there are three different levels of features to implement
with this request

- Providing all replication configuration information consistently on
all deployed servers, eg to easily visualize the replication topology.

- Allowing sanity checks on the replication configuration, denying
modifications which would break the replication topology or issuing
warnings.

- Using the information in the shared tree to trigger changes to the
replication configuration in the corresponding servers, that is,
allowing replication configuration to be controlled completely through
modifications of entries in the shared tree.

The main questions are

1] which information is needed in the shared tree (eg what params in the
repl config should be modifiable)

2] how is the information organized and stored (layout of the repl
config information shared tree)

3] how is the interaction of the info in the shared tree and
configuration in cn=config and the interaction between the topology
plugin and the replication plugin

ad 1] To verify the topology, eg connectivity, info about all existing
replication agreements is needed. The replication agreements only
contain info about the target and the parameters for connecting to the
target, but not about the origin. If the data have to be evaluated on
any server, information about the origin has to be added, eg replicaID.

In addition, if agreement config has to be changed based on the shared
tree all required parameters need to be present, eg
replicatedAttributeList, strippedAttrs, replicationEnabled, .....

Replication agreements only provide information on connections where
replication is configured, if connectivity is to be checked independent
info about all deployed servers/replicas is needed.

If topology should be validated, do we need params defining requirements,
eg each replica to be connected to 1,2,3,... others, type of topology
(ring, mesh, star, ...)?
Ok from a topology point of view you need the same elements you need to
define a graph. That is: nodes and segments.

We already have the list of masters in the cn=etc tree, so all we need
is to add segments (ie connection objects).
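
As a rough illustration of the graph view, here is a minimal Python
sketch of a connectivity check. It assumes masters are identified by
name and each segment is an unordered pair of masters; the function
name and this data layout are hypothetical and not part of any IPA or
389 DS API.

```python
from collections import deque


def is_connected(nodes, segments):
    """Return True if every master can reach every other master.

    nodes: iterable of master names (hypothetical identifiers).
    segments: iterable of (a, b) pairs, each treated as a two-way link.
    """
    nodes = set(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for a, b in segments:
        if a not in neighbours or b not in neighbours:
            raise ValueError(f"segment ({a}, {b}) references an unknown master")
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Breadth-first search from an arbitrary start node.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for peer in neighbours[current] - seen:
            seen.add(peer)
            queue.append(peer)
    return seen == nodes
```

A check like this could back the "would this change break the topology"
decision: run it on the segment list with the candidate change applied
and reject the change if it returns False.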

As for parameters my idea is that we have a general set of parameters
(eg. replicatedAttributeList, strippedAttrs) in the general topology
configuration, then we might override them on a per-connection basis if
needed (should be very rare).
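
A minimal Python sketch of that override rule, assuming the general
parameters and the per-connection overrides are both available as plain
dictionaries. The function name and the placeholder values are
hypothetical; only the attribute names come from the discussion above.

```python
def effective_params(defaults, overrides=None):
    """Merge topology-wide defaults with optional per-connection overrides.

    Values set on the connection win; everything else falls back to the
    general topology configuration.
    """
    merged = dict(defaults)
    merged.update(overrides or {})
    return merged


# Placeholder values, for illustration only.
general = {
    "replicatedAttributeList": "general-list",
    "strippedAttrs": "general-stripped",
}
per_connection = {"strippedAttrs": "special-stripped"}
params = effective_params(general, per_connection)
```

Keeping the overrides as a sparse dictionary means the common case, with
no overrides at all, needs no per-connection data.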

Also note we may need multiple topology sets, because we may have to
distinguish between the replication topology for the main shared tree
and the replication topology for other databases.

However we may want to be able to mark a topology for 'multiple' sets.
For example we may want to have by default the same topology both for
the main database and for the CA database.
I think we should store them separately; making them the "same" would
be done by a tool, and the data would just reflect the connections.

I was thinking the object DN would contain the LDAP database name (or
some rough equivalent), so we would store the IPA connections separate
from the CA connections.

ad 2] the data required are available in the replicationAgreement (and
eventually replica) entries, but the question is if there should be a
1:1 relationship to entries in the shared tree or a condensed
representation, if there should be a server or connection oriented view.
My answer is no, we need only one object per connection, but config
entries are per direction (and different ones on different servers).
We also need to store the type, MMR, read-only, etc, for future-proofing.

One entry per connection would mirror what we have now in the mapping
tree (which is generally ok). I wonder if this would be limiting with
other agreement types depending on the schema we use.
In the mapping tree you have a connection as a replicationagreement,
but if I understand Simo correctly he wants one object per segment. If
two servers are connected this could be one- or bidirectional.
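
To illustrate the difference between one object per segment and one
config entry per direction, here is a small Python sketch. The `Segment`
class, its `direction` values and the function name are hypothetical;
nothing in this thread has fixed a schema yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    left: str
    right: str
    # Hypothetical values: "both", "left-right", "right-left".
    direction: str = "both"


def agreements_for(segment):
    """Return (origin, target) pairs, one per replication direction."""
    pairs = []
    if segment.direction in ("both", "left-right"):
        pairs.append((segment.left, segment.right))
    if segment.direction in ("both", "right-left"):
        pairs.append((segment.right, segment.left))
    return pairs
```

With this shape a bidirectional segment expands to two agreements, one
to be configured on each of the two servers, while a one-way segment
expands to a single agreement on the origin.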

In my opinion a 1:1 relation is straightforward, easy to handle and
easy to extend (not all the data of a repl agreement needs to be
present, other attributes are possible). The downside may be a larger
number of entries, but this is no problem for the directory server and
replication, and the utilities, eg to visualize a topology, will handle
this.
We want a more abstract and easy to handle view for the topology plugin,
in general.

If the number of entries should be reduced, information on multiple
replication agreements would have to be stored in one entry, and the
problem arises how to group data belonging to one agreement. LDAP does
not provide a simple way to group attribute values in one entry, so all
the info related to one agreement (origin, target, replicated attrs and
other repl configuration info) could be stored in a single attribute,
which will make the attribute as nicely readable and manageable as ACIs.
We can easily use subtypes if really needed, this info is quite core to
the IPA code and will not be generally accessed by random clients.
However, as I indicated above we really need one object per graph
segment which represents a two-way connection, so we shouldn't have
issues (but sharing topologies between different databases may
reintroduce it :)

If topology verification and connectivity check is an integral part of
the feature, I think a connection oriented view is not sufficient, it
might be incomplete, so a server view is required, the server entry
would then have the connection information as subentries or as attributes.
We already have the list of servers, so we need to add only the list of
connections in the topology view. We may need to amend the servers
objects to add additional data in some cases. For example indicate
whether it is fully installed or not (on creation the topology plugin
would complain the server is disconnected until we create the first
segment, but that may actually be a good thing :-)
Not sure I grok the fully installed part. A server isn't added as a
master until it is actually installed, so a prepared master shouldn't
show here.

Ad 3] The replication configuration is stored under cn=config and can be
modified either by ldap operations or by editing the dse.ldif. With the
topology plugin another source of configuration changes comes into play.

The first question is: which information has precedence? I think if
there is info in the shared tree it should be used, and the information
in cn=config should be updated. This also means that the topology plugin
needs to intercept all mods to the entries in cn=config and have them
ignored and handle all updates to the shared tree and trigger changes to
the cn=config entries, which then would trigger rebuilds of the in
memory replication objects.
Yes, I agree.

Next question: how to handle changes done directly in the dse.ldif? If
everything should be done by the topology plugin, it would have to verify
and compare the info in cn=config and in the shared tree at every
startup of the directory server, which might be complicated by the fact
that the replication plugin might already be started and repl agreements
might be active before the topology plugin is started and can do its
work (plugin starting order and dependencies need to be checked).
Why do we care which one starts first ?
We can simply change replication agreements at any time, so the fact the
replication topology (and therefore agreements) can change after startup
should not be an issue.
Someone could delete an agreement, or worse, add one we don't know
about. Does that matter?

What happens to values in the mapping tree that aren't represented in
our own topology view?
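
One way to frame these questions is as a set comparison, sketched here
in Python. It assumes both sides can be reduced to (origin, target)
pairs; how those pairs would actually be read from the shared tree or
from cn=config is left open, and the function name is hypothetical.

```python
def compare_agreements(expected, actual):
    """Compare agreements implied by the topology with those configured.

    expected: (origin, target) pairs derived from the shared tree.
    actual: (origin, target) pairs found in cn=config.
    Returns (missing, unknown): pairs the topology expects but that are
    not configured, and configured pairs the topology does not know.
    """
    expected = set(expected)
    actual = set(actual)
    missing = expected - actual
    unknown = actual - expected
    return missing, unknown
```

A deleted agreement would show up in `missing` and an agreement added
behind our back in `unknown`; what to do with either set is the policy
decision being asked about here.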

Next next question: should there be a “bootstrapping” of the config
information in the shared tree ?

I think yes, the topology plugin could check at startup if there is a
representation of the config info in the shared tree and if not
construct it, so after deployment and enabling of the topology plugin
the information in the shared tree would be initialized.
Nope, the topology plugin should simply log a loud warning in the error
log and wait quietly until the topology is provided. This is needed to
allow us to handle migrations gracefully and carefully construct the
topology tree at install time w/o having the topology plugin interfere.
We will probably need a big 'enabled/disabled' flag on the topology tree
base object so we can construct a tree w/o waking up the plugin at
every change in the install phase.
So when we do the migration to this version some script will be needed
to create the initial topology from the agreements. Could we have a race

I think that not every part of the feature has to be handled in the
topology plugin, we could also ask for enhancements in the 389
replication plugin itself. There could be an extension to the replica
and replication agreement entries to reference an entry in the shared
tree. The replication plugin could check at startup if these entries
contain replication configuration attributes and if so use them,
otherwise use the values in cn=config. The presence of the reference
indicates to the topology plugin that initialization is done.

In my opinion this would simplify the coordination at startup and avoid
unnecessary re-evaluations, and other deployments could benefit from this
new feature in directory server (one could eg have one entry for
replication agreements containing the fractional replication
configuration – and it would be identical on all servers)
I really do not want to touch the replication plugin. It works just fine
as it is, and handling topology has nothing to do with handling the low
level details of the replication. To each its own.
If other deployments want to use the topology plugin, we can later move
it to the 389ds codebase and generalize it.
My memory is fuzzy but I think that restarts are required when
adding/deleting agreements. Is that right? What implications would that
have for this?

So my proposal would contain the following components

1] Store replication configuration in the shared tree in a combination
of server and connection view (think we need both) and map replication
configuration to these entries. I would prefer a direct mapping (with a
subset of the cn=config attributes and required additions)
Nack, we already have the list of servers, we just need 1 object per
connection (graph segment) not one per agreement.

2] provide a topology plugin to do consistency checks and topology
verification, handle updates to trigger modification changes in
cn=config, intercept and reject direct mods to cn=config entries. At
startup verify if shared tree objects are present, initialize them if
not, apply to cn=config if required

3] enhance replication plugin to handle config information in the shared
tree. This would allow to consistently handle config changes either
applied to the shared config, cn=config mods or dse.ldif changes. This
feature might also be interesting to other DS deployments
Nack, leave the replication plugin alone, the topology plugin should do
all the topology work, dealing with interactions between 2 plugins would
tie them together and make things a lot more complicated than necessary.
It would also bind the development of the topology plugin to 2 schedules
(both 389ds and FreeIPA), also making the logistics of developing the
topology plugin more complicated.

