On Tue, 09 Dec 2014 09:21:43 -0500
Rob Crittenden <rcrit...@redhat.com> wrote:

> Simo Sorce wrote:
> > On Sun, 07 Dec 2014 10:25:06 -0500
> > Rob Crittenden <rcrit...@redhat.com> wrote:
> > 
> >> Simo Sorce wrote:
> >>> On Thu, 04 Dec 2014 14:33:09 +0100
> >>> Ludwig Krispenz <lkris...@redhat.com> wrote:
> >>>
> >>>> hi,
> >>>>
> >>>> I have another issue (hopefully this will end soon) that I want
> >>>> your input on. (Please read to the end first.)
> >>>>
> >>>> To recap the conditions:
> >>>> -  the topology plugin manages the connections between servers
> >>>> as segments in the shared tree
> >>>> - it is authoritative for managed servers, e.g. it controls all
> >>>> connections between servers listed under cn=masters,
> >>>>    it is permissive for connections to other servers
> >>>> - it rejects any removal of a segment, which would disconnect the
> >>>> topology.
> >>>> - a change in topology can be applied to any server in the
> >>>> topology, it will reach the respective servers and the plugin
> >>>> will act upon it
> >>>>
> >>>> Now there is a special case, causing a bit of trouble. If a
> >>>> replica is to be removed from the topology, this means that
> >>>> the replication agreements from and to this replica should be
> >>>> removed and the server should be removed from the managed servers.
> >>>> The problem is that:
> >>>> - if you remove the server first, the server becomes unmanaged
> >>>> and removal of the segment will not trigger a removal of the
> >>>> replication agreement
> >>
> >> I had another, sort of unrelated thought about this, thinking about
> >> deleting servers.
> >>
> >> What happens if a replication conflict entry gets added?
> > 
> > This would happen in case a provisioning system tries to
> > instantiate 2 servers with the same name at the same time talking
> > to different existing masters.
> Sure and quite possible if there is some link down and two admins
> doing the same thing.
> > 
> >> While both exist I imagine that the actual agreement would reflect
> >> whichever entry is processed last. Probably not the end of the
> >> world.
> >>
> >> But how do you remove the conflict entry without also potentially
> >> deleting that master?
> >  
> > It should probably delete both, the domain would be pretty messed up
> > > anyway and we have no easy way to know which of the 2 is part of the
> > domain and which one has all replication severed due to their keys
> > > being overwritten by the other server's.
> Yes, this is what I was leaning towards as well, but the plugin may
> need to specifically do this. It all depends on the search filters
> Ludwig uses to find the master(s) to operate on. And should this be
> automatically detected?
> So in reality the fact that there is a duplicate agreement in topology
> probably won't hurt anything at all since we don't really define much
> there. The actual agreement(s) will still be created just fine, but
> this extra topology record could be confusing depending on whether it
> gets returned by anything.
> > I guess the only thing we can reasonably do is to make
> > > recommendations on how to deal with replica deployments to avoid
> > > this case and instructions on how to remove remnant entries if any.
> I brought this up in case deleting the conflict entry would create an
> orphan, for example.
> It occurs to me that perhaps the conflict entry may not actually
> contain anything different than the real entry. It isn't like we
> store a ton of data. So I wonder if one deletes a conflict entry it
> shouldn't trigger topology changes. But that is likely going to
> require extra logic.

I think we can avoid this situation completely going forward by
introducing the concept of "ephemeral replication master" (what a
mouthful eh? :).

The idea is that when creating a new replica the installer always
deterministically picks one server among all existing masters to contact
and creates the master entry there with an add operation.

So if two masters with the same name are installed at the same time,
they will conflict early and one will fail.
If we are in split-brain, the one on the wrong side will fail to contact
the agreed master and fail.

If the agreed master is down, no replicas can be installed until it is
restored or removed (or maybe a force flag is used to work around it
for whatever reason).
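
To make the idea concrete, here is a minimal sketch, not FreeIPA code.
Everything in it is an assumption made for illustration: the selection
rule (lowest hostname wins), the names pick_agreed_master and
register_replica, and the in-memory FakeDirectory that stands in for the
directory of the agreed master. The only property it relies on is that
an add fails when the entry already exists.

```python
# Sketch only: all names and the selection rule are hypothetical.


class MasterExists(Exception):
    """Raised when a master entry with the same name is already present."""


def pick_agreed_master(masters):
    """Return the same server for every installer that sees the same list.

    Assumed rule: the lexicographically lowest lowercased hostname.
    """
    if not masters:
        raise ValueError("no masters known")
    return min(m.lower() for m in masters)


class FakeDirectory:
    """In-memory stand-in for the directory on one master."""

    def __init__(self):
        self.entries = {}

    def add(self, name, attrs):
        # Behaves like an add operation: it fails if the entry exists.
        if name in self.entries:
            raise MasterExists(name)
        self.entries[name] = attrs


def register_replica(directories, masters, new_name):
    """Create the master entry for new_name on the agreed master only.

    directories maps lowercased hostnames to FakeDirectory objects.
    """
    agreed = pick_agreed_master(masters)
    directories[agreed].add(new_name.lower(), {"created_via": agreed})
    return agreed
```

Two installers that try to register the same name both end up at the
same server, so the second add fails instead of producing a replication
conflict entry later.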

I think this could work, but I am not sure how well it will mesh with
the new install procedure work that promotes a normal client, because
by then the basic host keytab may already be broken.

So perhaps we keep this possible plan in mind (open a ticket?), but
wait a bit to understand what we will end up with on the promotion
changes side.


Simo Sorce * Red Hat, Inc * New York

Freeipa-devel mailing list