On Thu, 11 Dec 2014 17:03:55 +0100
Petr Spacek <pspa...@redhat.com> wrote:

> On 11.12.2014 15:20, Simo Sorce wrote:
> > On Thu, 11 Dec 2014 14:18:36 +0100
> > Ludwig Krispenz <lkris...@redhat.com> wrote:
> > 
> >>
> >> On 12/05/2014 04:50 PM, Simo Sorce wrote:
> >>> On Thu, 04 Dec 2014 14:33:09 +0100
> >>> Ludwig Krispenz <lkris...@redhat.com> wrote:
> >>>
> >>>> hi,
> >>>>
> >>>> I have another issue (hopefully this will end soon) on which I
> >>>> would like your input. (Please read to the end first.)
> >>>>
> >>>> To recap the conditions:
> >>>> - the topology plugin manages the connections between servers as
> >>>> segments in the shared tree
> >>>> - it is authoritative for managed servers, i.e. it controls all
> >>>> connections between servers listed under cn=masters;
> >>>>     it is permissive for connections to other servers
> >>>> - it rejects any removal of a segment, which would disconnect the
> >>>> topology.
> >>>> - a change in topology can be applied to any server in the
> >>>> topology, it will reach the respective servers and the plugin
> >>>> will act upon it
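
The "reject any removal that would disconnect the topology" condition can be sketched as a graph connectivity check. This is only an illustration, not the plugin's code: it assumes segments can be treated as undirected edges between server names, and the function names `is_connected` and `removal_disconnects` are hypothetical.

```python
from collections import deque


def is_connected(servers, segments):
    """Return True if every server can reach every other one via segments."""
    servers = set(servers)
    if len(servers) <= 1:
        return True
    # Build an adjacency map, ignoring segments that touch unknown servers.
    neighbours = {s: set() for s in servers}
    for a, b in segments:
        if a in neighbours and b in neighbours:
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Breadth-first search from an arbitrary server.
    start = next(iter(servers))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in neighbours[node] - seen:
            seen.add(peer)
            queue.append(peer)
    return seen == servers


def removal_disconnects(servers, segments, segment):
    """Return True if dropping `segment` would split the topology."""
    target = frozenset(segment)
    remaining = [s for s in segments if frozenset(s) != target]
    return not is_connected(servers, remaining)
```

With servers a, b, c chained as a-b-c, removing b-c isolates c and would be rejected; with an additional a-c segment the same removal is harmless.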
> >>>>
> >>>> Now there is a special case, causing a bit of trouble. If a
> >>>> replica is to be removed from the topology, this means that
> >>>> the replication agreements from and to this replica should be
> >>>> removed, and the server should be removed from the managed servers.
> >>>> The problem is that:
> >>>> - if you remove the server first, the server becomes unmanaged
> >>>> and removal of the segment will not trigger a removal of the
> >>>> replication agreement
> >>> Can you explain what you mean "if you remove the server first"
> >>> exactly ? What LDAP operation will be performed, by the management
> >>> tools ?
> >> as far as the plugin is concerned a removal of a replica triggers
> >> two operations:
> >> - removal of the host from the servers in cn=masters, so the
> >> server is no longer considered as managed
> >> - removal of the segment(s) connecting the to-be-removed replica
> >> to other still managed servers, which should remove the
> >> corresponding replication agreements.
> >> It was the order of these two operations I was talking about.
> > 
> > We can define a correct order, the plugin can refuse to do any other
> > order for direct operations (we need to be careful not to refuse
> > replication operations I think).
> > 
> >>>
> >>>> - if you remove the segments first, one segment will be the last
> >>>> one connecting this replica to the topology and removal will be
> >>>> rejected
> >>> We should never remove the segments first indeed.
> >> if we can fully control that only specific management tools can be
> >> used, we can define the order, but an admin could apply individual
> >> operations and still it would be good if nothing breaks
> > 
> > I think we had a plan to return UNWILLING_TO_PERFORM if the admin
> > tries to remove the last segment first. So we would have no problem
> > really, the admin can try and fail. If he wants to remove a master
> > he'll have to remove it from the masters group, and this will
> > trigger the removal of all segments.
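
The ordering described above can be sketched as a toy in-memory model. It is an illustration under stated assumptions, not the plugin's implementation: the `Topology` class and its methods are hypothetical, segments are treated as undirected, and the exception merely stands in for LDAP result code 53 (unwillingToPerform). Replicated operations, which should not be refused, are not modelled.

```python
class UnwillingToPerform(Exception):
    """Stands in for LDAP result code 53 (unwillingToPerform)."""


class Topology:
    """Hypothetical in-memory model of masters and the segments between them."""

    def __init__(self, masters, segments):
        self.masters = set(masters)
        self.segments = {frozenset(s) for s in segments}

    def _connected(self, segments):
        # Depth-first search over the given segments, limited to masters.
        if len(self.masters) <= 1:
            return True
        start = next(iter(self.masters))
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for seg in segments:
                if node in seg:
                    for peer in seg - {node}:
                        if peer in self.masters and peer not in seen:
                            seen.add(peer)
                            stack.append(peer)
        return seen == self.masters

    def remove_segment(self, a, b):
        """Direct removal: refused if it would disconnect a managed server."""
        seg = frozenset((a, b))
        if seg not in self.segments:
            raise KeyError(f"no segment between {a} and {b}")
        if not self._connected(self.segments - {seg}):
            raise UnwillingToPerform("removal would disconnect the topology")
        self.segments.remove(seg)

    def remove_master(self, name):
        """Primary trigger: dropping the master drops all of its segments."""
        self.masters.discard(name)
        self.segments = {s for s in self.segments if name not in s}
```

In this model an admin who tries `remove_segment` on the last link of a replica simply fails, while `remove_master` always succeeds and cleans up every segment that referenced the removed server, without a connectivity check.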
> > 
> >>>> Now, with some effort this can be resolved, e.g.
> >>>> if the server is removed, keep it internally as a removed server
> >>>> and for segments connecting this server trigger removal of
> >>>> replication agreements, or mark the last segment, when its removal
> >>>> is attempted, as pending and once the server is removed also remove
> >>>> the corresponding repl agreements
> >>> Why should we "keep it internally" ?
> >>> If you mark the agreements as managed by setting an attribute on
> >>> them, then you will never have any issue recognizing a "managed"
> >>> agreement in cn=config, and you will also immediately find out it
> >>> is "old" as it is not backed by a segment so you will safely
> >>> remove it.
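
The "managed marker" idea above can be sketched as a simple filter. Everything here is an assumption made for illustration: agreements are modelled as plain dicts, the `managed` key stands in for whatever attribute would actually be set on the agreement entry in cn=config, and the DNs are shortened placeholders.

```python
def stale_managed_agreements(agreements, segments, local_server):
    """Return DNs of managed agreements that no segment backs any more.

    agreements: list of dicts with hypothetical keys "dn", "host", "managed"
    segments:   iterable of (server, server) pairs from the shared tree
    """
    # Peers that the shared tree says this server should be connected to.
    backed_peers = set()
    for a, b in segments:
        if a == local_server:
            backed_peers.add(b)
        elif b == local_server:
            backed_peers.add(a)
    # A managed agreement towards a peer without a segment is "old".
    # Unmanaged agreements are never touched.
    return [
        agr["dn"]
        for agr in agreements
        if agr.get("managed") and agr["host"] not in backed_peers
    ]
```

For example, if the server keeps managed agreements to b and c but only the segment to b still exists, only the agreement to c is reported as safe to remove; an agreement without the marker is left alone regardless of segments.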
> >> I didn't want to add new flags/fields to the replication agreements
> >> as long as anything can be handled by the data in the shared tree.
> > 
> > We have to. I think it is a must or we will find numerous corner
> > cases. Is there a specific reason why you do not want to add flags
> > to replication agreements in cn=config ?
> > 
> >> "internally" was probably misleading, but I will think about it
> >> again
> > 
> > Ok, it is important we both understand what issues we see with any
> > of the possible approaches so we can agree on the best one.
> > 
> >>> Segments (and their agreements) should be removed as trigger on
> >>> the master entry getting removed. This should be done even if it
> >>> causes a split brain, because if the server is removed, no matter
> >>> how much we wish to keep topology integrity we effectively are
> >>> in a split brain situation, keeping topology agreements alive w/o
> >>> the server entry doesn't help.
> >> If we can agree on that, that presence/removal of masters is the
> >> primary trigger that's fine.
> > 
> > Yes I think we can definitely agree that this is the primary trigger
> > for server removal/addition.
> > 
> >> I was thinking of situations where a server was removed, 
> >> but not uninstalled.
> > 
> > Understood, but even then it makes no real difference, once the
> > server is removed from the group of masters it will not be able to
> > replicate outbound anymore as the other masters' ACIs will not
> > recognize this server's credentials as valid replicator creds.
> > 
> >> Just taking it out of the topology, but it could still be reached
> > 
> > It can be reached, and that may be a problem for clients. But in the
> > long term this should be true only for clients manually configured
> > to reach that server. Clients that use SRV records would see it
> > drop off, and switch to another one.
> > 
> > We may consider whether we want some automatism that causes the
> > server to shut itself down if it can't replicate (or if it receives
> > replication data from which it realizes it is out of the
> > topology). But this may be a little too drastic.
> > 
> >>>> But there is a problem, which I think is much harder and I am not
> >>>> sure how much effort I should put in resolving it.
> >>>> If we want to have the replication agreements cleaned up after
> >>>> removal of a replica without direct modification of cn=config, we
> >>>> need to follow the path above,
> >>>> but this also means that the last change needs to reach both the
> >>>> removed replica (R) and the last server(S) it is connected to.
> >>> It would be nice if the change reached the replica, indeed, but
> >>> not a big deal if it doesn't, if you are removing the replica it
> >>> means you are decommissioning it, so it is not really that
> >>> important that it receives updates, it will be destroyed shortly.
> >> That's what I was not sure about, couldn't there be cases where it
> >> is not destroyed, just isolated?
> > 
> > Why would you isolate a server ? Is there a legitimate case an admin
> > would want to do that ?
> 
> I know about one use case: Upgrade testing.
> 
> Recipe:
> - Install new replica.
> - Connect it to existing topology and suck in all the data.
> - Disconnect the new replica from rest of topology.
> - Do upgrade experiments.
> - Destroy the new/'experimental' replica.

You end up destroying it, so I do not see the problem :-)

Simo.

-- 
Simo Sorce * Red Hat, Inc * New York

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel