On Thu, 11 Dec 2014 17:03:55 +0100 Petr Spacek <pspa...@redhat.com> wrote:
> On 11.12.2014 15:20, Simo Sorce wrote:
> > On Thu, 11 Dec 2014 14:18:36 +0100
> > Ludwig Krispenz <lkris...@redhat.com> wrote:
> >
> >>
> >> On 12/05/2014 04:50 PM, Simo Sorce wrote:
> >>> On Thu, 04 Dec 2014 14:33:09 +0100
> >>> Ludwig Krispenz <lkris...@redhat.com> wrote:
> >>>
> >>>> hi,
> >>>>
> >>>> I just have another (hopefully this will end soon) issue I want
> >>>> to get your input on. (please read to the end first)
> >>>>
> >>>> To recap the conditions:
> >>>> - the topology plugin manages the connections between servers as
> >>>>   segments in the shared tree
> >>>> - it is authoritative for managed servers, e.g. it controls all
> >>>>   connections between servers listed under cn=masters,
> >>>>   it is permissive for connections to other servers
> >>>> - it rejects any removal of a segment which would disconnect the
> >>>>   topology.
> >>>> - a change in topology can be applied to any server in the
> >>>>   topology, it will reach the respective servers and the plugin
> >>>>   will act upon it
> >>>>
> >>>> Now there is a special case, causing a bit of trouble. If a
> >>>> replica is to be removed from the topology, this means that
> >>>> the replication agreements from and to this replica should be
> >>>> removed, and the server should be removed from the managed servers.
> >>>> The problem is that:
> >>>> - if you remove the server first, the server becomes unmanaged
> >>>>   and removal of the segment will not trigger a removal of the
> >>>>   replication agreement
> >>> Can you explain what you mean by "if you remove the server first"
> >>> exactly ? What LDAP operation will be performed by the management
> >>> tools ?
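The rule quoted above, that the plugin rejects removing a segment when doing so would disconnect the topology, can be pictured as a plain graph connectivity check. The following is only an illustrative sketch in Python, not the plugin's code: the function names are made up for the example, and segments are assumed to be undirected pairs of host names.

```python
from collections import defaultdict


def is_connected(servers, segments):
    """Return True if every server can reach every other one via segments."""
    servers = set(servers)
    if len(servers) <= 1:
        return True
    # Build an undirected adjacency map from the (host, host) pairs.
    neighbours = defaultdict(set)
    for a, b in segments:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Depth-first walk from an arbitrary server.
    start = next(iter(servers))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for peer in neighbours[node]:
            if peer in servers and peer not in seen:
                seen.add(peer)
                stack.append(peer)
    return seen == servers


def may_remove_segment(servers, segments, segment):
    """Hypothetical check: allow removal only if the topology stays connected."""
    remaining = [s for s in segments if frozenset(s) != frozenset(segment)]
    return is_connected(servers, remaining)
```

With three servers in a line (a-b, b-c), removing b-c would be refused because c would be cut off; with an extra a-c segment the same removal would be allowed.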
> >> as far as the plugin is concerned a removal of a replica triggers
> >> two operations:
> >> - removal of the host from the servers in cn=masters, so the
> >>   server is no longer considered as managed
> >> - removal of the segment(s) connecting the to-be-removed replica
> >>   to other still managed servers, which should remove the
> >>   corresponding replication agreements.
> >> It was the order of these two operations I was talking about
> >
> > We can define a correct order, the plugin can refuse to do any other
> > order for direct operations (we need to be careful not to refuse
> > replication operations I think).
> >
> >>>
> >>>> - if you remove the segments first, one segment will be the last
> >>>>   one connecting this replica to the topology and removal will be
> >>>>   rejected
> >>> We should never remove the segments first indeed.
> >> if we can fully control that only specific management tools can be
> >> used, we can define the order, but an admin could apply individual
> >> operations and still it would be good if nothing breaks
> >
> > I think we had a plan to return UNWILLING_TO_PERFORM if the admin
> > tries to remove the last segment first. So we would have no problem
> > really, the admin can try and fail. If he wants to remove a master
> > he'll have to remove it from the masters group, and this will
> > trigger the removal of all segments.
> >
> >>>> Now, with some effort this can be resolved, e.g.
> >>>> if the server is removed, keep it internally as a removed server
> >>>> and for segments connecting this server trigger removal of
> >>>> replication agreements, or mark the last segment, when tried to
> >>>> remove, as pending and once the server is removed also remove the
> >>>> corresponding repl agreements
> >>> Why should we "keep it internally" ?
> >>> If you mark the agreements as managed by setting an attribute on
> >>> them, then you will never have any issue recognizing a "managed"
> >>> agreement in cn=config, and you will also immediately find out it
> >>> is "old" as it is not backed by a segment so you will safely
> >>> remove it.
> >> I didn't want to add new flags/fields to the replication agreements
> >> as long as anything can be handled by the data in the shared tree.
> >
> > We have to. I think it is a must or we will find numerous corner
> > cases. Is there a specific reason why you do not want to add flags
> > to replication agreements in cn=config ?
> >
> >> "internally" was probably misleading, but I will think about it
> >> again
> >
> > Ok, it is important we both understand what issues we see with any
> > of the possible approaches so we can agree on the best one.
> >
> >>> Segments (and their agreements) should be removed as a trigger on
> >>> the master entry getting removed. This should be done even if it
> >>> causes a split brain, because if the server is removed, no matter
> >>> how much we wish to keep topology integrity we effectively are
> >>> in a split brain situation, keeping topology agreements alive w/o
> >>> the server entry doesn't help.
> >> If we can agree on that, that presence/removal of masters is the
> >> primary trigger, that's fine.
> >
> > Yes I think we can definitely agree that this is the primary trigger
> > for server removal/addition.
> >
> >> I was thinking of situations where a server was removed,
> >> but not uninstalled.
> >
> > Understood, but even then it makes no real difference, once the
> > server is removed from the group of masters it will not be able to
> > replicate outbound anymore as the other masters' ACIs will not
> > recognize this server's credentials as valid replicator creds.
> >
> >> Just taking it out of the topology, but it could still be reached
> >
> > It can be reached, and that may be a problem for clients.
> > But in the long term this should be true only for clients manually
> > configured to reach that server. Clients that use SRV records would
> > see it drop off, and switch to another one.
> >
> > We may consider whether we want some automatism that causes the
> > server to shut itself down if it can't replicate (or receives
> > replication data to the effect it realizes it is out of the
> > topology). But this may be a little too drastic.
> >
> >>>> But there is a problem, which I think is much harder and I am not
> >>>> sure how much effort I should put in resolving it.
> >>>> If we want to have the replication agreements cleaned up after
> >>>> removal of a replica without direct modification of cn=config, we
> >>>> need to follow the path above,
> >>>> but this also means that the last change needs to reach both the
> >>>> removed replica (R) and the last server (S) it is connected to.
> >>> It would be nice if the change reached the replica, indeed, but
> >>> not a big deal if it doesn't, if you are removing the replica it
> >>> means you are decommissioning it, so it is not really that
> >>> important that it receives updates, it will be destroyed shortly.
> >> That's what I was not sure about, couldn't there be cases where it
> >> is not destroyed, just isolated.
> >
> > Why would you isolate a server ? Is there a legitimate case an admin
> > would want to do that ?
>
> I know about one use case: Upgrade testing.
>
> Recipe:
> - Install new replica.
> - Connect it to existing topology and suck in all the data.
> - Disconnect the new replica from rest of topology.
> - Do upgrade experiments.
> - Destroy the new/'experimental' replica.

You end up destroying it, so I do not see the problem :-)

Simo.

-- 
Simo Sorce * Red Hat, Inc * New York
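
As an illustration of the approach discussed in this thread (removal of the master entry is the primary trigger, and agreements flagged as managed but no longer backed by a segment can be cleaned up), here is a minimal Python sketch. It is not the plugin's implementation: the function names, the dictionary layout, and the "managed" flag are all hypothetical, since the thread does not name the actual attribute.

```python
def remove_master(masters, segments, host):
    """Remove host from the masters and drop every segment touching it.

    Assumption: segments are (host, host) pairs; the removal is applied
    even if the remaining topology ends up disconnected.
    """
    remaining_masters = set(masters) - {host}
    remaining_segments = [s for s in segments if host not in s]
    return remaining_masters, remaining_segments


def stale_agreements(agreements, segments):
    """Return managed agreements that are no longer backed by a segment.

    Assumption: each agreement is a dict with "local", "remote" and a
    hypothetical boolean "managed" flag. Unmanaged agreements are never
    reported, so connections to other servers are left alone.
    """
    backed = {frozenset(s) for s in segments}
    return [
        a for a in agreements
        if a["managed"] and frozenset((a["local"], a["remote"])) not in backed
    ]
```

On a server b with managed agreements to a and c, removing master c drops the b-c segment, and only the b-to-c agreement is reported as stale; an unmanaged agreement to some other server is not touched.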