On 24.2.2016 15:25, Simo Sorce wrote:
> On Wed, 2016-02-24 at 10:00 +0100, Martin Kosek wrote:
>> On 02/23/2016 06:59 PM, Petr Spacek wrote:
>>> On 23.2.2016 18:14, Simo Sorce wrote:
>> ...
>>>> More seriously, I think it is a great idea, but too premature to get all
>>>> the way there now. We need to build schema and CLI that will allow us to
>>>> get there without having to completely change interfaces if at all
>>>> possible, or at least minimizing any disruption in the tools.
>>>
>>> Actually, backwards compatibility is the main worry which led to this
>>> idea with links.
>>>
>>> If we release the first version of locations with custom priorities etc.,
>>> we will have to support the schema (which will be different) and API
>>> (which will later be unnecessary) forever.
>>>
>>> If we skip this intermediate phase with hand-made configuration, we can
>>> save all the headache with upgrades to a more automatic solution later on.
>>>
>>> Maybe we should invert the order:
>>> start with locations + links with an administrative metric and add
>>> hand-tweaking capabilities later (if necessary).
>>>
>>> IMHO locations + links with an administrative metric will be easier to
>>> implement than the first version.
>>>
>>> Just thinking aloud ...
>>
>> Makes sense to me. I would have the same worry as Petr, that we would
>> break something if we decide to move to a link-based solution later.
>
> Maybe I am missing something, but in order to generate the proper SRV
> records we need priorities and weights anyway, either by entering them
> manually or by autogenerating them from some other piece of information
> in the framework. So, given this information is needed anyway, why would
> it become a problem to retain it in the future if we provide a tool that
> simply autogenerates this information?
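(For reference on the quoted point: priority and weight are exactly the two knobs an SRV record carries per RFC 2782 - clients contact the lowest-priority group first and use weight to load-balance within one priority group. The names below are purely illustrative:)

```
; _service._proto.owner   TTL   class type priority weight port target
_ldap._tcp.example.com.   86400 IN    SRV  0        100    389  ipa1.example.com.
_ldap._tcp.example.com.   86400 IN    SRV  10       100    389  ipa2.example.com.
```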
Let me clarify this: you are right, in the end we always somehow get to
priorities and weights.

TL;DR version
=============
The difference is in the subtle details of how we obtain priorities, and
whether we store them in LDAP and represent them in the API (or not). It
will simplify things if we do not expose them, and I'm not convinced that
we *need* to expose them in the first round.

TL version
==========
At a high level the process is always as follows:

1. input tuples (location, server, weight) for all primary servers
   assigned to locations
2. input or derive (location, server, priority) for all backups
3. generate SRV records using priority groups combined from the previous
   two steps

Now we are trying to decide whether step (2) should "input" or "derive"
priorities for backup servers.

Variants
~~~~~~~~

Variant A
---------
If we let the user do everything manually (no links etc.), we need to
provide the following schema + API + user interface:

[first step - same in both variants]
* create locations
* assign 'main' (aka 'primary' aka 'home') servers to locations
  ++ specify weights for the 'main' servers in the given location, i.e.
     manually input (server, weight) tuples

[second step]
* specify backup servers for each location
  ++ assign (server, priority, weight) information to each non-main server
  ++ for S servers and L locations we need to represent up to S * L tuples
     (server, priority, weight) and provide means to manage them
  ++ most importantly, the maintenance complexity of backups grows every
     time you add a server OR a location
  ++ this would be a nightmare to manage. For simple cases this requires
     some 'include' mechanism to declare one location as a backup for
     another location. This include complicates things significantly, as
     it has a lot of corner cases and requires a different LDAP schema
     compared to direct server assignment.
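Step (3) above is the same in every variant. A minimal sketch of how priority groups could be rendered as SRV records - the record-name layout, port, and host names are all invented for illustration and are not FreeIPA's actual implementation:

```python
# Sketch of step (3): render (server, priority, weight) tuples for one
# location as LDAP SRV records. Lower priority is tried first; weight
# load-balances within one priority group (RFC 2782 semantics).

def srv_records(location, servers, domain="example.com", port=389):
    """servers: list of (hostname, priority, weight) tuples."""
    owner = "_ldap._tcp.%s._locations.%s." % (location, domain)
    return ["%s IN SRV %d %d %d %s." % (owner, prio, weight, port, host)
            for host, prio, weight in sorted(servers, key=lambda s: s[1])]

# 'main' servers of the location get priority 0; entered or derived
# backups end up in higher (= less preferred) priority groups.
for rr in srv_records("brno", [("ipa1.example.com", 0, 100),
                               ("ipa2.example.com", 0, 50),
                               ("ipa3.example.com", 10, 100)]):
    print(rr)
```

The only per-variant difference is where the backup tuples come from: typed in by the admin (variant A) or computed from link costs (variant B).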
Variant B
---------
If we let the user specify only locations + links with costs, we need to
provide the following schema + API + user interface:

[first step - no change from variant A]
* create locations
* assign 'main' (aka 'primary' aka 'home') servers to locations
  ++ specify weights for the 'main' servers in the given location, i.e.
     manually input (server, weight) tuples

[second step]
* create links between locations
  ++ manually assign point-to-point information + an administrative cost
  ++ for S servers and L locations we need to represent up to L^2 tuples
     (from, to, cost) and provide means to manage them
  ++ storage can be optimized to a great extent if there are many links
     with equal cost; typically a full-mesh interconnection can be
     represented by a single object in LDAP
* generate backups (i.e. priority assignment) using the usual routing
  algorithms. Priority needs to be neither exposed to the user nor stored
  in LDAP at all.
  ++ most importantly, the maintenance complexity of backups grows as you
     add locations, *but* you do not need to manually go through the
     backup configuration of (potentially) all locations every time you
     add/change/remove servers in existing locations (which you have to do
     with variant A, unless you use some smart includes ...)

Please note that variant B with (links, costs) does not use explicit
priority specification at all, as priorities are always calculated by an
algorithm. If we ever decide to provide means to hand-tweak generated
priorities, we can still invent LDAP schema and API for variant A and
populate it with data generated by the variant B algorithm, but we do not
need to do that today. Less schema and less API -> smaller maintenance
costs.

Does this clarify why the (link, cost) model is easier for the end user to
manage than (server, priority, weight)?

Variant C
---------
An alternative is to be lazy and dumb. Maybe it would be enough for the
first round ...
We would retain

[first step - no change from variant A]
* create locations
* assign 'main' (aka 'primary' aka 'home') servers to locations
  ++ specify weights for the 'main' servers in the given location, i.e.
     manually input (server, weight) tuples

Then backups would be an auto-generated set of all remaining servers from
all other locations.

Additional storage complexity: 0

This easily covers the scenario "always prefer local servers and use
remote ones only as a fallback". It does not cover any other scenario. It
might be sufficient for the first run, though, and would allow us to
gather some feedback from the field. Right now I'm inclined towards this
variant :-)

Bonus
=====
Variant B with links has some fancy properties; here are a few for the
curious:

* Speaking of storage, there is an interesting consequence.
  Assumption: (S = number of servers) >= (L = number of locations)
  Variant A complexity: S * L
  Variant B complexity: L * L
  => S * L >= L * L
  => variant A complexity >= variant B complexity
  This holds for the usual cases where all servers within one location
  have the same priority.

* Other cases can be represented in variant B using a new location and
  appropriate link costs. Variant B requires splitting hot backups into a
  separate location like "CZ-hot-backup", which is then easy to display in
  a topology graph etc.

* Coincidentally, variant B allows fancy things like empty locations which
  are used only for routing. This nicely describes the situation where all
  branch offices have their own local servers and are connected to a VPN
  concentrator somewhere in the middle of a continent. E.g. declare a
  'hub' location with no IPA servers in it, then create a link
  (branch, hub, cost) for each branch office. This trivial configuration
  would automatically allow computing backups for branch1 in an optimal
  way, where clients from branch1 prefer branch2 over branch3 because
  branch3 has a crappy VPN link.
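The hub scenario above can be computed with any textbook shortest-path algorithm. A minimal sketch using plain Dijkstra - all location names and costs are invented for illustration, and none of this is actual FreeIPA code:

```python
# Sketch of variant B's "derive" step: run Dijkstra over the
# (from, to, cost) links and order locations by path cost. The 'hub'
# location holds no IPA servers; it exists only for routing.
import heapq
from collections import defaultdict

def distances(links, source):
    """links: (from, to, cost) tuples, treated as bidirectional.
    Returns the cheapest total cost from source to every location."""
    graph = defaultdict(list)
    for a, b, cost in links:
        graph[a].append((b, cost))
        graph[b].append((a, cost))
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, loc = heapq.heappop(queue)
        if d > dist.get(loc, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph[loc]:
            if d + cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = d + cost
                heapq.heappush(queue, (d + cost, neighbour))
    return dist

links = [("branch1", "hub", 10),
         ("branch2", "hub", 10),
         ("branch3", "hub", 50)]   # branch3 has the crappy VPN link
d = distances(links, "branch1")
# Ordering the other locations by path cost yields branch1's backup
# priority groups; the numeric priorities never have to be stored in LDAP.
backups = sorted((loc for loc in d if loc not in ("branch1", "hub")),
                 key=d.get)
print(backups)  # branch2 (cost 20) before branch3 (cost 60)
```

Note that the hub contributes only path costs: because it contains no servers, it is excluded from the backup list itself.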
If you reached this point without skipping anything, you deserve some
reward points. Let me know :-D

--
Petr^2 Spacek

--
Manage your subscription for the Freeipa-devel mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-devel
Contribute to FreeIPA: http://www.freeipa.org/page/Contribute/Code