On 26/11/12 22:18, Alon Bar-Lev wrote:
>
>
> ----- Original Message -----
>> From: "Livnat Peer" <lp...@redhat.com>
>> To: "Shu Ming" <shum...@linux.vnet.ibm.com>
>> Cc: "Alon Bar-Lev" <abar...@redhat.com>, "VDSM Project Development"
>> <vdsm-devel@lists.fedorahosted.org>
>> Sent: Monday, November 26, 2012 2:57:19 PM
>> Subject: Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary
>>
>> On 26/11/12 03:15, Shu Ming wrote:
>>> Livnat,
>>>
>>> Thanks for your summary. I have comments below.
>>>
>>> 2012-11-25 18:53, Livnat Peer:
>>>> Hi All,
>>>> We have been discussing $subject for a while and I'd like to
>>>> summarize what we agreed and disagreed on thus far.
>>>>
>>>> The way I see it there are two related discussions:
>>>>
>>>> 1. Getting the VDSM networking stack to be distribution agnostic.
>>>> - We are all in agreement that the VDSM API should be generic
>>>> enough to incorporate multiple implementations. (Discussed on this
>>>> thread: Alon's suggestion, Mark's patch for adding support for
>>>> netcf, etc.)
>>>>
>>>> - We would like to maintain at least one implementation as the
>>>> working/up-to-date implementation for our users. This
>>>> implementation should be distribution agnostic (as we all
>>>> acknowledge this is an important goal for VDSM).
>>>> I also think that with the agreement of this community we can
>>>> choose to change our focus, from time to time, from one
>>>> implementation to another as we see fit (today it can be OVS+netcf,
>>>> and in a few months we'll use the quantum-based implementation if
>>>> we agree it is better).
>>>>
>>>> 2. The second discussion is about persisting the network
>>>> configuration on the host vs. dynamically retrieving it from a
>>>> centralized location like the engine. Danken raised a concern that
>>>> even if going with the dynamic approach, the host should persist
>>>> the management network configuration.
>>>
>>> About dynamically retrieving from a centralized location, when will
>>> the retrieving start? Just in the very early stage of host booting,
>>> before network functions? Or after the host startup, in the normal
>>> running state of the host? Before retrieving the configuration, how
>>> does the host network connect to the engine? I think we need a
>>> basic, well-known network between hosts and the engine first. Then,
>>> after the retrieving, hosts should reconfigure the network for
>>> later management. However, the timing to retrieve and reconfigure
>>> is challenging.
>>>
>>
>> We did not discuss the dynamic approach in detail on the list so
>> far, and I think this is a good opportunity to start this
>> discussion...
>>
>> From what was discussed previously I can say that the need for a
>> well-known network was raised by danken; it was referred to as the
>> management network. This network would be used for pulling the full
>> host network configuration from the centralized location, at this
>> point the engine.
>>
>> About the timing for retrieving the configuration, there are several
>> approaches. One of them was described by Alon, and I think he'll
>> join this discussion and maybe put it in his own words, but the idea
>> was to 'keep' the network synchronized at all times. When the host
>> has a communication channel to the engine and the engine detects a
>> mismatch in the host configuration, the engine initiates an 'apply
>> network configuration' action on the host.
>>
>> Using this approach we'll have a single path of code to maintain,
>> and that would reduce code complexity and bugs - that's quoting Alon
>> Bar-Lev (Alon, I hope I did not twist your words/idea).
>>
>> On the other hand, the above approach makes local tweaks on the host
>> (done manually by the administrator) much harder.
>>
>> Any other approaches?
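The 'keep the network synchronized at all times' idea above could be sketched roughly as follows. This is only an illustration of the reconciliation concept; `diff_config`, `reconcile` and the data shapes are hypothetical, not actual engine or VDSM API calls.

```python
# Hypothetical sketch: whenever the host reports its network
# configuration, the engine compares it with the desired configuration
# it stores and, on any mismatch, pushes the full desired
# configuration back to the host (a single code path, always
# applied as-is).

def diff_config(desired, reported):
    """Return the set of network names whose settings differ."""
    names = set(desired) | set(reported)
    return {n for n in names if desired.get(n) != reported.get(n)}

def reconcile(desired, reported, apply_cb):
    """If the reported config drifted from the desired one, re-apply it."""
    drift = diff_config(desired, reported)
    if drift:
        apply_cb(desired)  # push the whole desired configuration
    return drift

# Example: the host reports a stray bridge and a wrong VLAN.
desired = {'ovirtmgmt': {'nic': 'eth0', 'bridged': True},
           'storage': {'nic': 'eth1', 'vlan': 100}}
reported = {'ovirtmgmt': {'nic': 'eth0', 'bridged': True},
            'storage': {'nic': 'eth1', 'vlan': 101},
            'tmpbridge': {'nic': 'eth2'}}

applied = []
drift = reconcile(desired, reported, applied.append)
# drift names the out-of-sync networks; applied holds the pushed config
```

The point of the sketch is that the engine never computes a delta to send: detection is a diff, but the action is always "apply the full desired configuration".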
>>
>> I'd like to add a more general question to the discussion: what are
>> the advantages of taking the dynamic approach?
>> So far I collected two reasons:
>>
>> - It is a 'cleaner' design: it removes complexity from the VDSM
>> code, is easier to maintain going forward, and is less bug prone (I
>> agree with that one, as long as we keep the configuration-retrieval
>> mechanism/algorithm simple).
>>
>> - It adheres to the idea of having a stateless hypervisor - some
>> more input on this point would be appreciated.
>>
>> Any other advantages?
>>
>> discussing the benefits of having the persisted
>>
>> Livnat
>>
>
> Sorry for the delay. Some more expansion.
>
> ASSUMPTION
>
> After boot, a host running vdsm is able to receive communication
> from the engine. This means that the host has a legitimate layer 2
> configuration and a layer 3 configuration for the interface used to
> communicate with the engine.
>
> MISSION
>
> Reduce the complexity of the implementation, so that only one
> algorithm is used in order to reach an operative state as far as
> networking is concerned.
>
> (Storage is extremely similar; I can s/network/storage/ and still be
> relevant.)
>
For reaching the mission above we can also use the approach suggested
by Adam: start from a clean configuration and execute setupNetwork to
set the host networking configuration. In Adam's proposal VDSM itself
issues the setupNetwork, and in your approach the engine does.

> DESIGN FOCAL POINT
>
> A host running vdsm is a complete slave of its master, be it
> ovirt-engine or another engine.
>
> Having a complete slave eases implementation:
>
> 1. The master always applies the settings as-is.
> 2. No need to consider slave state.
> 3. No need to implement AI to reach from unknown state X to known
> state Y + delta.
> 4. After reboot (or fence) the host is always in a known state.
>
> ALGORITHM
>
> A. Given communication to vdsm, construct the required vlan, bonding
> and bridge setup on the machine.
>
> B. Reboot/Fence - the host is reset, apply A.
>
> C. Network configuration is changed at the engine:
> (1) Drop all resources that are not used by active VMs.

I'm not sure what you mean by the above - drop all resources *not*
used by VMs?

> (2) Apply A.
>
> D. Host in maintenance - network configuration can be changed; it
> will be applied when the host goes active, apply C (no resources are
> used by VMs, all resources are dropped).
>
> E. Critical network is down (host Non Operational) - network
> configuration is not changed.
>
> F. Host unreachable (Non Responsive) - network configuration cannot
> be changed.

What happens if we have a host that is added to the engine (or used
to be non-operational and now returns to up) and reports a network
configuration different from what is configured in the engine?

> BENEFITS
>
> A single deterministic algorithm to apply network configuration.
>
> A pre-defined state after host reboot/fence: the host is always
> reachable, and a previous network configuration that may be
> malformed is not in effect.
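Steps A-C of the algorithm above could be sketched as follows. The function and data names are illustrative only (they are not the actual VDSM verbs); the sketch just separates step C.1 (tear down networks not used by active VMs) from the unconditional re-apply of step A.

```python
# Illustrative sketch of algorithm steps A-C: on a configuration
# change the host first tears down every network not attached to an
# active VM, then re-applies the engine's desired configuration as-is.

def apply_engine_config(current, desired, nets_in_use):
    """Return (teardown, setup) action lists for the host to execute.

    current/desired map network name -> settings dict; nets_in_use
    are networks attached to active VMs (step C.1 keeps those up).
    """
    teardown = sorted(n for n in current if n not in nets_in_use)
    setup = [(n, desired[n]) for n in sorted(desired)]
    return teardown, setup

current = {'ovirtmgmt': {'nic': 'eth0'},
           'old_net': {'nic': 'eth2'}}
desired = {'ovirtmgmt': {'nic': 'eth0'},
           'vm_net': {'nic': 'eth1', 'vlan': 20}}

teardown, setup = apply_engine_config(current, desired,
                                      nets_in_use={'ovirtmgmt'})
# teardown drops 'old_net'; setup re-applies the full desired config
```

Note that `setup` always contains the complete desired configuration, matching point 3 of the design (no state-X-to-state-Y delta computation on the host).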
>
> Easy to integrate with various network management solutions, be it a
> primitive iproute/brctl implementation, NetworkManager, OVS or any
> other configuration: as Linux is Linux is Linux, the way to interact
> with the kernel is single, while persisting the configuration
> requires interacting with the distribution.
>
> Moreover, a stateless implementation may be integrated with a larger
> set of network management tools, as no assumption of persistence is
> added to the requirements; so if OVS is non-persistent, we use it
> as-is.
>
> We should aspire to reach a state in which ovirt-node or any other
> similar solution is totally stateless: adding a new node to a
> cluster should be some blade rebooting from PXE. With each
> persistence layer we drop, the closer we get to managing a large
> data center built on a huge number of machines going up/down as
> required, joining different clusters.
>
> While discussing clusters, we should also consider autonomic
> clusters that enforce policy even if ovirt-engine is unreachable. In
> this mode we would like a primitive manager to be able to enforce
> policy, including networking, while allowing adding/removing nodes
> without performing any local configuration.
>
> IMPLICATIONS
>
> The system administrator will not be allowed to modify 'by hand' any
> of the network settings (except for the basic engine reachability).
>
> Special settings can be set in the master, which will apply them via
> the master->vdsm protocol, which in turn uses the network management
> interface to push them. This method should be generic enough to
> allow pushing most of the allowed configuration settings
> (key=value). This approach will also help replacing/adding nodes in
> a cluster and/or mass deployment.
>
> Edge conditions can be handled by executing some script on the host
> machine, allowing the administrator to override network
> configuration upon a network configuration event.
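The 'execute some script on the host machine' escape hatch mentioned above could be sketched as follows. This is a guess at the mechanism, loosely modeled on hook-directory conventions; the `/etc/vdsm/network-hooks.d` path and the `NETCFG_*` environment variable names are hypothetical, not an existing VDSM interface.

```python
# Sketch of the local-override hook: before applying pushed key=value
# settings, run every executable script found in a hooks directory,
# passing the settings through the environment so an administrator's
# script can react to the network configuration event. The directory
# path and variable naming are assumptions for illustration.

import os
import subprocess

HOOK_DIR = '/etc/vdsm/network-hooks.d'  # hypothetical path

def hook_env(settings):
    """Build the environment passed to hook scripts."""
    env = dict(os.environ)
    env.update({'NETCFG_%s' % k.upper(): str(v)
                for k, v in settings.items()})
    return env

def run_hooks(settings, hook_dir=HOOK_DIR):
    """Run each executable hook in lexical order; return the settings."""
    if not os.path.isdir(hook_dir):
        return settings  # no local overrides installed
    env = hook_env(settings)
    for name in sorted(os.listdir(hook_dir)):
        path = os.path.join(hook_dir, name)
        if os.access(path, os.X_OK):
            subprocess.check_call([path], env=env)
    return settings
```

Running the hooks in sorted order makes the override behavior deterministic, which fits the thread's goal of a single predictable algorithm even when local tweaks are allowed.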
>
> SUMMARY
>
> Treating the host running vdsm as a complete, stateless slave will
> enable us to provide better control over that host in the short and
> long run.
>
> Manual intervention on hosts serving as hypervisors has flexibility
> as its argument. However, at mass deployment, in a large data center
> or in a dynamic environment, this flexibility becomes a liability.
>
> Thank you,
> Alon Bar-Lev
> _______________________________________________
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel