Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

Alon Bar-Lev Tue, 27 Nov 2012 02:38:40 -0800


----- Original Message -----
> From: "Livnat Peer" <lp...@redhat.com>
> To: "Alon Bar-Lev" <alo...@redhat.com>
> Cc: "VDSM Project Development" <vdsm-devel@lists.fedorahosted.org>, "Shu 
> Ming" <shum...@linux.vnet.ibm.com>, "Saggi
> Mizrahi" <smizr...@redhat.com>, "Dan Kenigsberg" <dan...@redhat.com>
> Sent: Tuesday, November 27, 2012 12:18:31 PM
> Subject: Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary
> 
> On 26/11/12 22:18, Alon Bar-Lev wrote:
> > 
> > 
> > ----- Original Message -----
> >> From: "Livnat Peer" <lp...@redhat.com>
> >> To: "Shu Ming" <shum...@linux.vnet.ibm.com>
> >> Cc: "Alon Bar-Lev" <abar...@redhat.com>, "VDSM Project
> >> Development" <vdsm-devel@lists.fedorahosted.org>
> >> Sent: Monday, November 26, 2012 2:57:19 PM
> >> Subject: Re: [vdsm] Future of Vdsm network configuration - Thread
> >> mid-summary
> >>
> >> On 26/11/12 03:15, Shu Ming wrote:
> >>> Livnat,
> >>>
> >>> Thanks for your summary.  I got comments below.
> >>>
> >>> 2012-11-25 18:53, Livnat Peer:
> >>>> Hi All,
> >>>> We have been discussing $subject for a while and I'd like to
> >>>> summarize what we agreed and disagreed on thus far.
> >>>>
> >>>> The way I see it there are two related discussions:
> >>>>
> >>>>
> >>>> 1. Getting VDSM networking stack to be distribution agnostic.
> >>>> - We are all in agreement that VDSM API should be generic enough
> >>>> to
> >>>> incorporate multiple implementation. (discussed on this thread:
> >>>> Alon's
> >>>> suggestion, Mark's patch for adding support for netcf etc.)
> >>>>
> >>>> - We would like to maintain at least one implementation as the
> >>>> working/up-to-date implementation for our users, this
> >>>> implementation
> >>>> should be distribution agnostic (as we all acknowledge this is
> >>>> an
> >>>> important goal for VDSM).
> >>>> I also think that with the agreement of this community we can
> >>>> choose to
> >>>> change our focus, from time to time, from one implementation to
> >>>> another
> >>>> as we see fit (today it can be OVS+netcf and in a few months
> >>>> we'll
> >>>> use
> >>>> the quantum based implementation if we agree it is better)
> >>>>
> >>>> 2. The second discussion is about persisting the network
> >>>> configuration
> >>>> on the host vs. dynamically retrieving it from a centralized
> >>>> location
> >>>> like the engine. Danken raised a concern that even if going with
> >>>> the
> >>>> dynamic approach the host should persist the management network
> >>>> configuration.
> >>>
> >>> About dynamically retrieving from a centralized location: when
> >>> will the retrieval start? In the very early stage of host boot,
> >>> before the network functions? Or after host startup, in the normal
> >>> running state of the host? Before retrieving the configuration, how
> >>> does the host connect to the engine? I think we need a basic,
> >>> well-known network between the hosts and the engine first. Then,
> >>> after the retrieval, hosts should reconfigure the network for later
> >>> management. However, the timing of the retrieval and the
> >>> reconfiguration is challenging.
> >>>
> >>
> >> We did not discuss the dynamic approach in details on the list so
> >> far
> >> and I think this is a good opportunity to start this discussion...
> >>
> >> From what was discussed previously I can say that the need for a
> >> well
> >> known network was raised by danken, it was referred to as the
> >> management
> >> network, this network would be used for pulling the full host
> >> network
> >> configuration from the centralized location, at this point the
> >> engine.
> >>
> >> About the timing for retrieving the configuration, there are
> >> several
> >> approaches. One of them was described by Alon, and I think he'll
> >> join
> >> this discussion and maybe put it in his own words, but the idea
> >> was
> >> to
> >> 'keep' the network synchronized at all times. When the host has a
> >> communication channel to the engine and the engine detects a
> >> mismatch in the host configuration, the engine initiates an 'apply
> >> network configuration' action on the host.
> >>
> >> Using this approach we'll have a single path of code to maintain
> >> and
> >> that would reduce code complexity and bugs - That's quoting Alon
> >> Bar-Lev (Alon, I hope I did not twist your words/idea).
> >>
> >> On the other hand the above approach makes local tweaks on the
> >> host
> >> (done manually by the administrator) much harder.
> >>
> >> Any other approaches ?
> >>
> >> I'd like to add a more general question to the discussion what are
> >> the
> >> advantages of taking the dynamic approach?
> >> So far I collected two reasons:
> >>
> >> -It is a 'cleaner' design, removes complexity on VDSM code, easier
> >> to
> >> maintain going forward, and less bug prone (I agree with that one,
> >> as
> >> long as we keep the retrieving configuration mechanism/algorithm
> >> simple).
> >>
> >> -It adheres to the idea of having a stateless hypervisor - some
> >> more
> >> input on this point would be appreciated
> >>
> >> Any other advantages?
> >>
> >> discussing the benefits of having the persisted
> >>
> >> Livnat
> >>
> > 
> > Sorry for the delay. Some more expansion.
> > 
> > ASSUMPTION
> > 
> > After boot a host running vdsm is able to receive communication
> > from engine.
> > This means that host has legitimate layer 2 configuration and layer
> > 3 configuration for the interface used to communicate to engine.
> > 
> > MISSION
> > 
> > Reduce complexity of implementation, so that only one algorithm is
> > used in order to reach an operative state as far as networking is
> > concerned.
> > 
> > (Storage is extremely similar I can s/network/storage/ and still be
> > relevant).
> > 
> 
> For reaching the mission above we can also use the approach suggested
> by Adam: start from a clean configuration and execute setup network to
> set the host networking configuration. In Adam's proposal VDSM itself
> issues the setupNetwork, and in your approach the engine does.


Right. We can do this in 100+ ways; the question is which implementation will
be the simplest.

> 
> 
> > DESIGN FOCAL POINT
> > 
> > Host running vdsm is a complete slave of its master, be it
> > ovirt-engine or another engine.
> > 
> > Having a complete slave eases implementation:
> > 
> >  1. Master always applies the settings as-is.
> >  2. No need to consider slave state.
> >  3. No need to implement AI to get from unknown state X to known
> >  state Y + delta.
> >  4. After reboot (or fence) host is always in a known state.
> > 
> > ALGORITHM
> >  
> > A. Given communication to vdsm, construct required vlan, bonding,
> > bridge setup on machine.
> > 
> > B. Reboot/Fence - host is reset, apply A.
> > 
> > C. Network configuration is changed at engine:
> >   (1) Drop all resources that are not used by active VMs.
> 
> I'm not sure what you mean by the above, drop all resources *not*
> used
> by VMs?

Let's say we have a running VM using bridge bridge1.
We cannot modify bridge1 as long as the VM is operative.
So we drop all network configuration except for bridge1, to allow the VM to
survive the upgrade.

I was tempted to write something else, but I did not want to alarm people...
But... when network configuration is changed on a host with running VMs, first
move the VMs to a different host, then recycle the configuration (simplest:
reboot).

> >   (2) Apply A.
> > 
> > D. Host in maintenance - network configuration can be changed and
> > will be applied when the host becomes active; apply C (no resources
> > are used by VMs, so all resources are dropped).
> > 
> > E. Critical network is down (Host not operative) - network
> > configuration is not changed.
> > 
> > F. Host unreachable (Non Responsive) - network configuration
> > cannot be changed.
> > 
> 
> What happens if we have a host that is added to the engine (or used
> to
> be non-operational and now returns to up) and reports a network
> configuration different than what is configured in the engine?

This is a sign of a totally malicious node!
It is a trigger for fencing, i.e. actively rebooting the host.
Can you please describe a valid sequence in which this can happen?
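
To make step C and the mismatch check a bit more concrete, here is a minimal
sketch in Python. It is only an illustration under assumptions: NetResource,
plan_reconfiguration and needs_fencing are hypothetical names, not existing
vdsm or engine APIs, and network state is reduced to a plain set of named
devices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetResource:
    """Hypothetical description of one host network device."""
    name: str  # e.g. "bridge1", "bond0", "eth0.100"
    kind: str  # "bridge", "bond" or "vlan"


def plan_reconfiguration(current, desired, in_use_by_vms):
    """Sketch of step C: drop what active VMs do not use, then apply A.

    current       -- set of NetResource present on the host now
    desired       -- set of NetResource the engine wants on the host
    in_use_by_vms -- names of devices that active VMs depend on
    Returns (to_drop, to_create). Devices used by active VMs are left
    untouched, so they appear in neither set.
    """
    to_drop = {r for r in current if r.name not in in_use_by_vms}
    to_create = {r for r in desired if r.name not in in_use_by_vms}
    return to_drop, to_create


def needs_fencing(reported, desired):
    """Assumed policy: a host that should be in sync but reports a
    different configuration is treated as faulty and gets fenced."""
    return set(reported) != set(desired)
```

With a VM running on bridge1, the plan leaves bridge1 alone and recycles
everything else; with no active VMs (host in maintenance, step D) the same
function drops everything and rebuilds from the engine's definition, so there
is still only one code path.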

> 
> > BENEFITS
> > 
> > Single deterministic algorithm to apply network configuration.
> > 
> > Pre-defined state after host reboot/fence, host always reachable,
> > previous network configuration that may be malformed is not in
> > effect.
> > 
> > Easy to integrate with various network management solutions, be it
> > a primitive iproute/brctl implementation, NetworkManager, OVS or
> > any other configuration. Linux is Linux is Linux: there is a single
> > way to interact with the kernel, while persisting configuration
> > requires interacting with the distribution.
> > 
> > Moreover, a stateless implementation may be integrated with larger
> > set of network management tools, as no assumption of persistence
> > is added to the requirements, so if OVS is non-persistent, we use
> > it as-is.
> > 
> > We should aspire to reach a state in which ovirt-node or any
> > other similar solution is totally stateless; adding a new node to
> > a cluster should be just some blade booting from PXE. Each
> > persistence layer we drop brings us closer to managing a large
> > data center built on a huge number of machines that go up/down as
> > required, joining different clusters.
> > 
> > While discussing clusters, we should also consider autonomic
> > clusters that enforce policy even if ovirt-engine is unreachable.
> > In this mode we would like a primitive manager to be able to
> > enforce policy, including networking, while allowing
> > adding/removing nodes without performing any local configuration.
> > 
> > IMPLICATIONS
> > 
> > System administrator will not be allowed to modify 'by hand' any of
> > the network settings (except for the basic engine reachability).
> > 
> > Special settings can be set in the master, which will apply them
> > via the master->vdsm protocol, which in turn uses the network
> > management interface in order to push them. This method should be
> > generic enough to allow pushing most of the configuration settings
> > allowed (key=value). This approach will also help replacing/adding
> > nodes in a cluster and/or mass deployment.
> > 
> > Edge conditions can be handled by executing some script on host
> > machine, allowing administrator to override network configuration
> > upon network configuration event.
> > 
> > SUMMARY
> > 
> > Assuming the host running vdsm as a complete slave and stateless
> > will enable us to provide better control over that host in the
> > short and long run.
> > 
> > Manual intervention on hosts serving as hypervisors has the
> > flexibility argument. However, in mass deployments, large
> > data centers or dynamic environments this flexibility argument
> > becomes a liability.
> > 
> > Thank you,
> > Alon Bar-Lev
> > 
> 
> 
_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
