Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

Dan Kenigsberg Tue, 27 Nov 2012 06:22:35 -0800

On Tue, Nov 27, 2012 at 11:56:54AM +0200, Livnat Peer wrote:
> On 27/11/12 10:53, Alon Bar-Lev wrote:
> > 
> > 
> > ----- Original Message -----
> >> From: "Livnat Peer" <[email protected]>
> >> To: "Adam Litke" <[email protected]>
> >> Cc: "Alon Bar-Lev" <[email protected]>, "VDSM Project Development" 
> >> <[email protected]>
> >> Sent: Tuesday, November 27, 2012 10:42:00 AM
> >> Subject: Re: [vdsm] Future of Vdsm network configuration - Thread 
> >> mid-summary
> >>
> >> On 26/11/12 16:59, Adam Litke wrote:
> >>> On Mon, Nov 26, 2012 at 02:57:19PM +0200, Livnat Peer wrote:
> >>>> On 26/11/12 03:15, Shu Ming wrote:
> >>>>> Livnat,
> >>>>>
> >>>>> Thanks for your summary.  I got comments below.
> >>>>>
> >>>>> 2012-11-25 18:53, Livnat Peer:
> >>>>>> Hi All,
> >>>>>> We have been discussing $subject for a while and I'd like to
> >>>>>> summarize
> >>>>>> what we agreed and disagreed on thus far.
> >>>>>>
> >>>>>> The way I see it there are two related discussions:
> >>>>>>
> >>>>>>
> >>>>>> 1. Getting VDSM networking stack to be distribution agnostic.
> >>>>>> - We are all in agreement that VDSM API should be generic enough
> >>>>>> to
> >>>>>> incorporate multiple implementation. (discussed on this thread:
> >>>>>> Alon's
> >>>>>> suggestion, Mark's patch for adding support for netcf etc.)
> >>>>>>
> >>>>>> - We would like to maintain at least one implementation as the
> >>>>>> working/up-to-date implementation for our users, this
> >>>>>> implementation
> >>>>>> should be distribution agnostic (as we all acknowledge this is
> >>>>>> an
> >>>>>> important goal for VDSM).
> >>>>>> I also think that with the agreement of this community we can
> >>>>>> choose to
> >>>>>> change our focus, from time to time, from one implementation to
> >>>>>> another
> >>>>>> as we see fit (today it can be OVS+netcf and in a few months
> >>>>>> we'll use
> >>>>>> the quantum based implementation if we agree it is better)
> >>>>>>
> >>>>>> 2. The second discussion is about persisting the network
> >>>>>> configuration
> >>>>>> on the host vs. dynamically retrieving it from a centralized
> >>>>>> location
> >>>>>> like the engine. Danken raised a concern that even if going with
> >>>>>> the
> >>>>>> dynamic approach the host should persist the management network
> >>>>>> configuration.
> >>>>>
> >>>>> About dynamically retrieving from a centralized location, when
> >>>>> will the
> >>>>> retrieving start? Just in the very early stage of host booting
> >>>>> before
> >>>>> network functions?  Or after the host startup and in the normal
> >>>>> running
> >>>>> state of the host?  Before retrieving the configuration,  how
> >>>>> does the
> >>>>> host network connecting to the engine? I think we need a basic
> >>>>> well
> >>>>> known network between hosts and the engine first.  Then after the
> >>>>> retrieving, hosts should reconfigure the network for later
> >>>>> management.
> >>>>> However, the timing to retrieve and reconfigure are challenging.
> >>>>>
> >>>>
> >>>> We did not discuss the dynamic approach in details on the list so
> >>>> far
> >>>> and I think this is a good opportunity to start this discussion...
> >>>>
> >>>> From what was discussed previously I can say that the need for a
> >>>> well
> >>>> known network was raised by danken, it was referred to as the
> >>>> management
> >>>> network, this network would be used for pulling the full host
> >>>> network
> >>>> configuration from the centralized location, at this point the
> >>>> engine.
> >>>>
> >>>> About the timing for retrieving the configuration, there are
> >>>> several
> >>>> approaches. One of them was described by Alon, and I think he'll
> >>>> join
> >>>> this discussion and maybe put it in his own words, but the idea
> >>>> was to
> >>>> 'keep' the network synchronized at all times. When the host has a
> >>>> communication channel to the engine and the engine detects there
> >>>> is a
> >>>> mismatch in the host configuration, the engine initiates 'apply
> >>>> network
> >>>> configuration' action on the host.
> >>>>
> >>>> Using this approach we'll have a single path of code to maintain
> >>>> and
> >>>> that would reduce code complexity and bugs - That's quoting Alon
> >>>> Bar Lev
> >>>> (Alon I hope I did not twist your words/idea).
> >>>>
> >>>> On the other hand the above approach makes local tweaks on the
> >>>> host
> >>>> (done manually by the administrator) much harder.
> >>>
> >>> I worry a lot about the above if we take the dynamic approach.  It
> >>> seems we'd
> >>> need to introduce before/after 'apply network configuration' hooks
> >>> where the
> >>> admin could add custom config commands that aren't yet modeled by
> >>> engine.
> >>>
> >>
> >> yes, and I'm not sure the administrators would like the fact that we
> >> are
> >> 'forcing' them to write everything in a script and getting familiar
> >> with
> >> VDSM hooking mechanism (which in some cases require the use of custom
> >> properties on the engine level) instead of running a simple command
> >> line.
> > 
> > In which case will we force? Please be more specific.
> > If we can pass most of the iproute2, brctl, bond parameters via key/value 
> > pairs via the API, what in your view that is common or even seldom should 
> > be used?
> > This hook mechanism is only as fallback, provided to calm people down.
> > 
> 
> I understand, I'm saying it can irritate the administrators who need
> to use it, it does not help that we are calmed down ;)
> 
> Just to make it clear I'm not against the stateless approach, I'm trying
> to understand it better and make sure we are all aware of the drawbacks
> this approach has. Complicating local tweaks to the admin is one of them.
> 
> I'll reply on your original mail with the questions I have on your proposal.
> 
> >>
> >>>> Any other approaches ?
> >>>
> >>> Static configuration has the advantage of allowing a host to bring
> >>> itself back
> >>> online independent of the engine.  This is also useful for anyone
> >>> who may want
> >>> to deploy a vdsm node in standalone mode.
> >>>
> >>> I think it would be possible to easily support a quasi-static
> >>> configuration mode
> >>> simply by extending the design of the dynamic approach slightly.
> >>>  In dynamic
> >>> mode, the network configuration is passed down as a well-defined
> >>> data structure.
> >>> When a particular configuration has been committed, vdsm could
> >>> write a copy of
> >>> that configuration data structure to
> >>> /var/run/vdsm/network-config.json.  During
> >>> a subsequent boot, if the engine cannot be contacted after
> >>> activating the
> >>> management network, the cached configuration can be applied using
> >>> the same code
> >>> as for dynamic mode.  We'd have to flesh out the circumstances
> >>> under which this
> >>> would happen.
> >>
> >> I like this approach a lot but we need to consider that network
> >> configuration is an accumulated state, for example -
> >>
> >> 1. The engine sends a setup-network command with the full host
> >> network
> >> configuration
> >> 2. The user configures new network on the host, the engine sends a
> >> new
> >> setup-network request to VDSM which includes only the delta requested
> >> by
> >> the user (adding the required network)
> >> 3. VDSM adds the new network
> > 
> > THIS IS COMPLEX!!!!!!!
> > Almost AI.
> > As you need to complete the network setting with what you know.
> > 
> 
> I think we should clear this first -
> We have a running hypervisor with running VMs on it, we would like to
> configure an additional network on the host.
> You don't want to apply the network-configuration from scratch and mess
> with the running VMs networks or the storage network or anything else
> that is running and was not changed by the administrator ==> you need to
> calculate the delta of the changes to perform the least intrusive
> operation possible.
> 
> 
> >> and this can go on and on, for dealing with this issue:
> >>
> >> We can either hold network-config.json per setup-network command and
> >> then for recovering the network configuration state we need to
> >> execute
> >> chain of set-up networks commands.
> >>
> >> Or we can move the logic of calculating the delta from engine to VDSM
> >> and on each setup network have the engine pass the full
> >> configuration.
> >> The problem with that approach is that the analysis logic of the
> >> delta
> >> has to be done on the engine anyway to give a quick feedback to the
> >> user
> >> on the validity of his action.
> >> Maintaining this logic/code twice is not something we want (it's bad
> >> enough to do it once....)
> > 
> > I don't understand how the two algorithms are the same...
> > UI is much more/less verbose at different aspects, while taking the full 
> > configuration and convert to actual setting is a completely different 
> > sequence.
> > What the feedback of the user? as far as I understand the user is only 
> > interested in the end-result... building his own network and expect it to 
> > be applied.
> >  
> >> A third option is to extend the current API of setup network to
> >> include
> >> the full configuration in addition to the delta that is sent today.
> >> The
> >> full configuration would be used for creating network-config.json and
> >> for that alone, VDSM would change network configuration according to
> >> the
> >> delta sent as it does today.
> > 
> > Always pass full configuration, why deal with two cases?
> > 
> >> The problem with that approach is that I'm sure someone on the list
> >> would say it is a contamination to the API, and we should 'never'
> >> pass
> >> 'duplicate' information. Personally I find this option the easiest
> >> one
> >> to deal with the above issue.


The current setupNetwork API already allows passing the complete image.
The only problem is Vdsm's brutal implementation: when it sees a network
that it already knows about, it tears the network down completely and
rebuilds it according to the current request.

Also, Vdsm needs an explicit request to remove a network: if a network is
not mentioned in setupNetwork, it is left unchanged.
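
To make these semantics concrete, here is a minimal Python sketch. It is
not Vdsm's actual code. It shows how a request could be split into
actions so that a network whose requested attributes match what is
already running is kept instead of being torn down and rebuilt. The
function name plan_changes, the attribute dictionaries, the network names
and the {"remove": True} marker are all assumptions made up for this
illustration.

```python
def plan_changes(running, requested):
    """Split a setupNetwork-style request into the actions it implies.

    running   -- dict: network name -> attributes currently applied
    requested -- dict: network name -> desired attributes, or
                 {"remove": True} to ask for removal (hypothetical marker)
    """
    plan = {"add": {}, "rebuild": {}, "remove": [], "keep": []}
    for name, attrs in requested.items():
        if attrs.get("remove"):
            # Removal happens only when explicitly requested.
            if name in running:
                plan["remove"].append(name)
        elif name not in running:
            plan["add"][name] = attrs
        elif running[name] != attrs:
            # Known network with different attributes: rebuild it.
            plan["rebuild"][name] = attrs
        else:
            # Known network, identical attributes: do not touch it.
            plan["keep"].append(name)
    # Networks that the request does not mention are left unchanged.
    plan["keep"].extend(n for n in running if n not in requested)
    return plan


# Example data; all names and attributes are invented.
running = {
    "mgmt": {"nic": "eth0", "bridged": True},
    "storage": {"nic": "eth1", "vlan": 100},
    "old": {"nic": "eth2"},
}
requested = {
    "mgmt": {"nic": "eth0", "bridged": True},  # identical, so kept
    "vmnet": {"nic": "eth1", "vlan": 200},     # new network
    "old": {"remove": True},                   # explicit removal
}
plan = plan_changes(running, requested)
```

With this input, "mgmt" and "storage" stay untouched, "vmnet" is added and
"old" is removed. Only a network whose attributes really differ would land
in the rebuild set.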

> >>
> > 
> > Livnat, I don't see any argument of persistence vs non persistence as the 
> > above is common to any approach taken.
> > 
> > Only this "manual configuration" argument keeps popping, which as I wrote is 
> > irrelevant in large scale and we do want to go into large scale.

Well, we call it "manual configuration", but it applies just as well to
"puppet-based configuration".

Dan.
_______________________________________________
vdsm-devel mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
