On Mon, Nov 26, 2012 at 06:13:01PM -0500, Alon Bar-Lev wrote:
> Hello,
> 
> ----- Original Message -----
> > From: "Adam Litke" <a...@us.ibm.com> To: "Alon Bar-Lev" <alo...@redhat.com>
> > Cc: "Livnat Peer" <lp...@redhat.com>, "VDSM Project Development"
> > <vdsm-devel@lists.fedorahosted.org> Sent: Tuesday, November 27, 2012
> > 12:51:36 AM Subject: Re: [vdsm] Future of Vdsm network configuration -
> > Thread mid-summary
> > 
> > Nice writeup!  I like where this is going but see my comments inline below.
> > 
> > On Mon, Nov 26, 2012 at 03:18:22PM -0500, Alon Bar-Lev wrote:
> > > 
> > > 
> > > ----- Original Message -----
> > > > From: "Livnat Peer" <lp...@redhat.com> To: "Shu Ming"
> > > > <shum...@linux.vnet.ibm.com> Cc: "Alon Bar-Lev" <abar...@redhat.com>,
> > > > "VDSM Project Development" <vdsm-devel@lists.fedorahosted.org> Sent:
> > > > Monday, November 26, 2012 2:57:19 PM Subject: Re: [vdsm] Future of Vdsm
> > > > network configuration - Thread mid-summary
> > > > 
> > > > On 26/11/12 03:15, Shu Ming wrote:
> > > > > Livnat,
> > > > > 
> > > > > Thanks for your summary.  I got comments below.
> > > > > 
> > > > > 2012-11-25 18:53, Livnat Peer:
> > > > >> Hi All, We have been discussing $subject for a while and I'd like to
> > > > >> summarized what we agreed and disagreed on thus far.
> > > > >>
> > > > >> The way I see it there are two related discussions:
> > > > >>
> > > > >>
> > > > >> 1. Getting VDSM networking stack to be distribution agnostic.  - We
> > > > >> are all in agreement that VDSM API should be generic enough to
> > > > >> incorporate multiple implementation. (discussed on this thread:
> > > > >> Alon's suggestion, Mark's patch for adding support for netcf etc.)
> > > > >>
> > > > >> - We would like to maintain at least one implementation as the
> > > > >> working/up-to-date implementation for our users, this implementation
> > > > >> should be distribution agnostic (as we all acknowledge this is an
> > > > >> important goal for VDSM).  I also think that with the agreement of
> > > > >> this community we can choose to change our focus, from time to time,
> > > > >> from one implementation to another as we see fit (today it can be
> > > > >> OVS+netcf and in a few months we'll use the quantum based
> > > > >> implementation if we agree it is better)
> > > > >>
> > > > >> 2. The second discussion is about persisting the network
> > > > >> configuration on the host vs. dynamically retrieving it from a
> > > > >> centralized location like the engine. Danken raised a concern that
> > > > >> even if going with the dynamic approach the host should persist the
> > > > >> management network configuration.
> > > > > 
> > > > > About dynamical retrieving from a centralized location,  when will the
> > > > > retrieving start? Just in the very early stage of host booting before
> > > > > network functions?  Or after the host startup and in the normal
> > > > > running state of the host?  Before retrieving the configuration,  how
> > > > > does the host network connecting to the engine? I think we need a
> > > > > basic well known network between hosts and the engine first.  Then
> > > > > after the retrieving, hosts should reconfigure the network for later
> > > > > management.  However, the timing to retrieve and reconfigure are
> > > > > challenging.
> > > > > 
> > > > 
> > > > We did not discuss the dynamic approach in details on the list so far
> > > > and I think this is a good opportunity to start this discussion...
> > > > 
> > > > From what was discussed previously I can say that the need for a well
> > > > known network was raised by danken, it was referred to as the management
> > > > network, this network would be used for pulling the full host network
> > > > configuration from the centralized location, at this point the engine.
> > > > 
> > > > About the timing for retrieving the configuration, there are several
> > > > approaches. One of them was described by Alon, and I think he'll join
> > > > this discussion and maybe put it in his own words, but the idea was to
> > > > 'keep' the network synchronized at all times. When the host have
> > > > communication channel to the engine and the engine detects there is a
> > > > mismatch in the host configuration, the engine initiates 'apply network
> > > > configuration' action on the host.
> > > > 
> > > > Using this approach we'll have a single path of code to maintain and
> > > > that would reduce code complexity and bugs - That's quoting Alon Bar Lev
> > > > (Alon I hope I did not twisted your words/idea).
> > > > 
> > > > On the other hand the above approach makes local tweaks on the host
> > > > (done manually by the administrator) much harder.
> > > > 
> > > > Any other approaches ?
> > > > 
> > > > I'd like to add a more general question to the discussion what are the
> > > > advantages of taking the dynamic approach?  So far I collected two
> > > > reasons:
> > > > 
> > > > -It is a 'cleaner' design, removes complexity on VDSM code, easier to
> > > > maintain going forward, and less bug prone (I agree with that one, as
> > > > long as we keep the retrieving configuration mechanism/algorithm
> > > > simple).
> > > > 
> > > > -It adheres to the idea of having a stateless hypervisor - some more
> > > > input on this point would be appreciated
> > > > 
> > > > Any other advantages?
> > > > 
> > > > discussing the benefits of having the persisted
> > > > 
> > > > Livnat
> > > > 
> > > 
> > > Sorry for the delay. Some more expansion.
> > > 
> > > ASSUMPTION
> > > 
> > > After boot a host running vdsm is able to receive communication from
> > > engine.  This means that host has legitimate layer 2 configuration and
> > > layer 3 configuration for the interface used to communicate to engine.
> > > 
> > > MISSION
> > > 
> > > Reduce complexity of implementation, so that only one algorithm is used in
> > > order to reach to operative state as far as networking is concerned.
> > > 
> > > (Storage is extremely similar I can s/network/storage/ and still be
> > > relevant).
> > > 
> > > DESIGN FOCAL POINT
> > > 
> > > Host running vdsm is a complete slave of its master, will it be
> > > ovirt-engine or other engine.
> > 
> > I do not agree with this direction.  It reinforces the single point of
> > failure of the centralized manager.  Also, I am actively working to make
> > vdsm a self contained component that is independently useful.  This proposal
> > will effectively cripple that effort.
> > 
> > I would prefer a statement that a node _CAN_ be a slave to engine but can
> > also re-apply a previous configuration in the absense of a management
> > server.  See my other post for how this can be achieved without adding much
> > complexity to the design.
> 
> I strongly disagree.  I think you going to mix two separate components into
> one.  This is a fundamental issue, so I won't answer all the point you raise
> because all derives from this one.
> 
> vdsm is a slave, and should move to a stateless slave in order to keep it
> simple and stupid, which is actually smart.  A management component can be
> installed on the same host or at different host.  This management component
> can be a) a custom management component that is not part of the ovirt
> architecture.  b) a component that manages a cluster on behalf of the
> ovirt-engine.
> 
> There is no problem in implementing this component at: a) vdsm protocol proxy,
> a component that sits between vdsm and the ovirt-engine or whatever north
> connection.  b) entirely different entity which communicate with vdsm and
> different protocol to north.
> 
> If we follow this design, we have simple building blocks, that together can
> build complex solutions.  As each building block is simple the cost of
> maintenance of each is lower.

I am not opposed to implementing the 'static' mode as a separate service that
depends on vdsm.  My only concern with having this functionality outside of
vdsm is that it might break more easily as the code changes.  Hopefully, the
stable node-level API will mostly prevent problems in this area but we will need
to be more careful.

> > 
> > > Having a complete slave ease implementation:
> > > 
> > >  1. Master always apply the setting as-is.  2. No need to consider slave
> > >  state.  3. No need to implement AI to reach from unknown state X to known
> > >  state Y + delta.
> > 
> > These would be properties of any intelligent design regardless if engine is
> > responsible for triggering the configuration changes or if vdsm does it
> > autonomously.  In either case you need to write an algorithm that is capable
> > of deleting all networking config (except for the management interface).
> > Without this, you would be unable to apply incremental configuration changes
> > from engine reliably.
> > 
> > >  4. After reboot (or fence) host is always in known state.
> > 
> > Once you have a method to strip networking config down to only the
> > management interface, you can always get back to a known state.  I suggest
> > having a vdsm API that can do this for you.
> > 
> > > ALGORITHM
> > >  
> > > A. Given communication to vdsm, construct required vlan, bonding, bridge
> > > setup on machine.
> > > 
> > > B. Reboot/Fence - host is reset, apply A.
> > > 
> > > C. Network configuration is changed at engine: (1) Drop all resources that
> > > are not used by active VMs.
> > 
> >     This is the 'network reset' operation I am referring to.
> > 
> > >   (2) Apply A.
> > 
> > > D. Host in maintenance - network configuration can be changed, will be
> > > applied when host go into active, apply C (no resources are used by VMs,
> > > all resources are dropped).
> > > 
> > > E. Critical network is down (Host not operative) - network configuration
> > > is not changed.
> > > 
> > > F. Host unreachable (None responsive) - network configuration cannot be
> > > changed.
> > > 
> > > BENEFITS
> > > 
> > > Single deterministic algorithm to apply network configuration.
> > > 
> > > Pre-defined state after host reboot/fence, host always reachable, previous
> > > network configuration that may be malformed is not in effect.
> > 
> > Do you plan to keep the transactional nature of the current API (ie.
> > setSafeNetworkingConfig must be called after setupNetworks in order to
> > persist it)?
> > 
> > > 
> > > Easy to integrate with various network management solution, can it be
> > > primitive iproute, brctl implementation, NetworkManager, OVS or any other
> > > configuration, as Linux is Linux is Linux, the ability to interact with
> > > the kernel is single, while in order to persist implementation requires to
> > > interact with the distribution.
> > > 
> > > Moreover, a stateless implementation may be integrated with larger set of
> > > network management tools, as no assumption of persistence is added to the
> > > requirements, so if OVS is non-persistent, we use it as-is.
> > > 
> > > We should aspire to reach to a state in which ovirt-node or any other
> > > similar solution is totally stateless, adding a new node to a cluster
> > > should be some blade rebooting from PXE, each persistence layer we drop,
> > > the closer we reach to managing a large data center built on huge number
> > > of machines go up/down as required joining different clusters.
> > > 
> > > While discussing clusters, we should also consider autonomic clusters that
> > > enforces policy even if ovirt-engine is unreachable, in this mode we would
> > > like a primitive manager to be able to enforce policy including
> > > networking, while allowing adding/removing nodes without performing any
> > > local configuration.
> > 
> > This could also be done without requiring another redundant management
> > entity by storing a fallback config to apply when engine is unreachable.
> > Yes, it's stateful, but that's not always a problem.
> > 
> > > IMPLICATIONS
> > > 
> > > System administrator will not be allowed to modify 'by hand' any of the
> > > network settings (except of this basic engine reachability).
> > 
> > Good luck making that requirement stick in the face of real customers :)
> > You'll need (at the very least) a hooking mechanism for admins to override
> > some configuration that hasn't yet been modeled by oVirt.
> > 
> > > Special settings can be set in the master, which will apply them via the
> > > master->vdsm protocol, which in turn use the network management interface
> > > in order to push them, this method should be generic enough to allow
> > > pushing most of the configuration setting allowed (key=value). This
> > > approach will also help replacing/adding nodes in cluster and/or mass
> > > deployment.
> > > 
> > > Edge conditions can be handled by executing some script on host machine,
> > > allowing administrator to override network configuration upon network
> > > configuration event.
> > > 
> > > SUMMARY
> > > 
> > > Assuming the host running vdsm as a complete slave and stateless will
> > > enable us to provide better control over that host in the short and long
> > > run.
> > > 
> > > Manual intervention on hosts serving as hypervisors has the flexibility
> > > argument. However at mass deployment, large data-center or dynamic
> > > environment this flexibility argument becomes liability.
> > 
> > Today oVirt plays in the small data center realm so I do think it's
> > important to give appropriate weight to the flexibility argument.  It should
> > be possible to build different environments based on the needs of the
> > deployment.
> > 

-- 
Adam Litke <a...@us.ibm.com>
IBM Linux Technology Center

_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel

Reply via email to