Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

Itamar Heim Tue, 27 Nov 2012 12:20:07 -0800

On 11/27/2012 03:17 PM, Alon Bar-Lev wrote:



----- Original Message -----

From: "Itamar Heim" <[email protected]>
To: "Alon Bar-Lev" <[email protected]>
Cc: "Livnat Peer" <[email protected]>, "VDSM Project Development" 
<[email protected]>
Sent: Tuesday, November 27, 2012 10:08:34 PM
Subject: Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

On 11/26/2012 03:18 PM, Alon Bar-Lev wrote:



----- Original Message -----

From: "Livnat Peer" <[email protected]>
To: "Shu Ming" <[email protected]>
Cc: "Alon Bar-Lev" <[email protected]>, "VDSM Project
Development" <[email protected]>
Sent: Monday, November 26, 2012 2:57:19 PM
Subject: Re: [vdsm] Future of Vdsm network configuration - Thread
mid-summary

On 26/11/12 03:15, Shu Ming wrote:

Livnat,

Thanks for your summary.  I got comments below.

2012-11-25 18:53, Livnat Peer:

Hi All,
We have been discussing $subject for a while and I'd like to
summarized
what we agreed and disagreed on thus far.

The way I see it there are two related discussions:


1. Getting VDSM networking stack to be distribution agnostic.
- We are all in agreement that VDSM API should be generic enough
to
incorporate multiple implementation. (discussed on this thread:
Alon's
suggestion, Mark's patch for adding support for netcf etc.)

- We would like to maintain at least one implementation as the
working/up-to-date implementation for our users, this
implementation
should be distribution agnostic (as we all acknowledge this is
an
important goal for VDSM).
I also think that with the agreement of this community we can
choose to
change our focus, from time to time, from one implementation to
another
as we see fit (today it can be OVS+netcf and in a few months
we'll
use
the quantum based implementation if we agree it is better)

2. The second discussion is about persisting the network
configuration
on the host vs. dynamically retrieving it from a centralized
location
like the engine. Danken raised a concern that even if going with
the
dynamic approach the host should persist the management network
configuration.


About dynamical retrieving from a centralized location,  when
will
the
retrieving start? Just in the very early stage of host booting
before
network functions?  Or after the host startup and in the normal
running
state of the host?  Before retrieving the configuration,  how
does
the
host network connecting to the engine? I think we need a basic
well
known network between hosts and the engine first.  Then after the
retrieving, hosts should reconfigure the network for later
management.
However, the timing to retrieve and reconfigure are challenging.


We did not discuss the dynamic approach in details on the list so
far
and I think this is a good opportunity to start this discussion...

  From what was discussed previously I can say that the need for a
  well
known network was raised by danken, it was referred to as the
management
network, this network would be used for pulling the full host
network
configuration from the centralized location, at this point the
engine.

About the timing for retrieving the configuration, there are
several
approaches. One of them was described by Alon, and I think he'll
join
this discussion and maybe put it in his own words, but the idea
was
to
'keep' the network synchronized at all times. When the host have
communication channel to the engine and the engine detects there
is a
mismatch in the host configuration, the engine initiates 'apply
network
configuration' action on the host.

Using this approach we'll have a single path of code to maintain
and
that would reduce code complexity and bugs - That's quoting Alon
Bar
Lev
(Alon I hope I did not twisted your words/idea).

On the other hand the above approach makes local tweaks on the
host
(done manually by the administrator) much harder.

Any other approaches ?

I'd like to add a more general question to the discussion what are
the
advantages of taking the dynamic approach?
So far I collected two reasons:

-It is a 'cleaner' design, removes complexity on VDSM code, easier
to
maintain going forward, and less bug prone (I agree with that one,
as
long as we keep the retrieving configuration mechanism/algorithm
simple).

-It adheres to the idea of having a stateless hypervisor - some
more
input on this point would be appreciated

Any other advantages?

discussing the benefits of having the persisted

Livnat


Sorry for the delay. Some more expansion.

ASSUMPTION

After boot a host running vdsm is able to receive communication
from engine.
This means that host has legitimate layer 2 configuration and layer
3 configuration for the interface used to communicate to engine.

MISSION

Reduce complexity of implementation, so that only one algorithm is
used in order to reach to operative state as far as networking is
concerned.

(Storage is extremely similar I can s/network/storage/ and still be
relevant).

DESIGN FOCAL POINT

Host running vdsm is a complete slave of its master, will it be
ovirt-engine or other engine.

Having a complete slave ease implementation:

   1. Master always apply the setting as-is.
   2. No need to consider slave state.
   3. No need to implement AI to reach from unknown state X to known
   state Y + delta.
   4. After reboot (or fence) host is always in known state.

ALGORITHM

A. Given communication to vdsm, construct required vlan, bonding,
bridge setup on machine.

B. Reboot/Fence - host is reset, apply A.

C. Network configuration is changed at engine:
    (1) Drop all resources that are not used by active VMs.
    (2) Apply A.

D. Host in maintenance - network configuration can be changed, will
be applied when host go into active, apply C (no resources are
used by VMs, all resources are dropped).

E. Critical network is down (Host not operative) - network
configuration is not changed.

F. Host unreachable (None responsive) - network configuration
cannot be changed.

BENEFITS

Single deterministic algorithm to apply network configuration.

Pre-defined state after host reboot/fence, host always reachable,
previous network configuration that may be malformed is not in
effect.

Easy to integrate with various network management solution, can it
be primitive iproute, brctl implementation, NetworkManager, OVS or
any other configuration, as Linux is Linux is Linux, the ability
to interact with the kernel is single, while in order to persist
implementation requires to interact with the distribution.

Moreover, a stateless implementation may be integrated with larger
set of network management tools, as no assumption of persistence
is added to the requirements, so if OVS is non-persistent, we use
it as-is.

We should aspire to reach to a state in which ovirt-node or any
other similar solution is totally stateless, adding a new node to
a cluster should be some blade rebooting from PXE, each
persistence layer we drop, the closer we reach to managing a large
data center built on huge number of machines go up/down as
required joining different clusters.

While discussing clusters, we should also consider autonomic
clusters that enforces policy even if ovirt-engine is unreachable,
in this mode we would like a primitive manager to be able to
enforce policy including networking, while allowing
adding/removing nodes without performing any local configuration.

IMPLICATIONS

System administrator will not be allowed to modify 'by hand' any of
the network settings (except of this basic engine reachability).

Special settings can be set in the master, which will apply them
via the master->vdsm protocol, which in turn use the network
management interface in order to push them, this method should be
generic enough to allow pushing most of the configuration setting
allowed (key=value). This approach will also help replacing/adding
nodes in cluster and/or mass deployment.

Edge conditions can be handled by executing some script on host
machine, allowing administrator to override network configuration
upon network configuration event.

SUMMARY

Assuming the host running vdsm as a complete slave and stateless
will enable us to provide better control over that host in the
short and long run.

Manual intervention on hosts serving as hypervisors has the
flexibility argument. However at mass deployment, large
data-center or dynamic environment this flexibility argument
becomes liability.

Thank you,
Alon Bar-Lev
_______________________________________________
vdsm-devel mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


several questions:
on management interface:
1. bonding configuration must match switch - I'm not sure you'll even
get layer two without persisting the bonding configuration.


Management interface configuration is a separate issue.
If we perform changes of this interface when host is in maintenance we reduce 
the complexity of the problem.

For your specific issue, if there are two interfaces, one which is up during 
boot and one which is down during boot, there is no problem to bond them after 
boot without persisting configuration.


how would you know which bond mode to use? which MTU?

2. are you assuming dhcp for the host to get initial configuration?
some
are not using it so management network needs to be persisted.


See assumption section, assumption is that you have connectivity engine->vdsm 
during boot.
Of course you need to persist <something>
The discussion is the persistence of the dynamic network configuration made by 
hosting VMs.

on use case:
we don't have good support for this today, but there is a notion of a
"hybrid mode" - installing vdsm on a node doing other things to allow
it
to run some guests in a lower priority. I'm not sure we can assume
total
automatic control by ovirt in this use case. to date, we assumed "do
no
harm" to networks we were not directly asked to configure.


This is a product decision, if you enforce this you enforce newly world of 
complexity.

If I am to attend to this issue, I would have ran VM with nested virtualization 
and reduce the problem into this nested host, as I can manage this host as a 
complete slave.


nested virt is still not relevant for production use cases in most distros.
_______________________________________________
vdsm-devel mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel

Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

Reply via email to