+ultrotter

On May 30 13:44, Jose A. Lopes wrote:
> This design document is still pending review.
> 
> Guido, given that you first accepted Grnet's proposal, would you like to 
> review it?
> 
> Thanks,
> Jose
> 
> On May 15 13:40, Dimitris Aragiorgis wrote:
> > The ifdown script will be responsible for deconfiguring network
> > devices and cleaning up changes made by the ifup script. The first
> > implementation will target KVM, but it could be ported to Xen as
> > well, especially once Xen hotplug is implemented.
> > 
> > Signed-off-by: Dimitris Aragiorgis <[email protected]>
> > ---
> >  Makefile.am           |    1 +
> >  doc/design-draft.rst  |    1 +
> >  doc/design-ifdown.rst |  156 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 158 insertions(+)
> >  create mode 100644 doc/design-ifdown.rst
> > 
> > diff --git a/Makefile.am b/Makefile.am
> > index f5287f6..47b127d 100644
> > --- a/Makefile.am
> > +++ b/Makefile.am
> > @@ -587,6 +587,7 @@ docinput = \
> >     doc/design-htools-2.3.rst \
> >     doc/design-http-server.rst \
> >     doc/design-hugepages-support.rst \
> > +   doc/design-ifdown.rst \
> >     doc/design-impexp2.rst \
> >     doc/design-internal-shutdown.rst \
> >     doc/design-kvmd.rst \
> > diff --git a/doc/design-draft.rst b/doc/design-draft.rst
> > index e3531ee..f6a4e49 100644
> > --- a/doc/design-draft.rst
> > +++ b/doc/design-draft.rst
> > @@ -24,6 +24,7 @@ Design document drafts
> >     design-systemd.rst
> >     design-cpu-speed.rst
> >     design-performance-tests.rst
> > +   design-ifdown.rst
> >  
> >  .. vim: set textwidth=72 :
> >  .. Local Variables:
> > diff --git a/doc/design-ifdown.rst b/doc/design-ifdown.rst
> > new file mode 100644
> > index 0000000..7626da9
> > --- /dev/null
> > +++ b/doc/design-ifdown.rst
> > @@ -0,0 +1,156 @@
> > +======================================
> > +Design for adding ifdown script to KVM
> > +======================================
> > +
> > +.. contents:: :depth: 4
> > +
> > +This is a design document about adding support for an ifdown script
> > +responsible for deconfiguring network devices and cleaning up changes made
> > +by the ifup script. The first implementation will target KVM, but it could
> > +be ported to Xen as well, especially once Xen hotplug is implemented.
> > +
> > +
> > +Current state and shortcomings
> > +==============================
> > +
> > +Currently, for KVM, before instance startup, instance migration and NIC
> > +hotplug, Ganeti creates a tap and explicitly invokes the kvm-ifup script
> > +with the relevant environment (INTERFACE, MAC, IP, MODE, LINK, TAGS, and
> > +all the network info, if any: NETWORK\_SUBNET, NETWORK\_TAGS, etc.).
> > +
> > +For Xen we have the `vif-ganeti` script (associated with the vif-script
> > +hypervisor parameter). The main difference is that Xen calls the script
> > +itself, because it is passed as an extra option in the configuration file.
> > +
> > +This ifup script can do several things: bridge a tap to a bridge, add ip
> > +rules, update an external DNS or DHCP server, enable proxy ARP or proxy NDP,
> > +issue openvswitch commands, etc. In general we can divide those actions into
> > +two categories:
> > +
> > +1) Commands that change the state of the host.
> > +2) Commands that change the state of external components.
> > +
> > +Currently those changes do not get cleaned up or modified upon instance
> > +shutdown, removal, migration, or NIC hot-unplug. Thus we have stale entries
> > +on hosts and, more importantly, might have stale/invalid configuration on
> > +external components like routers, which could affect connectivity.
> > +
> > +A workaround could be hooks, but:
> > +
> > +1) During migration hooks, the environment is the one held in config data
> > +and not the one in runtime files. The NIC configuration might have changed
> > +on the master but not in the running KVM process (unless hotplug is used).
> > +Moreover, the NIC order in config data might not be the same as in the KVM
> > +process.
> > +
> > +2) On instance modification, changes are not available to hooks. In
> > +other words, we do not know the configuration before and after modification.
> > +
> > +Since Ganeti is the orchestrator and is the one that explicitly configures
> > +host devices (tap, vif), it should be the one responsible for cleanup and
> > +deconfiguration. Especially with an SDN approach, this kind of script might
> > +be useful to clean up flows in the cluster in order to ensure correct paths
> > +without ping-ponging between hosts or connectivity loss for the instance.
> > +
> > +
> > +Proposed Changes
> > +================
> > +
> > +We add a new script, `kvm-ifdown`, that is explicitly invoked after:
> > +
> > +1) instance shutdown on primary node
> > +2) successful instance migration on source node
> > +3) failed instance migration on target node
> > +4) successful NIC hot-remove on primary node
> > +
> > +If an administrator's custom ifdown script exists (e.g. `kvm-ifdown-custom`),
> > +the `kvm-ifdown` script executes that script, as happens with `kvm-ifup`.
> > +
> > +Along with that change we should rename the custom ifup script from
> > +`kvm-vif-bridge` (which does not make any sense) to `kvm-ifup-custom`.
> > +
> > +In contrast to `kvm-ifup`, one cannot rely on the `kvm-ifdown` script being
> > +called. A node might die just after a successful migration or after an
> > +instance shutdown. In that case, none of the "undo" operations will be
> > +invoked. Thus, this script should work on a best-effort basis, and the
> > +network should not rely on the script being called or being successful.
> > +Additionally, it should modify *only* the node-local dynamic configs (routes,
> > +ARP entries, SDN, firewalls, etc.), whereas static ones (DNS, DHCP, etc.)
> > +should be modified via hooks.
> > +
> > +
> > +Implementation Details
> > +======================
> > +
> > +1) Where to get the NIC info?
> > +
> > +We cannot count on config data since it might have changed. So the only
> > +place where we keep valid data is the runtime file. During instance
> > +modifications (NIC hot-remove, hot-modify) we have the NIC object from
> > +the RPC. We take its UUID and search for the corresponding entry in the
> > +runtime file to get further info. After instance shutdown and migration
> > +we just take all NICs from the runtime file and invoke the ifdown script
> > +for each one.
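
The selection logic in point 1 could be sketched roughly as below. This is only an illustration in Python: `RuntimeNic`, `find_runtime_nic` and `nics_to_deconfigure` are hypothetical names, and the runtime file is assumed to have been parsed into a list already.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RuntimeNic:
    """Hypothetical stand-in for a NIC entry stored in the runtime file."""
    uuid: str
    mac: str
    mode: str
    link: str


def find_runtime_nic(runtime_nics: List[RuntimeNic],
                     uuid: str) -> Optional[RuntimeNic]:
    """Return the runtime entry whose UUID matches, or None if absent."""
    for nic in runtime_nics:
        if nic.uuid == uuid:
            return nic
    return None


def nics_to_deconfigure(runtime_nics: List[RuntimeNic],
                        rpc_nic_uuid: Optional[str] = None) -> List[RuntimeNic]:
    """Select the NICs for which the ifdown script should run.

    With a UUID (hot-remove, hot-modify): only the matching runtime entry.
    Without one (shutdown, migration): every NIC in the runtime file.
    """
    if rpc_nic_uuid is None:
        return list(runtime_nics)
    nic = find_runtime_nic(runtime_nics, rpc_nic_uuid)
    return [nic] if nic is not None else []
```

Note that the config data is never consulted here; only the UUID is taken from the RPC object.
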
> > +
> > +2) Where to find the corresponding TAP?
> > +
> > +Currently TAP names are kept under
> > +/var/run/ganeti/kvm-hypervisor/nics/<instance>/<nic\_index>.
> > +This is not enough. As mentioned above, a NIC's index might change during
> > +the instance's life. An example will make things clear:
> > +
> > +* The admin starts an instance with three NICs.
> > +* The admin removes the second without hotplug.
> > +* The admin removes the first with hotplug.
> > +
> > +The index that arrives with the RPC will be 1, and if we read the relevant
> > +NIC file we will get the tap of the NIC that was removed in the second
> > +step but still exists in the KVM process.
> > +
> > +So upon TAP creation we write another file with the same info but named
> > +after the NIC's UUID. The one named after its index can be left
> > +for compatibility (Ganeti does not use it; external tools might).
> > +Obviously this info will not be available for old instances in the cluster.
> > +The ifdown script should be aware of this corner case.
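
One possible way to handle the corner case for old instances is sketched below. The fallback to the index-named file is an assumption, not something the design prescribes, and `read_tap_name` is a hypothetical helper. The second return value tells the caller whether the result came from the UUID-named file and can therefore be trusted.

```python
import os
from typing import Optional, Tuple


def read_tap_name(nic_dir: str, nic_uuid: str,
                  nic_index: int) -> Tuple[Optional[str], bool]:
    """Look up the TAP name for a NIC.

    Returns (tap, reliable). The UUID-named file is tried first; the
    index-named file is only a fallback for instances started before the
    UUID file existed, and may point at the wrong TAP if indexes shifted.
    """
    candidates = [(nic_uuid, True), (str(nic_index), False)]
    for name, reliable in candidates:
        path = os.path.join(nic_dir, name)
        try:
            with open(path) as handle:
                return handle.read().strip(), reliable
        except FileNotFoundError:
            continue
    return None, False
```

A caller could, for example, skip deconfiguration or only log a warning when `reliable` is false.
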
> > +
> > +3) What should we cleanup/deconfigure?
> > +
> > +Upon NIC hot-remove we obviously want to wipe everything. But on instance
> > +migration we don't want to reset external configuration like DNS. So we
> > +choose to pass an extra positional argument to the ifdown script (it already
> > +gets the TAP name) that reflects the context it was invoked in. Please note
> > +that deconfiguration of external components is not encouraged and should be
> > +done via hooks. Still, we could easily support it via this extra argument.
> > +
> > +4) What will be the script environment?
> > +
> > +In general, the same environment that is passed to the ifup script, except
> > +for the instance's tags. Those are the only info not kept in the runtime
> > +file, and they can change between ifup and ifdown script execution. The
> > +ifdown script must be aware of this and should clean up everything that the
> > +ifup script might set up depending on instance tags (e.g. firewalls, etc.).
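
Points 3 and 4 together could look roughly like the sketch below. The context names are hypothetical (the design does not fix their spelling), as is the helper `build_ifdown_call`. The environment is assumed to be the ifup one with `TAGS` dropped.

```python
from enum import Enum
from typing import Dict, List, Tuple


class IfdownContext(Enum):
    """Hypothetical values for the extra positional argument."""
    SHUTDOWN = "shutdown"
    MIGRATION_SUCCESS = "migration-success"  # invoked on the source node
    MIGRATION_FAILURE = "migration-failure"  # invoked on the target node
    HOT_REMOVE = "hot-remove"


def build_ifdown_call(script: str, tap: str, context: IfdownContext,
                      ifup_env: Dict[str, str]
                      ) -> Tuple[List[str], Dict[str, str]]:
    """Build argv and environment for one ifdown invocation.

    argv is [script, tap, context]; the environment is the ifup one
    minus TAGS, since instance tags are not kept in the runtime file.
    """
    argv = [script, tap, context.value]
    env = {key: value for key, value in ifup_env.items() if key != "TAGS"}
    return argv, env
```

The result could be handed to something like `subprocess.run(argv, env=env)`, with failures only logged, since the script works on a best-effort basis.
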
> > +
> > +
> > +Configuration Changes
> > +~~~~~~~~~~~~~~~~~~~~~
> > +
> > +1) The `kvm-ifdown` script will be an extra file installed under the same
> > +   directory where `kvm-ifup` resides. We could have a single script (and
> > +   symbolic links to it) that shares the same code, where a second positional
> > +   argument or an extra environment variable would define whether we are
> > +   bringing the interface up or down. Still, this is not the best practice,
> > +   since it is not equivalent to how KVM uses `script` and `downscript` in the
> > +   `netdev` option; there, the scripts are different files that get the tap
> > +   name as a positional argument. Of course, common code will go in
> > +   `net-common` so that it can be sourced from Xen or KVM specific scripts.
> > +
> > +2) An extra file, written upon TAP creation, named after the NIC's UUID and
> > +   containing the TAP's name. Since this should be the authoritative file,
> > +   to keep backwards compatibility we create a symbolic link named after the
> > +   NIC's index and pointing to this new file.
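
A minimal sketch of item 2, assuming the per-instance NIC directory is known to the caller. `write_nic_files` is a hypothetical name, and the relative symlink target is a design choice of this sketch rather than part of the proposal.

```python
import os


def write_nic_files(nic_dir: str, nic_uuid: str, nic_index: int,
                    tap: str) -> str:
    """Record the TAP name under the NIC's UUID and link the index to it."""
    os.makedirs(nic_dir, exist_ok=True)

    # Authoritative file: named after the UUID, contains the TAP name.
    uuid_path = os.path.join(nic_dir, nic_uuid)
    with open(uuid_path, "w") as handle:
        handle.write(tap)

    # Compatibility entry: a symlink named after the index.
    index_path = os.path.join(nic_dir, str(nic_index))
    if os.path.lexists(index_path):
        os.remove(index_path)  # drop a stale file or link for this index
    os.symlink(nic_uuid, index_path)  # relative target inside nic_dir
    return uuid_path
```

External tools that still read the index-named path would get the same TAP name through the link.
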
> > +
> > +.. vim: set textwidth=72 :
> > +.. Local Variables:
> > +.. mode: rst
> > +.. fill-column: 72
> > +.. End:
> > -- 
> > 1.7.10.4
> > 
> 
> -- 
> Jose Antonio Lopes
> Ganeti Engineering
> Google Germany GmbH
> Dienerstr. 12, 80331, München
> 
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> Geschäftsführer: Graham Law, Christine Elizabeth Flores
> Steuernummer: 48/725/00206
> Umsatzsteueridentifikationsnummer: DE813741370

