On Thu, Feb 13, 2014 at 04:01:45PM +0200, Dimitris Aragiorgis wrote:
> The ifdown script will be responsible for deconfiguring network
> devices and cleanup changes made by the ifup script. The first
> implementation will target KVM but it could be ported to Xen as well
> especially when Xen hotplug gets implemented.
>
> Signed-off-by: Dimitris Aragiorgis <[email protected]>
> ---
>  Makefile.am           |   1 +
>  doc/design-draft.rst  |   1 +
>  doc/design-ifdown.rst | 134 +++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 136 insertions(+)
>  create mode 100644 doc/design-ifdown.rst
>
> diff --git a/Makefile.am b/Makefile.am
> index 3e216bd..32bfff6 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -533,6 +533,7 @@ docinput = \
>   doc/design-htools-2.3.rst \
>   doc/design-http-server.rst \
>   doc/design-hugepages-support.rst \
> + doc/design-ifdown.rst \
>   doc/design-impexp2.rst \
>   doc/design-internal-shutdown.rst \
>   doc/design-kvmd.rst \
> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
> index 87c06e7..6ff8059 100644
> --- a/doc/design-draft.rst
> +++ b/doc/design-draft.rst
> @@ -21,6 +21,7 @@ Design document drafts
>   design-hsqueeze.rst
>   design-ssh-ports.rst
>   design-os.rst
> + design-ifdown.rst
>
>  .. vim: set textwidth=72 :
>  .. Local Variables:
> diff --git a/doc/design-ifdown.rst b/doc/design-ifdown.rst
> new file mode 100644
> index 0000000..839cf8b
> --- /dev/null
> +++ b/doc/design-ifdown.rst
> @@ -0,0 +1,134 @@
> +======================================
> +Design for adding ifdown script to KVM
> +======================================
> +
> +.. contents:: :depth: 4
> +
> +This is a design document about adding support for an ifdown script responsible
> +for deconfiguring network devices and cleanup changes made by the ifup script. The
> +first implementation will target KVM but it could be ported to Xen as well
> +especially when hotplug gets implemented.
> +
> +
> +Current state and shortcomings
> +==============================
> +
> +Currently, KVM before instance startup, instance migration and NIC hotplug, it
> +creates a tap and invokes explicitly the kvm-ifup script with the relevant
> +environment (INTERFACE, MAC, IP, MODE, LINK, TAGS, and all the network info if
> +any; NETWORK\_SUBNET, NETWORK\_TAGS, etc).
> +
> +For Xen we have the `vif-ganeti` script (associated with vif-script hypervisor
> +parameter). The main difference is that Xen calls it by itself by passing it as
> +an extra option in the configuration file.
> +
> +This ifup script can do several things; bridge a tap to a bridge, add ip rules,
> +update a external DNS or DHCP server, enable proxy ARP or proxy NDP, issue
> +openvswitch commands, etc. In general we can divide those actions in two
> +categories:
> +
> +1) Commands that change the state of the host
> +2) Commands that change the state of external components.
> +
> +Currently those changes do not get cleaned up or modified upon instance
> +shutdown, remove, migrate, or NIC hot-unplug. Thus we have stale entries in
> +hosts and most important might have stale/invalid configuration on external
> +components like routers that could affect connectivity.
> +
> +A workaround could be hooks but:
> +
> +1) During migrate hooks the environment is the one held in config data
> +and not in runtime files. The NIC configuration might have changed on
> +master but not on the running KVM process (unless hotplug is used).
> +Plus the NIC order in config data might not be the same one on the KVM
> +process.
> +
> +2) On instance modification, changes are not available on hooks. With
> +other words we do not know the configuration before and after modification.
> +
> +Since Ganeti is the orchestrator and is the one who explicitly configures
> +host devices (tap, vif) it should be the one responsible for cleanup/
> +deconfiguration. Especially on a SDN approach this kind of script might
> +be useful to cleanup flows in the cluster in order to ensure correct paths
> +without ping pongs between hosts or connectivity loss for the instance.
> +
> +
> +Proposed Changes
> +================
> +
> +We add an new script, kvm-ifdown that is explicitly invoked after:
> +
> +1) instance shutdown on primary node
> +2) successful instance migration on source node
> +3) failed instance migration on target node
> +4) successful NIC hot-remove on primary node
> +
> +If an administrator's custom ifdown script exists (e.g. `kvm-ifdown-custom`),
> +the `kvm-ifdown` script executes that script, as happens with `kvm-ifup`.
> +
> +Along with that change we should rename custom ifup script from
> +`kvm-vif-bridge` (which does not make any sense) to `kvm-ifup-custom`.
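
A rough sketch of the single-script alternative raised in the comment
that follows, to make the idea concrete. It is an illustration only and
not Ganeti code: the 'EVENT' variable and its 'UP'/'DOWN' values are
taken from the comment, while `make_step`, `run`, and the step names A,
B, C are hypothetical placeholders. Python is used here only for
brevity.

```python
import os


def make_step(name, log):
    """Build a hypothetical step: (name, up action, down action).

    The actions only record what they would do; a real script would
    configure or deconfigure the device instead.
    """
    return (name,
            lambda: log.append(f"up:{name}"),
            lambda: log.append(f"down:{name}"))


def run(event, steps):
    """Apply all steps for UP, or undo them in reverse order for DOWN."""
    if event == "UP":
        for _name, up, _down in steps:
            up()
    elif event == "DOWN":
        # Each step's undo sits next to its setup, and the reversal of
        # the order happens in exactly one place.
        for _name, _up, down in reversed(steps):
            down()
    else:
        raise ValueError(f"EVENT must be 'UP' or 'DOWN', got {event!r}")


if __name__ == "__main__":
    actions = []
    all_steps = [make_step(n, actions) for n in ("A", "B", "C")]
    # EVENT is read from the environment; defaulting to UP is an
    # assumption made for this sketch.
    run(os.environ.get("EVENT", "UP"), all_steps)
    print(actions)
```

Because every step carries its own undo action, checking that DOWN
reverses UP amounts to reading one table instead of comparing two
files.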

Can't we have just one script, like 'kvm-if', that takes another
environment variable, say 'EVENT', whose value can be either 'UP' or
'DOWN'?

If you want to add logic for the UP event and the DOWN event, you want
that logic to be in one place. If the up does A, B, and C, the down
will probably do C, B, and A. Putting these in two different files is
going to make it complicated to verify that they are aligned. If you
put them in the same script, you can write the script so that the
related logic is in the same place, which makes it much easier to
verify whether the down is actually reversing the up.

Moreover, you are probably going to have code that is shared between
the up and the down scripts, which you would probably put in a third
script that you then import from the up and down scripts. That's
already three files. Why not have just one?

Finally, if you have just one script and you really, really want to
have two scripts, one for up and one for down, you just need to check
the environment variable 'EVENT' for the value 'UP' or 'DOWN' and call
the up or down script. But if you start with two scripts, there's no
way to get only one file.

> +
> +
> +Implementation Details
> +======================
> +
> +1) Where to get the NIC info?
> +
> +We cannot account on config data since it might have changed. So the only
> +place we keep our valid data is inside the runtime file. During instance
> +modifications (NIC hot-remove, hot-modify) we have the NIC object from
> +the RPC. We take its UUID and search for the corresponding entry in the
> +runtime file to get further info. After instance shutdown and migration
> +we just take all NICs from the runtime file and invoke the ifdown script
> +for each one
> +
> +2) Where to find the corresponding TAP?
> +
> +Currently TAP names are kept under
> +/var/run/ganeti/kvm-hypervisor/nics/<instance>/<nic\_index>.
> +This is not enough. As told above a NIC's index might change during instance's
> +life. An example will make things clear:
> +
> +* I start an instance with three NICs.
> +* I remove the second without hotplug
> +* I remove the first with hotplug.
> +
> +The index that will arrive with the RPC will be 1 and if we read the relevant
> +NIC file we will get the tap of the NIC that has been removed on second
> +step but is still existing in the KVM process.
> +
> +So upon TAP creation we write another file with the same info but named
> +after the NIC's UUID. The one named after its index can be left
> +for compatibility (Ganeti does not use it; external tools might)
> +Obviously this info will not be available for old instances in the cluster.
> +The ifdown script should be aware of this corner case.
> +
> +3) What should we cleanup/deconfigure?
> +
> +Upon NIC hot-remove we obviously want to wipe everything. But on instance
> +migration we don't want to reset external configuration like DNS. So we choose
> +to pass an extra positional argument to the ifdown script (it already has the
> +TAP name) that will reflect the context it was invoked with.
> +
> +4) What will be the script environment?
> +
> +In general the same environment passed to ifup script. Except instance's
> +tags. Those are the only info not kept in runtime file and it can
> +change between ifup and ifdown script execution. The ifdown
> +script must be aware of it and should cleanup everything that ifup script
> +might setup depending on instance tags (e.g. firewalls, etc)
> +
> +
> +Configuration Changes
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +Just an extra file installed under the same dir kvm-ifup resides. Plus an
> +extra file writen upon TAP creation named after the NIC's UUID and including
> +the TAP's name.
> +
> +.. vim: set textwidth=72 :
> +.. Local Variables:
> +.. mode: rst
> +.. fill-column: 72
> +.. End:
> --
> 1.7.10.4
>

-- 
Jose Antonio Lopes
Ganeti Engineering
Google Germany GmbH
Dienerstr.
12, 80331, München
