IMPORTANT:  Please remember this is simply a strawman to frame a discussion
            around a concept called 'graceful restart.'  More to be explained.

Now that 2.9 work is frozen and the tree will be forked off, I assumed
more extreme and/or interesting ideas might be welcome.  As such, here's
something fairly small-ish that provides an interesting behavior called
'Graceful Restart.'  The idea is that when the OvS userspace is being
upgraded, we can leave the existing flows installed in the datapath so
that established traffic keeps passing.  Once the new daemons take over,
the standard dump/sweep operations of the revalidator threads will resume
and "Everything Will Just Work(tm)."

Of course, there are some important corner cases and side effects that
need to be thought out.  I've listed the ones I know of here (no particular
order, though):


1. Only the active datapath flows (those installed in the kernel datapath
   at the time of 'reload') will remain while the daemons are down.  This
   means *any* new traffic (possibly even new connections between the same
   endpoints) will fail to pass.  Even a ping between endpoints could
   start failing: if neighbor entries expire, no ARP/ND can pass and the
   neighbor will not be resolved, causing send failures (unless those
   flows happen to still be in the kernel datapath).

   1a.  This also means that some protocol exchanges might *seem* to
        work at first glance, but won't actually proceed.  I'm thinking of
        cases where pings are used as 'keep alives.'  That's no different
        from the existing system.  What will be different is the user
        expectation: with a "graceful" restart, users may expect that no
        such failures occur.

2. This is a strong knob that a user may accidentally trigger.  If they do,
   flows will *NEVER* be removed from the kernel datapath while the daemons
   are running.  This might be tolerable.  After all, the flag isn't a
   persistent database entry or anything: it only exists for the lifetime
   of the userspace process (so a restart also 'clears' the behavior).
   I'm not sure whether that is good enough, though.

3. Traffic will pass with no userspace knowledge for a time.  I think this
   is okay - after all, if the OvS daemon is killed, flows will stick around.
   However, this behavior would go from "well, sometimes it could happen" to
   "we plan and/or expect this to happen."

4. This only covers the kernel datapath.  Userspace datapath implementations
   will still lose the entire datapath during restart.
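
To make point 2 a bit more concrete, here is a toy model of the knob.  This
is *not* the code from the series, and every name in it (toy_datapath,
toy_dp_set_frozen, toy_flow_del, ...) is made up for illustration.  It only
shows the shape of the idea as I understand it: a flag that lives in process
memory, is never persisted, and makes flow deletion a no-op while it is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names come from the OVS tree. */
#define TOY_MAX_FLOWS 8

struct toy_datapath {
    bool frozen;                    /* Process-lifetime flag, never persisted. */
    bool installed[TOY_MAX_FLOWS];  /* installed[i] is true if flow i exists. */
};

/* Sets or clears the freeze flag.  A zero-initialized datapath starts with
 * the flag cleared, which models a restart 'clearing' the behavior. */
static void
toy_dp_set_frozen(struct toy_datapath *dp, bool frozen)
{
    assert(dp != NULL);
    dp->frozen = frozen;
}

/* Installs flow 'id'.  Returns false if 'id' is out of range. */
static bool
toy_flow_put(struct toy_datapath *dp, size_t id)
{
    assert(dp != NULL);
    if (id >= TOY_MAX_FLOWS) {
        return false;
    }
    dp->installed[id] = true;
    return true;
}

/* Tries to delete flow 'id'.  Returns true only if the flow was actually
 * removed.  While the datapath is frozen the request is refused and the
 * flow stays installed. */
static bool
toy_flow_del(struct toy_datapath *dp, size_t id)
{
    assert(dp != NULL);
    if (id >= TOY_MAX_FLOWS || !dp->installed[id] || dp->frozen) {
        return false;
    }
    dp->installed[id] = false;
    return true;
}

/* Returns the number of flows currently installed. */
static size_t
toy_flow_count(const struct toy_datapath *dp)
{
    size_t n = 0;

    assert(dp != NULL);
    for (size_t i = 0; i < TOY_MAX_FLOWS; i++) {
        if (dp->installed[i]) {
            n++;
        }
    }
    return n;
}
```

In this model, once toy_dp_set_frozen(&dp, true) has been called, every
toy_flow_del() fails and the flow count never shrinks until the flag is
cleared or a new zero-initialized toy_datapath replaces the old one.  That is
the "flows never die while the daemons are running" hazard described in
point 2.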


There probably exists a better/more efficient/more functionally appropriate
way of achieving the desired effect.  This is simply meant to start a
discussion in the upstream community to see if there's a way to achieve this
"graceful restart" effect (i.e., not losing existing packet flow) during
planned outages (upgrades, reloads, etc.).

Since the implementation is subject to complete and total change, I haven't
written any documentation for this feature yet.  I'm saving that work for
another spin after getting some feedback.  There may be other opportunities
as well, for instance integrating with something like ovs-ctl for a
system-agnostic implementation.

Aaron Conole (2):
  datapath: prevent deletion of flows / datapaths
  rhel: tell ovsctl to freeze the datapath

 lib/dpctl.c                                        | 27 +++++++++
 lib/dpif-netdev.c                                  |  2 +
 lib/dpif-netlink.c                                 | 65 ++++++++++++++++------
 lib/dpif-provider.h                                |  8 +++
 lib/dpif.c                                         | 22 ++++++++
 lib/dpif.h                                         |  2 +
 .../usr_lib_systemd_system_ovs-vswitchd.service.in |  2 +-
 utilities/ovs-ctl.in                               |  4 ++
 8 files changed, 115 insertions(+), 17 deletions(-)

-- 
2.14.3

