A 4.4 test kernel with the fix backported is available at:

https://people.canonical.com/~nivedita/geneve-xenial-test/

if anyone wishes to validate the fix on 4.4 (Xenial).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1794232

Title:
  Geneve tunnels don't work when ipv6 is disabled

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Fix Committed
Status in linux source package in Disco:
  Fix Released

Bug description:
  SRU Justification

  Impact: Cannot create geneve tunnels if ipv6 is disabled dynamically.

  Fix:
  Fixed by upstream commit in v5.0:
  Commit: cf1c9ccba7308e48a68fa77f476287d9d614e4c7
  "geneve: correctly handle ipv6.disable module parameter"

  Hence available in Disco and later; required in X,B,C.

  Testcase:
  1. Boot with "ipv6.disable=1"
  2. Then try and create a geneve tunnel using:
     # ovs-vsctl add-br br1
     # ovs-vsctl add-port br1 geneve1 -- set interface geneve1 \
         type=geneve options:remote_ip=192.168.x.z
     (where remote_ip is the IP of the other host)

  Regression Potential: Low. Only geneve tunnels with ipv6 dynamically
  disabled are affected, and those currently do not work at all.

  Other Info:
  * Mainline commit msg includes reference to a fix for
    non-metadata tunnels (infrastructure is not yet in
    our tree prior to Disco), hence not being included
    at this time under this case.

    At this time, all geneve tunnels created as above
    are metadata-enabled.

  ---
  [Impact]

  When attempting to create a geneve tunnel on Ubuntu 16.04 Xenial, in
  an environment with Open vSwitch where ipv6 has been disabled,
  creation fails with the error:

  "ovs-vsctl: Error detected while setting up 'geneve0': could not
  add network device geneve0 to ofproto (Address family not supported
  by protocol)."

  [Fix]
  There is an upstream commit for this in v5.0 mainline (and in Disco
  and later Ubuntu kernels).

  "geneve: correctly handle ipv6.disable module parameter"
  Commit: cf1c9ccba7308e48a68fa77f476287d9d614e4c7

  This fix is needed on all our series prior to Disco
  and the v5.0 kernel: X, B, C. It is identical to the
  fix we implemented and tested internally but had not
  yet pushed upstream.

  [Test Case]
  (Best to do this on a kvm guest VM so as not to interfere with
   your system's networking)

  1. On any Ubuntu Xenial kernel, disable ipv6. This example
     is shown with the 4.15.0-23-generic kernel (which differs
     slightly from 4.4.x in symptoms):

  - Edit /etc/default/grub to add the line:
          GRUB_CMDLINE_LINUX="ipv6.disable=1"
  - # update-grub
  - Reboot

  2. Install OVS
  # apt install openvswitch-switch

  3. Create a Geneve tunnel
  # ovs-vsctl add-br br1
  # ovs-vsctl add-port br1 geneve1 -- set interface geneve1 \
      type=geneve options:remote_ip=192.168.x.z

  (where remote_ip is the IP of the other host)

  You will see the following error message:

  "ovs-vsctl: Error detected while setting up 'geneve1'.
  See ovs-vswitchd log for details."

  From /var/log/openvswitch/ovs-vswitchd.log you will see:

  "2018-07-02T16:48:13.295Z|00026|dpif|WARN|system@ovs-system:
  failed to add geneve1 as port: Address family not supported
  by protocol"

  You will notice from the "ifconfig" output that the device
  genev_sys_6081 is not created.

  If you do not disable IPv6 (remove ipv6.disable=1 from
  /etc/default/grub + update-grub + reboot), the same
  'ovs-vsctl add-port' command completes successfully.
  You can see that it is working properly by adding an
  IP to the br1 and pinging each host.

  On kernel 4.4 (4.4.0-128-generic), the 'ovs-vsctl add-port' command
  prints no error and no warning appears in ovs-vswitchd.log.
  However, the device genev_sys_6081 is still not created and
  the ping test fails.

  With the fixed test kernel, the interfaces and tunnel
  are created successfully.
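
  Since the failure on 4.4 is silent, it can help to check for the
  device directly rather than rely on ovs-vsctl output. The following
  is a minimal sketch in C, assuming only the standard if_nametoindex()
  call from <net/if.h>. The helper names device_exists() and
  report_geneve_device() are hypothetical and not part of OVS or the
  kernel.

```c
#include <net/if.h>
#include <stdbool.h>
#include <stdio.h>

/* Returns true if a network interface with this name currently exists.
 * if_nametoindex() returns 0 when no such interface is found. */
static bool device_exists(const char *name)
{
    return if_nametoindex(name) != 0;
}

/* Hypothetical helper: report whether the geneve system device named
 * in this report is present on the running system. */
static void report_geneve_device(void)
{
    const char *dev = "genev_sys_6081";

    printf("%s: %s\n", dev, device_exists(dev) ? "present" : "missing");
}
```

  Calling report_geneve_device() after the 'ovs-vsctl add-port' step
  would be expected to print "missing" on an affected kernel booted
  with ipv6.disable=1.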

  [Regression Potential]
  * Low -- affects the geneve driver only, and only when ipv6 is
    disabled. Since tunnels do not work at all in that case today,
    this fix gets the tunnel up and running for the common case.

  [Other Info]

  * Analysis

  Geneve tunnels should work in either IPv4 or IPv6 environments
  as a design and support principle.

  Currently, however, the implementation requires ipv6 support
  for metadata-based tunnels (which geneve tunnels are), rather
  than allowing either of:

  a) ipv4 + metadata // whether ipv6 compiled or dynamically disabled
  b) ipv4 + metadata + ipv6

  What enforces this in the current 4.4.0-x code when opening a Geneve
  tunnel is the following in geneve_open() :

          bool ipv6 = geneve->remote.sa.sa_family == AF_INET6;
          bool metadata = geneve->collect_md;
          ...

  #if IS_ENABLED(CONFIG_IPV6)
          geneve->sock6 = NULL;
          if (ipv6 || metadata)
                  ret = geneve_sock_add(geneve, true);
  #endif
          if (!ret && (!ipv6 || metadata))
                  ret = geneve_sock_add(geneve, false);

  Here CONFIG_IPV6 is enabled and IPv6 is disabled at boot. Even
  though ipv6 is false, metadata is always true for a geneve
  open, because ovs sets it unconditionally:

  In lib/dpif-netlink-rtnl.c:

  case OVS_VPORT_TYPE_GENEVE:
  nl_msg_put_flag(&request, IFLA_GENEVE_COLLECT_METADATA);

  The second argument of geneve_sock_add is a boolean
  value indicating whether it's an ipv6 address family
  socket or not, and we thus incorrectly pass a true
  value rather than false.

  The current "|| metadata" check is unnecessary and incorrectly
  sends the tunnel creation code down the ipv6 path, which
  fails subsequently when the code expects an ipv6 family socket.
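
  The control flow described above can be modeled in user space. The
  sketch below is an illustration only, not kernel code:
  model_sock_add() is a hypothetical stand-in for geneve_sock_add(),
  and the ipv6_disabled flag models booting with ipv6.disable=1.
  open_current() mirrors the logic quoted from geneve_open().
  open_tolerant() is one assumed shape a fix could take (falling back
  to the IPv4 socket when the IPv6 socket fails with EAFNOSUPPORT); it
  is not claimed to match the patch tested for this bug.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Models booting with ipv6.disable=1 (assumption for this sketch). */
static bool ipv6_disabled = true;

/* Hypothetical stand-in for geneve_sock_add(): creating the IPv6
 * socket fails with -EAFNOSUPPORT when IPv6 is disabled. */
static int model_sock_add(bool ipv6)
{
    if (ipv6 && ipv6_disabled)
        return -EAFNOSUPPORT;
    return 0;
}

/* Mirrors the logic quoted in the report: a failed IPv6 socket
 * prevents the IPv4 socket from ever being attempted. */
static int open_current(bool ipv6, bool metadata)
{
    int ret = 0;

    if (ipv6 || metadata)
        ret = model_sock_add(true);
    if (!ret && (!ipv6 || metadata))
        ret = model_sock_add(false);
    return ret;
}

/* Assumed variant: tolerate -EAFNOSUPPORT from the IPv6 socket and
 * still try the IPv4 socket when one is wanted. */
static int open_tolerant(bool ipv6, bool metadata)
{
    bool want6 = ipv6 || metadata;
    bool want4 = !ipv6 || metadata;
    int ret = 0;

    if (want6) {
        ret = model_sock_add(true);
        if (ret < 0 && ret != -EAFNOSUPPORT)
            want4 = false;  /* unrelated failure: give up */
    }
    if (want4)
        ret = model_sock_add(false);
    return ret;
}

/* Usage example: the metadata-enabled, IPv4-remote case from the
 * report, with IPv6 disabled at boot. */
static void model_self_check(void)
{
    ipv6_disabled = true;
    assert(open_current(false, true) == -EAFNOSUPPORT);
    assert(open_tolerant(false, true) == 0);
}
```

  In this model, a tunnel with an IPv6 remote and no metadata still
  fails under open_tolerant() when IPv6 is disabled, which is the
  expected outcome for that configuration.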

  * This issue exists in all versions of the kernel up to the
    present mainline and net-next trees.

  * Testing with a trivial patch to remove that and make
    similar changes to those made for vxlan (which had the
    same issue) has been successful. Patches for various
    versions to be attached here soon.

  * Example Versions (bug exists in all versions of Ubuntu
    and mainline):

  $ uname -r
  4.4.0-135-generic

  $ lsb_release -rd
  Description:  Ubuntu 16.04.5 LTS
  Release:      16.04

  $ dpkg -l | grep openvswitch-switch
  ii  openvswitch-switch                   2.5.4-0ubuntu0.16.04.1
