Now that we have a means to perform a UDP socket lookup without taking
a reference, it is feasible to have the flow dissector crack open
UDP-encapsulated packets. Generally, we would expect that the UDP source
port or the flow label in IPv6 would contain enough entropy about
the encapsulated flow. However, there will be cases, such as a static
UDP tunnel with fixed ports, where dissecting the encapsulated packet
is valuable.

The model here is similar to that implemented for UDP GRO. A
tunnel implementation (e.g. GUE) may set a flow_dissect function
in the udp_sk. In __skb_flow_dissect a case has been added for
UDP to check if there is a socket with flow_dissect set. If there
is, the function is called. The (per tunnel implementation)
function can parse the encapsulation headers and return the
next protocol for __skb_flow_dissect to process, along with its
position in nhoff.
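
As a rough userspace illustration of the callback model described above
(not the kernel code in this series), the sketch below dispatches on the
UDP destination port and lets a per-tunnel function skip a simplified
GUE-style header. All toy_* names, the array-based "socket" table, and
the return convention (next protocol, or -1 to stop) are assumptions
made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a UDP socket carrying a dissect hook. */
struct toy_udp_sock {
	uint16_t port;	/* listening port, host byte order */
	/* Returns the next protocol and advances *nhoff, or -1 to stop. */
	int (*flow_dissect)(const uint8_t *pkt, size_t len, size_t *nhoff);
};

/*
 * Per-tunnel function for a simplified GUE-style header:
 * byte 0 = version(2 bits) | control(1 bit) | hlen(5 bits, 32-bit words),
 * byte 1 = protocol of the encapsulated packet, bytes 2-3 = flags.
 */
int toy_gue_flow_dissect(const uint8_t *pkt, size_t len, size_t *nhoff)
{
	size_t off = *nhoff;
	size_t hdrlen;

	if (len < off + 4)
		return -1;
	if (pkt[off] & 0x20)	/* control message, nothing to dissect */
		return -1;
	hdrlen = 4 + ((size_t)(pkt[off] & 0x1f) << 2);
	if (len < off + hdrlen)
		return -1;
	*nhoff = off + hdrlen;	/* where the inner header starts */
	return pkt[off + 1];	/* next protocol for the caller */
}

/*
 * Generic UDP case: find a socket for the destination port and, if it
 * has a dissect hook, call it. *nhoff is only updated on success.
 */
int toy_udp_flow_dissect(const struct toy_udp_sock *socks, size_t nsocks,
			 const uint8_t *pkt, size_t len, size_t *nhoff)
{
	size_t off, i;
	uint16_t dport;
	int proto;

	assert(pkt != NULL && nhoff != NULL);
	off = *nhoff;
	if (len < off + 8)	/* need a full UDP header */
		return -1;
	dport = (uint16_t)((pkt[off + 2] << 8) | pkt[off + 3]);
	for (i = 0; i < nsocks; i++) {
		if (socks[i].port != dport || !socks[i].flow_dissect)
			continue;
		off += 8;	/* skip the UDP header */
		proto = socks[i].flow_dissect(pkt, len, &off);
		if (proto < 0)
			return -1;
		*nhoff = off;
		return proto;
	}
	return -1;		/* no socket wants to dissect */
}
```

A caller would continue dissection at the returned protocol and offset,
or fall back to hashing on the outer UDP ports when -1 is returned.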

Since performing a UDP lookup on every packet might be expensive,
I added a static key check to bypass the lookup if there are no
sockets with flow_dissect set. That said, doing the lookup was not
a particularly big hit anyway.
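
The bypass idea can be approximated in userspace as below. This is only
a sketch: the kernel static key patches the branch at runtime, whereas
here a plain atomic counter stands in for it. The toy_* names and the
fake lookup are hypothetical and exist only for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter standing in for the kernel static key. */
static atomic_int toy_dissect_users;

/* Counts how many (fake) socket lookups were actually performed. */
unsigned long toy_lookups;

/* Called when a socket enables flow dissection. */
void toy_dissect_enable(void)
{
	atomic_fetch_add(&toy_dissect_users, 1);
}

/* Called when such a socket goes away. */
void toy_dissect_disable(void)
{
	int prev = atomic_fetch_sub(&toy_dissect_users, 1);

	assert(prev > 0);	/* unbalanced disable would be a bug */
}

/* Stand-in for the UDP socket lookup; pretends port 6080 is bound. */
bool toy_socket_lookup(unsigned short dport)
{
	toy_lookups++;
	return dport == 6080;
}

/* Fast path: skip the lookup entirely when nobody asked for it. */
bool toy_udp_dissect_wanted(unsigned short dport)
{
	if (atomic_load(&toy_dissect_users) == 0)
		return false;
	return toy_socket_lookup(dport);
}
```

With no users registered, toy_udp_dissect_wanted() returns without
touching the lookup path, which is the behavior the static key check
is meant to provide.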

FOU/GUE was modified to perform tunnel dissection. This is enabled
on each listener socket via a netlink configuration option.

  - davem suggested that we don't need udp_flow_dissect and that
    udp{v6}_encap_needed could be used. Problem is that those are
    in respective udp.c and flow_dissector.c is in net/core. Keep
    udp_flow_dissect as more generic item.
  - Fixed Makefile issue where we were using CONFIG_NET instead of
  - Added limits in flow dissector for controlling the number of nested
    encapsulations or EHs that are dissected.
  - Added CONFIG_INET around use of inet_offloads in flow_dissector.c.

  - Fix build issues with modules that call IPv6 functions and
    CONFIG_INET is not set.
  - Fix compilation error in init'ing .flow_dissect in IPv6 UDP


Running 200 streams with TCP_RR.

GRE/GUE variable source port (baseline)
RSS distributes packets, RFS is effective
1211702 tps
147/241/442 50/90/99% latencies
87.95% CPU utilization

GRE/GUE fixed source port
All packets to one CPU, RFS is ineffective
173680 tps
1170/1377/1853 50/90/99% latencies
7.42% CPU utilization

GRE/GUE fixed source port with deep hash enabled
All packets to one CPU, but now RFS is effective
730359 tps
263/325/464 50/90/99% latencies
38.25% CPU utilization (Interrupting CPU is maxed out)

Tom Herbert (7):
  ipv6: Fix Makefile conditional to use CONFIG_INET
  flow_dissector: Limit processing of next encaps and extensions
  udp: Add socket lookup functions with noref
  udp: UDP flow dissector
  udp: Add UDP flow dissection functions to IPv4 and IPv6
  udp: UDP tunnel flow dissection infrastructure
  fou: Support flow dissection

 drivers/net/usb/cdc_mbim.c   |   4 ++
 include/linux/netdevice.h    |   5 ++
 include/linux/udp.h          |   7 +++
 include/net/flow_dissector.h |   8 +++
 include/net/ipv6.h           |  15 ++++++
 include/net/net_namespace.h  |   2 +
 include/net/udp.h            |  12 +++++
 include/net/udp_tunnel.h     |   5 ++
 include/uapi/linux/fou.h     |   1 +
 net/Makefile                 |   2 +-
 net/core/flow_dissector.c    | 122 ++++++++++++++++++++++++++++++++++++++-----
 net/ipv4/fou.c               |  68 +++++++++++++++++++++++-
 net/ipv4/udp.c               |  11 ++++
 net/ipv4/udp_offload.c       |  39 ++++++++++++++
 net/ipv4/udp_tunnel.c        |   5 ++
 net/ipv6/udp.c               |  10 ++++
 net/ipv6/udp_offload.c       |  40 +++++++++++++-
 17 files changed, 341 insertions(+), 15 deletions(-)

