From: Martin Varghese <martin.vargh...@nokia.com> There are various L3 encapsulation standards using UDP being discussed to leverage the UDP based load balancing capability of different networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.
The Bareudp tunnel provides a generic L3 encapsulation support for tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP tunnel. An example to create bareudp device to tunnel MPLS traffic is given $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ options:payload_type=0x8847 options:dst_port=6635 \ options:packet_type="legacy_l3" \ ofport_request=$bareudp_egress_port The bareudp device supports special handling for MPLS & IP as they can have multiple ethertypes. MPLS procotcol can have ethertypes ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6). The bareudp device to tunnel L3 traffic with multiple ethertypes (MPLS & IP) can be created by passing the L3 protocol name as string in the field payload_type. An example to create bareudp device to tunnel MPLS unicast & multicast traffic is given below.:: $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ options:payload_type=mpls options:dst_port=6635 \ options:packet_type="legacy_l3" Signed-off-by: Martin Varghese <martin.vargh...@nokia.com> Acked-By: Greg Rose <gvrose8...@gmail.com> Tested-by: Greg Rose <gvrose8...@gmail.com> --- Changes in v2: - Removed vport-bareudp module. Changes in v3: - Added net-next upstream commit id and message to commit message. Changes in v4: - Removed kernel datapath changes. Changes in v5: - Fixed release notes errors. - Fixed coding errors in dpif-nelink-rtnl.c. Changes in v6: - Added code to enable rx metadata collection in the kernel device. - Added version history. Changes in v7: - Fixed release notes errors. - Added Skip tests for older kernels. - Changes bareudp ovs_vport_type to 111. - Added Acked-by & tested by from gvrose8...@gmail.com Documentation/automake.mk | 1 + Documentation/faq/bareudp.rst | 62 +++++++++++++++++++++++ Documentation/faq/index.rst | 1 + Documentation/faq/releases.rst | 1 + NEWS | 5 +- datapath/linux/compat/include/linux/openvswitch.h | 9 ++++ lib/dpif-netlink-rtnl.c | 55 ++++++++++++++++++++ lib/dpif-netlink.c | 5 ++ lib/netdev-vport.c | 27 +++++++++- lib/netdev.h | 1 + ofproto/ofproto-dpif-xlate.c | 1 + tests/system-layer3-tunnels.at | 48 ++++++++++++++++++ 12 files changed, 213 insertions(+), 3 deletions(-) create mode 100644 Documentation/faq/bareudp.rst diff --git a/Documentation/automake.mk b/Documentation/automake.mk index f85c432..ea3475f 100644 --- a/Documentation/automake.mk +++ b/Documentation/automake.mk @@ -88,6 +88,7 @@ DOC_SOURCE = \ Documentation/faq/terminology.rst \ Documentation/faq/vlan.rst \ Documentation/faq/vxlan.rst \ + Documentation/faq/bareudp.rst \ Documentation/internals/index.rst \ Documentation/internals/authors.rst \ Documentation/internals/bugs.rst \ diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst new file mode 100644 index 0000000..ef43763 --- /dev/null +++ b/Documentation/faq/bareudp.rst @@ -0,0 +1,62 @@ +.. + Licensed under the Apache License, Version 2.0 (the "License"); you may + not use this file except in compliance with the License. You may obtain + a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + License for the specific language governing permissions and limitations + under the License. + + Convention for heading levels in Open vSwitch documentation: + + ======= Heading 0 (reserved for the title in a document) + ------- Heading 1 + ~~~~~~~ Heading 2 + +++++++ Heading 3 + ''''''' Heading 4 + + Avoid deeper levels because they do not render well. + +======= +Bareudp +======= + +Q: What is Bareudp? + + A: There are various L3 encapsulation standards using UDP being discussed + to leverage the UDP based load balancing capability of different + networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among + them. + + The Bareudp tunnel provides a generic L3 encapsulation support for + tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP + tunnel. + + An example to create bareudp device to tunnel MPLS traffic is given + below.:: + + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ + options:payload_type=0x8847 options:dst_port=6635 \ + options:packet_type="legacy_l3" \ + ofport_request=$bareudp_egress_port + + The bareudp device supports special handling for MPLS & IP as they can + have multiple ethertypes. + MPLS procotcol can have ethertypes ETH_P_MPLS_UC (unicast) & + ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4) + & ETH_P_IPV6 (v6). + + The bareudp device to tunnel L3 traffic with multiple ethertypes + (MPLS & IP) can be created by passing the L3 protocol name as string in + the field payload_type. An example to create bareudp device to tunnel + MPLS unicast & multicast traffic is given below.:: + + $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \ + type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \ + options:payload_type=mpls options:dst_port=6635 \ + options:packet_type="legacy_l3" diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst index 334b828..1dd2998 100644 --- a/Documentation/faq/index.rst +++ b/Documentation/faq/index.rst @@ -30,6 +30,7 @@ Open vSwitch FAQ .. toctree:: :maxdepth: 2 + bareudp configuration contributing design diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst index ac93e6e..817e6b0 100644 --- a/Documentation/faq/releases.rst +++ b/Documentation/faq/releases.rst @@ -136,6 +136,7 @@ Q: Are all features available with all datapaths? Tunnel - ERSPAN 4.18 2.10 2.10 NO Tunnel - ERSPAN-IPv6 4.18 2.10 2.10 NO Tunnel - GTP-U NO NO 2.14 NO + Tunnel - Bareudp 5.7 NO NO NO QoS - Policing YES 1.1 2.6 NO QoS - Shaping YES 1.1 NO NO sFlow YES 1.0 1.0 NO diff --git a/NEWS b/NEWS index dceda95..f42d6e2 100644 --- a/NEWS +++ b/NEWS @@ -43,7 +43,10 @@ v2.14.0 - xx xxx xxxx - Tunnels: TC Flower offload * Tunnel Local endpoint address masked match are supported. * Tunnel Romte endpoint address masked match are supported. - + - Bareudp Tunnel + * Bareudp device support is present in linux kernel from version 5.7 + * Kernel bareudp device is not backported to ovs tree. + * Userspace datapath support is not added v2.13.0 - 14 Feb 2020 --------------------- diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h index cc41bbe..40e3a25 100644 --- a/datapath/linux/compat/include/linux/openvswitch.h +++ b/datapath/linux/compat/include/linux/openvswitch.h @@ -246,6 +246,7 @@ enum ovs_vport_type { OVS_VPORT_TYPE_IP6ERSPAN = 108, /* ERSPAN tunnel. */ OVS_VPORT_TYPE_IP6GRE = 109, OVS_VPORT_TYPE_GTPU = 110, + OVS_VPORT_TYPE_BAREUDP = 111, /* Bareudp tunnel. */ __OVS_VPORT_TYPE_MAX }; @@ -308,6 +309,14 @@ enum { #define OVS_VXLAN_EXT_MAX (__OVS_VXLAN_EXT_MAX - 1) +enum { + OVS_BAREUDP_EXT_UNSPEC, + OVS_BAREUDP_EXT_MULTIPROTO_MODE, + __OVS_BAREUDP_EXT_MAX, +}; + +#define OVS_BAREUDP_EXT_MAX (__OVS_BAREUDP_EXT_MAX - 1) + /* OVS_VPORT_ATTR_OPTIONS attributes for tunnels. */ enum { diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c index fd157ce..3f6842a 100644 --- a/lib/dpif-netlink-rtnl.c +++ b/lib/dpif-netlink-rtnl.c @@ -58,6 +58,19 @@ VLOG_DEFINE_THIS_MODULE(dpif_netlink_rtnl); #define IFLA_GENEVE_UDP_ZERO_CSUM6_RX 10 #endif +#ifndef __IFLA_BAREUDP_MAX +#define IFLA_BAREUDP_MAX 0 +#endif +#if IFLA_BAREUDP_MAX < 4 +#define IFLA_BAREUDP_PORT 1 +#define IFLA_BAREUDP_ETHERTYPE 2 +#define IFLA_BAREUDP_SRCPORT_MIN 3 +#define IFLA_BAREUDP_MULTIPROTO_MODE 4 +#define IFLA_BAREUDP_RX_COLLECT_METADATA 5 +#endif + +#define BAREUDP_MPLS_SRCPORT_MIN 49153 + static const struct nl_policy rtlink_policy[] = { [IFLA_LINKINFO] = { .type = NL_A_NESTED }, }; @@ -81,6 +94,10 @@ static const struct nl_policy geneve_policy[] = { [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 }, [IFLA_GENEVE_PORT] = { .type = NL_A_U16 }, }; +static const struct nl_policy bareudp_policy[] = { + [IFLA_BAREUDP_PORT] = { .type = NL_A_U16 }, + [IFLA_BAREUDP_ETHERTYPE] = { .type = NL_A_U16 }, +}; static const char * vport_type_to_kind(enum ovs_vport_type type, @@ -113,6 +130,8 @@ vport_type_to_kind(enum ovs_vport_type type, } case OVS_VPORT_TYPE_GTPU: return NULL; + case OVS_VPORT_TYPE_BAREUDP: + return "bareudp"; case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: case OVS_VPORT_TYPE_LISP: @@ -243,6 +262,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg, return err; } +static int +dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg, + const char *kind, struct ofpbuf *reply) +{ + struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)]; + int err; + + err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp, + ARRAY_SIZE(bareudp_policy)); + if (!err) { + if ((tnl_cfg->dst_port != nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT])) + || (tnl_cfg->payload_ethertype + != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) { + err = EINVAL; + } + } + return err; +} static int dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg, @@ -275,6 +312,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg, case OVS_VPORT_TYPE_GENEVE: err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply); break; + case OVS_VPORT_TYPE_BAREUDP: + err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply); + break; case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: case OVS_VPORT_TYPE_LISP: @@ -357,6 +397,20 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, nl_msg_put_u8(&request, IFLA_GENEVE_UDP_ZERO_CSUM6_RX, 1); nl_msg_put_be16(&request, IFLA_GENEVE_PORT, tnl_cfg->dst_port); break; + case OVS_VPORT_TYPE_BAREUDP: + nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE, + tnl_cfg->payload_ethertype); + if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) || + (tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS_MCAST))) { + nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN, + BAREUDP_MPLS_SRCPORT_MIN); + } + nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port); + if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) { + nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE); + } + nl_msg_put_flag(&request, IFLA_BAREUDP_RX_COLLECT_METADATA); + break; case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: case OVS_VPORT_TYPE_LISP: @@ -470,6 +524,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type) case OVS_VPORT_TYPE_ERSPAN: case OVS_VPORT_TYPE_IP6ERSPAN: case OVS_VPORT_TYPE_IP6GRE: + case OVS_VPORT_TYPE_BAREUDP: return dpif_netlink_rtnl_destroy(name); case OVS_VPORT_TYPE_NETDEV: case OVS_VPORT_TYPE_INTERNAL: diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index 7da4fb5..0062a25 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -749,6 +749,9 @@ get_vport_type(const struct dpif_netlink_vport *vport) case OVS_VPORT_TYPE_GTPU: return "gtpu"; + case OVS_VPORT_TYPE_BAREUDP: + return "bareudp"; + case OVS_VPORT_TYPE_UNSPEC: case __OVS_VPORT_TYPE_MAX: break; @@ -784,6 +787,8 @@ netdev_to_ovs_vport_type(const char *type) return OVS_VPORT_TYPE_GRE; } else if (!strcmp(type, "gtpu")) { return OVS_VPORT_TYPE_GTPU; + } else if (!strcmp(type, "bareudp")) { + return OVS_VPORT_TYPE_BAREUDP; } else { return OVS_VPORT_TYPE_UNSPEC; } diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index 0252b61..c86d420 100644 --- a/lib/netdev-vport.c +++ b/lib/netdev-vport.c @@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev) return (class->get_config == get_tunnel_config && (!strcmp("geneve", type) || !strcmp("vxlan", type) || !strcmp("lisp", type) || !strcmp("stt", type) || - !strcmp("gtpu", type))); + !strcmp("gtpu", type) || !strcmp("bareudp",type))); } const char * @@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_) dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT); } else if (!strcmp(type, "gtpu")) { dev->tnl_cfg.dst_port = port ? htons(port) : htons(GTPU_DST_PORT); + } else if (!strcmp(type, "bareudp")) { + dev->tnl_cfg.dst_port = htons(port); } dev->tnl_cfg.dont_fragment = true; @@ -438,6 +440,8 @@ tunnel_supported_layers(const char *type, return TNL_L2 | TNL_L3; } else if (!strcmp(type, "gtpu")) { return TNL_L3; + } else if (!strcmp(type, "bareudp")) { + return TNL_L3; } else { return TNL_L2; } @@ -745,6 +749,16 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp) goto out; } } + } else if (!strcmp(node->key, "payload_type")) { + if (strcmp(node->key, "mpls")) { + tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS); + tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE); + } else if ((strcmp(node->key, "ip"))) { + tnl_cfg.payload_ethertype = htons(ETH_TYPE_IP); + tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE); + } else { + tnl_cfg.payload_ethertype = htons(atoi(node->value)); + } } else { ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name, type, node->key); @@ -917,7 +931,8 @@ get_tunnel_config(const struct netdev *dev, struct smap *args) (!strcmp("vxlan", type) && dst_port != VXLAN_DST_PORT) || (!strcmp("lisp", type) && dst_port != LISP_DST_PORT) || (!strcmp("stt", type) && dst_port != STT_DST_PORT) || - (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT)) { + (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT) || + !strcmp("bareudp", type)) { smap_add_format(args, "dst_port", "%d", dst_port); } } @@ -1243,6 +1258,14 @@ netdev_vport_tunnel_register(void) }, {{NULL, NULL, 0, 0}} }, + { "udp_sys", + { + TUNNEL_FUNCTIONS_COMMON, + .type = "bareudp", + .get_ifindex = NETDEV_VPORT_GET_IFINDEX, + }, + {{NULL, NULL, 0, 0}} + }, }; static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; diff --git a/lib/netdev.h b/lib/netdev.h index fb50730..b705a9e 100644 --- a/lib/netdev.h +++ b/lib/netdev.h @@ -107,6 +107,7 @@ struct netdev_tunnel_config { bool out_key_flow; ovs_be64 out_key; + ovs_be16 payload_ethertype; ovs_be16 dst_port; bool ip_src_flow; diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index e0ede2c..6e07960 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -3573,6 +3573,7 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx, struct eth_addr dmac, case OVS_VPORT_TYPE_VXLAN: case OVS_VPORT_TYPE_GENEVE: case OVS_VPORT_TYPE_GTPU: + case OVS_VPORT_TYPE_BAREUDP: nw_proto = IPPROTO_UDP; break; case OVS_VPORT_TYPE_LISP: diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at index 1232964..8423add 100644 --- a/tests/system-layer3-tunnels.at +++ b/tests/system-layer3-tunnels.at @@ -152,3 +152,51 @@ AT_CHECK([tail -1 stdout], [0], OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([layer3 - ping over MPLS Bareudp]) +OVS_CHECK_MIN_KERNEL(5, 7) +OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])]) +ADD_NAMESPACES(at_ns0, at_ns1) + +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "36:b1:ee:7c:01:01") +ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24", "36:b1:ee:7c:01:02") + +ADD_OVS_TUNNEL([bareudp], [br0], [at_bareudp0], [8.1.1.3], [8.1.1.2/24], + [ options:local_ip=8.1.1.2 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635]) + +ADD_OVS_TUNNEL([bareudp], [br1], [at_bareudp1], [8.1.1.2], [8.1.1.3/24], + [options:local_ip=8.1.1.3 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635]) + +AT_DATA([flows0.txt], [dnl +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp0 +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp0 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:01->dl_dst,set_field:36:b1:ee:7c:01:02->dl_src,output:ovs-p0 +table=0,priority=10 actions=normal +]) + +AT_DATA([flows1.txt], [dnl +table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp1 +table=0,priority=100,dl_type=0x8847 in_port=at_bareudp1 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:02->dl_dst,set_field:36:b1:ee:7c:01:01->dl_src,output:ovs-p1 +table=0,priority=10 actions=normal +]) + +AT_CHECK([ip link add patch0 type veth peer name patch1]) +on_exit 'ip link del patch0' + +AT_CHECK([ip link set dev patch0 up]) +AT_CHECK([ip link set dev patch1 up]) +AT_CHECK([ovs-vsctl add-port br0 patch0]) +AT_CHECK([ovs-vsctl add-port br1 patch1]) + + +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br0 flows0.txt]) +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br1 flows1.txt]) + +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP -- 1.8.3.1 _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev