On 9 Dec 2020, at 14:23, Martin Varghese wrote:
On Tue, Dec 08, 2020 at 02:42:42PM +0100, Eelco Chaudron wrote:

Hi Martin,

Did some basic testing, and it all works fine. See some comments inline
below.

Cheers,

Eelco

On 7 Dec 2020, at 4:32, Martin Varghese wrote:

From: Martin Varghese <[email protected]>

There are various L3 encapsulation standards using UDP being discussed to
leverage the UDP based load balancing capability of different networks.
MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among them.

The Bareudp tunnel provides a generic L3 encapsulation support for
tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
tunnel.

An example to create bareudp device to tunnel MPLS traffic is given

  $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
    type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
    options:payload_type=0x8847 options:dst_port=6635 \
    options:packet_type="legacy_l3" \
    ofport_request=$bareudp_egress_port

The bareudp device supports special handling for MPLS & IP as they can
have multiple ethertypes. MPLS procotcol can have ethertypes
ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can have
ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).

The bareudp device to tunnel L3 traffic with multiple ethertypes
(MPLS & IP) can be created by passing the L3 protocol name as string in the
field payload_type. An example to create bareudp device to tunnel
MPLS unicast & multicast traffic is given below.::

  $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
    type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
    options:payload_type=mpls options:dst_port=6635 \
    options:packet_type="legacy_l3"

Signed-off-by: Martin Varghese <[email protected]>
Acked-By: Greg Rose <[email protected]>
Tested-by: Greg Rose <[email protected]>
---
Changes in v2:
    - Removed vport-bareudp module.

Changes in v3:
    - Added net-next upstream commit id and message to commit message.

Changes in v4:
    - Removed kernel datapath changes.
Changes in v5:
    - Fixed release notes errors.
    - Fixed coding errors in dpif-nelink-rtnl.c.

Changes in v6:
    - Added code to enable rx metadata collection in the kernel device.
    - Added version history.

Changes in v7
    - Fixed release notes errors.
    - Added Skip tests for older kernels.
    - Changes bareudp ovs_vport_type to 111.
    - Added Acked-by & tested by from [email protected]

Changes in v8
    - The code added in v6 to enable rx metadata collection in the kernel
      device is removed. This flag was never added to any of the kernel
      release. The rx metadata collection is always enabled in kernel
      bareudp module.

 Documentation/automake.mk                     |  1 +
 Documentation/faq/bareudp.rst                 | 62 +++++++++++++++++++
 Documentation/faq/index.rst                   |  1 +
 Documentation/faq/releases.rst                |  1 +
 NEWS                                          |  5 +-
 .../linux/compat/include/linux/openvswitch.h  |  9 +++
 lib/dpif-netlink-rtnl.c                       | 53 ++++++++++++++++
 lib/dpif-netlink.c                            |  5 ++
 lib/netdev-vport.c                            | 27 +++++++-
 lib/netdev.h                                  |  1 +
 ofproto/ofproto-dpif-xlate.c                  |  1 +
 tests/system-layer3-tunnels.at                | 48 ++++++++++++++
 12 files changed, 211 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/faq/bareudp.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index f85c4320e..ea3475f35 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -88,6 +88,7 @@ DOC_SOURCE = \
 	Documentation/faq/terminology.rst \
 	Documentation/faq/vlan.rst \
 	Documentation/faq/vxlan.rst \
+	Documentation/faq/bareudp.rst \
 	Documentation/internals/index.rst \
 	Documentation/internals/authors.rst \
 	Documentation/internals/bugs.rst \
diff --git a/Documentation/faq/bareudp.rst b/Documentation/faq/bareudp.rst
new file mode 100644
index 000000000..ef437631c
--- /dev/null
+++ b/Documentation/faq/bareudp.rst
@@ -0,0 +1,62 @@
+..
+      Licensed under the Apache License, Version 2.0 (the "License"); you may
+      not use this file except in compliance with the License. You may obtain
+      a copy of the License at
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+      Convention for heading levels in Open vSwitch documentation:
+
+      =======  Heading 0 (reserved for the title in a document)
+      -------  Heading 1
+      ~~~~~~~  Heading 2
+      +++++++  Heading 3
+      '''''''  Heading 4
+
+      Avoid deeper levels because they do not render well.
+
+=======
+Bareudp
+=======
+
+Q: What is Bareudp?
+
+    A: There are various L3 encapsulation standards using UDP being discussed
+    to leverage the UDP based load balancing capability of different
+    networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is one among
+    them.
+
+    The Bareudp tunnel provides a generic L3 encapsulation support for
+    tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
+    tunnel.
+
+    An example to create bareudp device to tunnel MPLS traffic is given
+    below.::
+
+       $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
+         type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
+         options:payload_type=0x8847 options:dst_port=6635 \

I think it would be good to explain what the payload_type is used for as
it's not clear from this text, and I had to read the kernel code to
understand. Maybe add an example on how to redirect traffic to this tunnel,
as it will only accept the specific ethertype.

I will explain the payload_type. One can refer to the tests to see how the
traffic is directed towards the tunnel. I propose not to mention that here.
But if you insist we can add a sample rule which has an action to push an
MPLS label and output to a bareudp port, or do you suggest something else?
That's what I suggest. End-users do not tend to look at the test cases, so a good example here would be appreciated, also because the payload_type dictates the correct header.
+         options:packet_type="legacy_l3" \

Looking at the code, it seems we only support packet_type=legacy_l3 (or
ptap), so we could remove it in the examples as it will default to L3.

Yes we could remove it

+         ofport_request=$bareudp_egress_port
+

Maybe also the ofport_request option can be removed, as it adds no value
here.

Noted

+    The bareudp device supports special handling for MPLS & IP as they can
+    have multiple ethertypes.
+    MPLS procotcol can have ethertypes ETH_P_MPLS_UC (unicast) &
+    ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes ETH_P_IP (v4)
+    & ETH_P_IPV6 (v6).
+
+    The bareudp device to tunnel L3 traffic with multiple ethertypes
+    (MPLS & IP) can be created by passing the L3 protocol name as string in
+    the field payload_type. An example to create bareudp device to tunnel
+    MPLS unicast & multicast traffic is given below.::
+
+       $ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
+         type=bareudp options:remote_ip=2.1.1.3 options:local_ip=2.1.1.2 \
+         options:payload_type=mpls options:dst_port=6635 \
+         options:packet_type="legacy_l3"

Same as above on packet_type.

Noted.

Maybe also add an example for IP over UDP?

Yes. I will add one.

diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst
index 334b828b2..1dd29986a 100644
--- a/Documentation/faq/index.rst
+++ b/Documentation/faq/index.rst
@@ -30,6 +30,7 @@ Open vSwitch FAQ
 .. toctree::
    :maxdepth: 2
+   bareudp
    configuration
    contributing
    design
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 3623e3f40..68cbf1dbc 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -138,6 +138,7 @@ Q: Are all features available with all datapaths?
     Tunnel - ERSPAN            4.18           2.10           2.10      NO
     Tunnel - ERSPAN-IPv6       4.18           2.10           2.10      NO
     Tunnel - GTP-U             NO             NO             2.14      NO
+    Tunnel - Bareudp           5.7            NO             NO        NO
     QoS - Policing             YES            1.1            2.6       NO
     QoS - Shaping              YES            1.1            NO        NO
     sFlow                      YES            1.0            1.0       NO
diff --git a/NEWS b/NEWS
index 7e291a180..e3bc34a3f 100644
--- a/NEWS
+++ b/NEWS
@@ -75,7 +75,10 @@ v2.14.0 - 17 Aug 2020
    - GTP-U Tunnel Protocol
      * Add two new fields: tun_gtpu_flags, tun_gtpu_msgtype.
      * Only support for userspace datapath.
-
+   - Bareudp Tunnel
+     * Bareudp device support is present in linux kernel from version 5.7
+     * Kernel bareudp device is not backported to ovs tree.
+     * Userspace datapath support is not added

Any plans on adding this?

It will come as a subsequent patch

 static const char *
 vport_type_to_kind(enum ovs_vport_type type,
@@ -113,6 +129,8 @@ vport_type_to_kind(enum ovs_vport_type type,
         }
     case OVS_VPORT_TYPE_GTPU:
         return NULL;
+    case OVS_VPORT_TYPE_BAREUDP:
+        return "bareudp";
     case OVS_VPORT_TYPE_NETDEV:
     case OVS_VPORT_TYPE_INTERNAL:
     case OVS_VPORT_TYPE_LISP:
@@ -243,6 +261,24 @@ dpif_netlink_rtnl_geneve_verify(const struct netdev_tunnel_config *tnl_cfg,
     return err;
 }
+static int
+dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config *tnl_cfg,
+                                 const char *kind, struct ofpbuf *reply)
+{
+    struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)];
+    int err;
+
+    err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp,
+                            ARRAY_SIZE(bareudp_policy));
+    if (!err) {
+        if ((tnl_cfg->dst_port != nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT]))
+            || (tnl_cfg->payload_ethertype
+                != nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) {
+            err = EINVAL;
+        }
+    }
+    return err;
+}
 static int
 dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
@@ -275,6 +311,9 @@ dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
     case OVS_VPORT_TYPE_GENEVE:
         err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply);
         break;
+    case OVS_VPORT_TYPE_BAREUDP:
+        err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply);
+        break;
     case OVS_VPORT_TYPE_NETDEV:
     case OVS_VPORT_TYPE_INTERNAL:
     case OVS_VPORT_TYPE_LISP:
@@ -357,6 +396,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg,
         nl_msg_put_u8(&request, IFLA_GENEVE_UDP_ZERO_CSUM6_RX, 1);
         nl_msg_put_be16(&request, IFLA_GENEVE_PORT, tnl_cfg->dst_port);
         break;
+    case OVS_VPORT_TYPE_BAREUDP:
+        nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE,
+                        tnl_cfg->payload_ethertype);
+        if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) ||
+            (tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS_MCAST))) {
+            nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN,
+                           BAREUDP_MPLS_SRCPORT_MIN);

So why do we set this for MPLS only? All other proposals have the same min
port guidance:

 - https://tools.ietf.org/html/draft-xu-intarea-ip-in-udp-09
 - https://tools.ietf.org/html/rfc8086

We could pass the ephemeral starting port for all the payload types.
I think that would be good. Please add it for all; we can add an exception if needed, but I do not see a use case for it now.
+        }
+        nl_msg_put_be16(&request, IFLA_BAREUDP_PORT, tnl_cfg->dst_port);
+        if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) {
+            nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE);
+        }
+        break;
     case OVS_VPORT_TYPE_NETDEV:
     case OVS_VPORT_TYPE_INTERNAL:
     case OVS_VPORT_TYPE_LISP:
@@ -470,6 +522,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const char *type)
     case OVS_VPORT_TYPE_ERSPAN:
     case OVS_VPORT_TYPE_IP6ERSPAN:
     case OVS_VPORT_TYPE_IP6GRE:
+    case OVS_VPORT_TYPE_BAREUDP:
         return dpif_netlink_rtnl_destroy(name);
     case OVS_VPORT_TYPE_NETDEV:
     case OVS_VPORT_TYPE_INTERNAL:
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 2f881e4fa..ceb56c685 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -749,6 +749,9 @@ get_vport_type(const struct dpif_netlink_vport *vport)
     case OVS_VPORT_TYPE_GTPU:
         return "gtpu";
+    case OVS_VPORT_TYPE_BAREUDP:
+        return "bareudp";
+
     case OVS_VPORT_TYPE_UNSPEC:
     case __OVS_VPORT_TYPE_MAX:
         break;
@@ -784,6 +787,8 @@ netdev_to_ovs_vport_type(const char *type)
         return OVS_VPORT_TYPE_GRE;
     } else if (!strcmp(type, "gtpu")) {
         return OVS_VPORT_TYPE_GTPU;
+    } else if (!strcmp(type, "bareudp")) {
+        return OVS_VPORT_TYPE_BAREUDP;
     } else {
         return OVS_VPORT_TYPE_UNSPEC;
     }
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 0252b61de..c86d420d7 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev *dev)
     return (class->get_config == get_tunnel_config &&
             (!strcmp("geneve", type) || !strcmp("vxlan", type) ||
              !strcmp("lisp", type) || !strcmp("stt", type) ||
-             !strcmp("gtpu", type)));
+             !strcmp("gtpu", type) || !strcmp("bareudp",type)));
 }
 const char *
@@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_)
         dev->tnl_cfg.dst_port = port ? htons(port) : htons(STT_DST_PORT);
     } else if (!strcmp(type, "gtpu")) {
         dev->tnl_cfg.dst_port = port ? htons(port) : htons(GTPU_DST_PORT);
+    } else if (!strcmp(type, "bareudp")) {
+        dev->tnl_cfg.dst_port = htons(port);
     }
     dev->tnl_cfg.dont_fragment = true;
@@ -438,6 +440,8 @@ tunnel_supported_layers(const char *type,
         return TNL_L2 | TNL_L3;
     } else if (!strcmp(type, "gtpu")) {
         return TNL_L3;
+    } else if (!strcmp(type, "bareudp")) {
+        return TNL_L3;
     } else {
         return TNL_L2;
     }
@@ -745,6 +749,16 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
                 goto out;
             }
         }
+        } else if (!strcmp(node->key, "payload_type")) {
+            if (strcmp(node->key, "mpls")) {
+                 tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS);
+                 tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE);
+            } else if ((strcmp(node->key, "ip"))) {
+                 tnl_cfg.payload_ethertype = htons(ETH_TYPE_IP);
+                 tnl_cfg.exts |= (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE);
+            } else {
+                 tnl_cfg.payload_ethertype = htons(atoi(node->value));

As the kernel only supports IPv4, IPv6, MPLS, and MPLS_MULTI, why not return
an error here if it's not one of these four?

The kernel accepts all the ethertypes, even the custom ones. The bareudp
device can be used to tunnel a proprietary protocol with a custom ethertype
(eg 0x123). But I agree that we must return an error if we are passed an
unknown string (other than ip or mpls)
Good catch, taking any number will make it future proof :) As you mentioned, I think it would be good to make sure it’s really a number that is passed.
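To make the "really a number" point concrete, here is a rough sketch of what stricter parsing could look like. It is plain C built on strtoul(), not OVS code: the helper name parse_payload_type() and the SKETCH_ETH_TYPE_* constants are made up for illustration and are not part of the patch. Note that it matches on the option value, and rejecting 0 is an assumption on my side.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative constants only; OVS has its own ETH_TYPE_* definitions. */
#define SKETCH_ETH_TYPE_IP   0x0800
#define SKETCH_ETH_TYPE_MPLS 0x8847

/* Hypothetical helper, not part of the patch.  Accepts "mpls", "ip", or a
 * number that fits in 16 bits.  On success stores the ethertype (host byte
 * order) in '*ethertype', sets '*multiproto' for the named protocols, and
 * returns 0.  Returns EINVAL for any other string. */
static int
parse_payload_type(const char *value, uint16_t *ethertype, bool *multiproto)
{
    char *end;
    unsigned long n;

    if (!strcmp(value, "mpls")) {
        *ethertype = SKETCH_ETH_TYPE_MPLS;
        *multiproto = true;
        return 0;
    }
    if (!strcmp(value, "ip")) {
        *ethertype = SKETCH_ETH_TYPE_IP;
        *multiproto = true;
        return 0;
    }

    errno = 0;
    n = strtoul(value, &end, 0);    /* Base 0 accepts 0x8847 and 34887. */
    if (errno || end == value || *end != '\0') {
        return EINVAL;              /* Not a number, or trailing junk. */
    }
    if (n == 0 || n > UINT16_MAX) {
        return EINVAL;              /* Assumption: 0 is never valid. */
    }

    *ethertype = (uint16_t) n;
    *multiproto = false;
    return 0;
}
```

With something along these lines, a custom ethertype such as 0x123 is still accepted, while a string like "nsh" or "0x8847abc" results in an error instead of silently becoming 0 as it would with atoi().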
+            }
         } else {
             ds_put_format(&errors, "%s: unknown %s argument '%s'\n",
                           name, type, node->key);
@@ -917,7 +931,8 @@ get_tunnel_config(const struct netdev *dev, struct smap *args)
                 (!strcmp("vxlan", type) && dst_port != VXLAN_DST_PORT) ||
                 (!strcmp("lisp", type) && dst_port != LISP_DST_PORT) ||
                 (!strcmp("stt", type) && dst_port != STT_DST_PORT) ||
-                (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT)) {
+                (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT) ||
+                 !strcmp("bareudp", type)) {
                 smap_add_format(args, "dst_port", "%d", dst_port);
             }
         }
@@ -1243,6 +1258,14 @@ netdev_vport_tunnel_register(void)
           },
           {{NULL, NULL, 0, 0}}
         },
+        { "udp_sys",
+          {
+              TUNNEL_FUNCTIONS_COMMON,
+              .type = "bareudp",
+              .get_ifindex = NETDEV_VPORT_GET_IFINDEX,
+          },
+          {{NULL, NULL, 0, 0}}
+        },
     };
     static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
diff --git a/lib/netdev.h b/lib/netdev.h
index fb5073056..b705a9e56 100644
--- a/lib/netdev.h
+++ b/lib/netdev.h
@@ -107,6 +107,7 @@ struct netdev_tunnel_config {
     bool out_key_flow;
     ovs_be64 out_key;
+    ovs_be16 payload_ethertype;
     ovs_be16 dst_port;
     bool ip_src_flow;
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 11aa20754..7eeff14f6 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3573,6 +3573,7 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx, struct eth_addr dmac,
     case OVS_VPORT_TYPE_VXLAN:
     case OVS_VPORT_TYPE_GENEVE:
     case OVS_VPORT_TYPE_GTPU:
+    case OVS_VPORT_TYPE_BAREUDP:
         nw_proto = IPPROTO_UDP;
         break;
     case OVS_VPORT_TYPE_LISP:
diff --git a/tests/system-layer3-tunnels.at b/tests/system-layer3-tunnels.at
index 1232964bb..8423add2b 100644
--- a/tests/system-layer3-tunnels.at
+++ b/tests/system-layer3-tunnels.at

These tests also get executed for the userspace test set,
system-userspace-testsuite.at, which will fail, so it needs to be excluded.

The check_min_kernel takes care of it? I see these tests are getting skipped
for userspace tests
My bad, looked at the wrong macro definition :) It’s all good for now.
@@ -152,3 +152,51 @@ AT_CHECK([tail -1 stdout], [0],
 OVS_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([layer3 - ping over MPLS Bareudp])
+OVS_CHECK_MIN_KERNEL(5, 7)
+OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])])
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "36:b1:ee:7c:01:01")
+ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24", "36:b1:ee:7c:01:02")
+
+ADD_OVS_TUNNEL([bareudp], [br0], [at_bareudp0], [8.1.1.3], [8.1.1.2/24],
+               [ options:local_ip=8.1.1.2 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635])
+
+ADD_OVS_TUNNEL([bareudp], [br1], [at_bareudp1], [8.1.1.2], [8.1.1.3/24],
+               [options:local_ip=8.1.1.3 options:packet_type="legacy_l3" options:payload_type=mpls options:dst_port=6635])
+
+AT_DATA([flows0.txt], [dnl
+table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp0
+table=0,priority=100,dl_type=0x8847 in_port=at_bareudp0 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:01->dl_dst,set_field:36:b1:ee:7c:01:02->dl_src,output:ovs-p0
+table=0,priority=10 actions=normal
+])

Maybe it would be good to also have an IP test case?

I will add

+AT_DATA([flows1.txt], [dnl
+table=0,priority=100,dl_type=0x0800 actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp1
+table=0,priority=100,dl_type=0x8847 in_port=at_bareudp1 actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:02->dl_dst,set_field:36:b1:ee:7c:01:01->dl_src,output:ovs-p1
+table=0,priority=10 actions=normal
+])
+
+AT_CHECK([ip link add patch0 type veth peer name patch1])
+on_exit 'ip link del patch0'
+
+AT_CHECK([ip link set dev patch0 up])
+AT_CHECK([ip link set dev patch1 up])
+AT_CHECK([ovs-vsctl add-port br0 patch0])
+AT_CHECK([ovs-vsctl add-port br1 patch1])
+
+
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br0 flows0.txt])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br1 flows1.txt])
+
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
--
2.18.4

Can you also update the vswitchd/ovs-vswitchd.conf.db.5 man page with the
new tunnel and options?

I will add that.
Thanks, looking forward to your next rev.

//Eelco
