On 10/11/22 08:11, Nobuhiro MIKI wrote:
> SRv6 (Segment Routing IPv6) tunnel vport is responsible
> for encapsulation and decapsulation the inner packets with
> IPv6 header and an extended header called SRH
> (Segment Routing Header). See spec in:
>
> https://datatracker.ietf.org/doc/html/rfc8754
>
> This patch implements SRv6 tunneling in userspace datapath.
> It uses `remote_ip` and `local_ip` options as with existing
> tunnel protocols. It also adds a dedicated `srv6_segs` option
> to define a sequence of routers called segment list.
>
> Signed-off-by: Nobuhiro MIKI <[email protected]>
Hi. Thanks for the patch and sorry for the late reply.
See some comments inline.
Best regards, Ilya Maximets.
> --
> v5:
> * rebased on latest master
> v4:
> * fix alignment on cast
> v3:
> * fix alignment on cast
> v2:
> * fix pointer arithmetic
> ---
> Documentation/faq/configuration.rst | 14 +++
> Documentation/faq/releases.rst | 1 +
> NEWS | 2 +
> include/linux/openvswitch.h | 1 +
> lib/dpif-netlink-rtnl.c | 5 +
> lib/dpif-netlink.c | 5 +
> lib/netdev-native-tnl.c | 141 ++++++++++++++++++++++++++++
> lib/netdev-native-tnl.h | 13 +++
> lib/netdev-vport.c | 52 ++++++++++
> lib/netdev.h | 4 +
> lib/odp-util.c | 20 ++++
> lib/packets.h | 16 ++++
> lib/tnl-ports.c | 100 ++++++++++++--------
> ofproto/ofproto-dpif-xlate.c | 24 +++--
> tests/tunnel-push-pop-ipv6.at | 23 +++++
> tests/tunnel.at | 56 +++++++++++
> 16 files changed, 429 insertions(+), 48 deletions(-)
>
> diff --git a/Documentation/faq/configuration.rst
> b/Documentation/faq/configuration.rst
> index dc6c92446..f879c0920 100644
> --- a/Documentation/faq/configuration.rst
> +++ b/Documentation/faq/configuration.rst
> @@ -238,6 +238,20 @@ Q: Does Open vSwitch support GTP-U?
> set int gtpu0 type=gtpu options:key=<teid> \
> options:remote_ip=172.31.1.1
>
> +Q: Does Open vSwitch support SRv6?
> +
> + A: Yes. Starting with version 3.1, the Open vSwitch userspace
> + datapath supports SRv6 (Segment Routing over IPv6). The following
> + example shows tunneling to fc00:300::1 via fc00:100::1 and fc00:200::1.
> +
> + ::
> +
> + $ ovs-vsctl add-br br0
> + $ ovs-vsctl add-port br0 srv6_0 -- \
> + set int srv6_0 type=srv6 \
> + options:remote_ip=fc00:300::1 \
> + options:srv6_segs="fc00:100::1,fc00:200::1,fc00:300::1"
> +
> Q: How do I connect two bridges?
>
> A: First, why do you want to do this? Two connected bridges are not much
> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
> index ac0001cd5..d6d84a9ee 100644
> --- a/Documentation/faq/releases.rst
> +++ b/Documentation/faq/releases.rst
> @@ -152,6 +152,7 @@ Q: Are all features available with all datapaths?
> Tunnel - ERSPAN 4.18 2.10 2.10 NO
> Tunnel - ERSPAN-IPv6 4.18 2.10 2.10 NO
> Tunnel - GTP-U NO NO 2.14 NO
> + Tunnel - SRv6 NO NO 3.1 NO
> Tunnel - Bareudp 5.7 NO NO NO
> QoS - Policing YES 1.1 2.6 NO
> QoS - Shaping YES 1.1 NO NO
> diff --git a/NEWS b/NEWS
> index ff77ee404..c3ac30b52 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -23,6 +23,8 @@ Post-v3.0.0
> bug and CVE fixes addressed since its release.
> If a user wishes to benefit from these fixes it is recommended to use
> DPDK 21.11.2.
> + - SRv6 Tunnel Protocol
> + * Only support for userspace datapath.
>
>
> v3.0.0 - 15 Aug 2022
> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
> index 8bb5abdc8..987b5cbf7 100644
> --- a/include/linux/openvswitch.h
> +++ b/include/linux/openvswitch.h
> @@ -254,6 +254,7 @@ enum ovs_vport_type {
> OVS_VPORT_TYPE_IP6GRE = 109,
> OVS_VPORT_TYPE_GTPU = 110,
> OVS_VPORT_TYPE_BAREUDP = 111, /* Bareudp tunnel. */
> + OVS_VPORT_TYPE_SRV6 = 112, /* SRv6 tunnel. */
> __OVS_VPORT_TYPE_MAX
> };
>
> diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c
> index 4fc42daed..5788294ae 100644
> --- a/lib/dpif-netlink-rtnl.c
> +++ b/lib/dpif-netlink-rtnl.c
> @@ -129,6 +129,8 @@ vport_type_to_kind(enum ovs_vport_type type,
> }
> case OVS_VPORT_TYPE_GTPU:
> return NULL;
> + case OVS_VPORT_TYPE_SRV6:
> + return "srv6";
> case OVS_VPORT_TYPE_BAREUDP:
> return "bareudp";
> case OVS_VPORT_TYPE_NETDEV:
> @@ -319,6 +321,7 @@ dpif_netlink_rtnl_verify(const struct
> netdev_tunnel_config *tnl_cfg,
> case OVS_VPORT_TYPE_LISP:
> case OVS_VPORT_TYPE_STT:
> case OVS_VPORT_TYPE_GTPU:
> + case OVS_VPORT_TYPE_SRV6:
> case OVS_VPORT_TYPE_UNSPEC:
> case __OVS_VPORT_TYPE_MAX:
> default:
> @@ -411,6 +414,7 @@ dpif_netlink_rtnl_create(const struct
> netdev_tunnel_config *tnl_cfg,
> case OVS_VPORT_TYPE_LISP:
> case OVS_VPORT_TYPE_STT:
> case OVS_VPORT_TYPE_GTPU:
> + case OVS_VPORT_TYPE_SRV6:
> case OVS_VPORT_TYPE_UNSPEC:
> case __OVS_VPORT_TYPE_MAX:
> default:
> @@ -519,6 +523,7 @@ dpif_netlink_rtnl_port_destroy(const char *name, const
> char *type)
> case OVS_VPORT_TYPE_ERSPAN:
> case OVS_VPORT_TYPE_IP6ERSPAN:
> case OVS_VPORT_TYPE_IP6GRE:
> + case OVS_VPORT_TYPE_SRV6:
> case OVS_VPORT_TYPE_BAREUDP:
> return dpif_netlink_rtnl_destroy(name);
> case OVS_VPORT_TYPE_NETDEV:
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index a620a6ec5..ad210c69a 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -919,6 +919,9 @@ get_vport_type(const struct dpif_netlink_vport *vport)
> case OVS_VPORT_TYPE_GTPU:
> return "gtpu";
>
> + case OVS_VPORT_TYPE_SRV6:
> + return "srv6";
> +
> case OVS_VPORT_TYPE_BAREUDP:
> return "bareudp";
>
> @@ -957,6 +960,8 @@ netdev_to_ovs_vport_type(const char *type)
> return OVS_VPORT_TYPE_GRE;
> } else if (!strcmp(type, "gtpu")) {
> return OVS_VPORT_TYPE_GTPU;
> + } else if (!strcmp(type, "srv6")) {
> + return OVS_VPORT_TYPE_SRV6;
> } else if (!strcmp(type, "bareudp")) {
> return OVS_VPORT_TYPE_BAREUDP;
> } else {
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index b89dfdd52..652ed6f10 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -356,6 +356,17 @@ gre_header_len(ovs_be16 flags)
> return hlen;
> }
>
> +static int
> +parse_srv6_header(struct dp_packet *packet,
> + struct flow_tnl *tnl)
> +{
> + unsigned int ulen;
> +
> + netdev_tnl_ip_extract_tnl_md(packet, tnl, &ulen);
> +
> + return ulen;
> +}
> +
> static int
> parse_gre_header(struct dp_packet *packet,
> struct flow_tnl *tnl)
> @@ -845,6 +856,136 @@ netdev_gtpu_build_header(const struct netdev *netdev,
> return 0;
> }
>
> +static void
> +srv6_build_header(struct ovs_action_push_tnl *data,
> + const struct netdev_tnl_build_header_params *params,
> + int nr_segs, const struct in6_addr *segs)
> +{
> + struct ovs_16aligned_ip6_hdr *nh6;
> + struct srv6_base_hdr *srh;
> + struct in6_addr *s;
> + ovs_be16 dl_type;
> + unsigned int hlen;
> + int i;
> +
> + ovs_assert(nr_segs > 0);
> +
> + nh6 = (struct ovs_16aligned_ip6_hdr *) eth_build_header(data, params);
> + put_16aligned_be32(&nh6->ip6_flow, htonl(6 << 28) |
> + htonl(params->flow->tunnel.ip_tos << 20));
> + nh6->ip6_hlim = params->flow->tunnel.ip_ttl;
> + nh6->ip6_nxt = IPPROTO_ROUTING;
> + memcpy(&nh6->ip6_src, params->s_ip, sizeof(ovs_be32[4]));
> + memcpy(&nh6->ip6_dst, &segs[0], sizeof(ovs_be32[4]));
> +
> +
> + srh = (struct srv6_base_hdr *) (nh6 + 1);
> + dl_type = params->flow->dl_type;
> + if (dl_type == htons(ETH_TYPE_IP)) {
> + srh->nexthdr = IPPROTO_IPIP;
> + } else if (dl_type == htons(ETH_TYPE_IPV6)) {
> + srh->nexthdr = IPPROTO_IPV6;
> + }
> + srh->type = IPV6_SRCRT_TYPE_4;
> + srh->hdrlen = 2 * nr_segs;
> + srh->segments_left = nr_segs - 1;
> + srh->last_entry = nr_segs - 1;
> + srh->flags = 0;
> + srh->tag = 0;
> +
> + s = ALIGNED_CAST(struct in6_addr *,
> + (char *) srh + sizeof(struct srv6_base_hdr));
> + for (i = 0; i < nr_segs; i++) {
> + /* Segment list is written to the header in reverse order. */
> + memcpy(s, &segs[nr_segs - i - 1], sizeof(ovs_be32[4]));
> + s++;
> + }
> +
> + hlen = IPV6_HEADER_LEN + sizeof(struct srv6_base_hdr) + 8 * srh->hdrlen;
> +
> + data->header_len += hlen;
> + data->tnl_type = OVS_VPORT_TYPE_SRV6;
> +}
> +
> +int
> +netdev_srv6_build_header(const struct netdev *netdev,
> + struct ovs_action_push_tnl *data,
> + const struct netdev_tnl_build_header_params *params)
> +{
> + struct netdev_vport *dev = netdev_vport_cast(netdev);
> + struct netdev_tunnel_config *tnl_cfg;
> +
> + ovs_mutex_lock(&dev->mutex);
> + tnl_cfg = &dev->tnl_cfg;
> +
> + if (tnl_cfg->srv6_num_segs) {
> + srv6_build_header(data, params,
> + tnl_cfg->srv6_num_segs, tnl_cfg->srv6_segs);
> + } else {
> + /*
> + * If explicit segment list setting is omitted, tunnel destination
> + * is considered to be the first segment list.
> + */
> + srv6_build_header(data, params,
> + 1, &params->flow->tunnel.ipv6_dst);
> + }
> +
> + ovs_mutex_unlock(&dev->mutex);
> +
> + return 0;
> +}
> +
> +void
> +netdev_srv6_push_header(const struct netdev *netdev OVS_UNUSED,
> + struct dp_packet *packet OVS_UNUSED,
> + const struct ovs_action_push_tnl *data OVS_UNUSED)
> +{
> + int ip_tot_size;
> +
> + netdev_tnl_push_ip_header(packet, data->header,
> + data->header_len, &ip_tot_size);
> +}
> +
> +struct dp_packet *
> +netdev_srv6_pop_header(struct dp_packet *packet)
> +{
> + struct pkt_metadata *md = &packet->md;
> + struct flow_tnl *tnl = &md->tunnel;
> + struct srv6_base_hdr *srh;
> + int hlen = ETH_HEADER_LEN + IPV6_HEADER_LEN;
We assume here that packet doesn't have any other extension
headers. Is that always true? Should we also check that
the next header is actually a routing header? And if it is
a type 4 header?
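
To make the question concrete, here is a rough, self-contained sketch of
the kind of check I have in mind. Every `sketch_`/`SKETCH_` name is made up
for illustration and is not existing OVS code. It also assumes that the
routing header immediately follows the fixed IPv6 header, which is exactly
the assumption that needs revisiting if other extension headers can come
first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IPPROTO_ROUTING 43   /* IPv6 Routing extension header. */
#define SKETCH_SRCRT_TYPE_4    4    /* Segment Routing Header (RFC 8754). */
#define SKETCH_SRH_BASE_LEN    8    /* Fixed part of the SRH, in bytes. */

/* 'outer_nxt' is the Next Header value of the outer IPv6 header and 'ext'
 * points at the first byte after that header, with 'avail' bytes readable.
 * Returns true only if a complete type 4 routing header is present. */
static bool
sketch_srh_is_valid(uint8_t outer_nxt, const uint8_t *ext, size_t avail)
{
    size_t srh_len;

    assert(ext != NULL);

    if (outer_nxt != SKETCH_IPPROTO_ROUTING) {
        return false;
    }
    if (avail < SKETCH_SRH_BASE_LEN) {
        return false;
    }
    if (ext[2] != SKETCH_SRCRT_TYPE_4) {    /* Routing Type field. */
        return false;
    }

    /* Hdr Ext Len counts 8-octet units, not including the first 8 octets. */
    srh_len = SKETCH_SRH_BASE_LEN + 8 * (size_t) ext[1];
    return srh_len <= avail;
}
```

In the pop path something like this would run before 'srh' is
dereferenced, with a failure leading to the existing 'err' label.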
> +
> + srh = ALIGNED_CAST(struct srv6_base_hdr *,
> + (char *) dp_packet_data(packet) + hlen);
> + if (srh->segments_left > 0) {
> + VLOG_WARN_RL(&err_rl, "invalid srv6 segments_left=%d\n",
> + srh->segments_left);
I suppose this means that we do not support receiving
packets with segments left in the segment list. That might be OK,
but the limitation should be documented somewhere.
Is there a reason not to accept such packets, i.e. not
really decap the packet, but swap the destination IPs?
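
For reference, the "swap the destination" step could look roughly like the
sketch below. It follows my reading of the SRH processing in RFC 8754
(decrement Segments Left, then copy Segment List[Segments Left] into the
IPv6 destination). Hop limit handling and all other checks are left out,
and `sketch_srh_advance` is a hypothetical name, not an OVS function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>

/* 'segments_left' and 'last_entry' are the SRH fields, 'segs' is the
 * segment list in on-wire order and 'dst' is the outer IPv6 destination.
 * Returns false if there is no segment left to visit or if the fields are
 * inconsistent; in that case nothing is modified. */
static bool
sketch_srh_advance(uint8_t *segments_left, uint8_t last_entry,
                   const struct in6_addr *segs, struct in6_addr *dst)
{
    assert(segments_left && segs && dst);

    if (*segments_left == 0 || *segments_left > last_entry + 1) {
        return false;
    }

    /* The next active segment becomes the new destination address. */
    (*segments_left)--;
    memcpy(dst, &segs[*segments_left], sizeof *dst);
    return true;
}
```

A packet handled this way would be forwarded instead of decapsulated, so
it may well belong outside of the pop_header callback.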
> + goto err;
> + }
> +
> + if (srh->nexthdr == IPPROTO_IPIP) {
> + packet->packet_type = htonl(PT_IPV4);
> + } else if (srh->nexthdr == IPPROTO_IPV6) {
> + packet->packet_type = htonl(PT_IPV6);
> + } else {
> + goto err;
> + }
> +
> + hlen += sizeof(struct srv6_base_hdr) + 8 * srh->hdrlen;
> +
> + pkt_metadata_init_tnl(md);
> +
> + hlen = parse_srv6_header(packet, tnl);
> +
> + dp_packet_reset_packet(packet, hlen);
> +
> + return packet;
> +
> +err:
> + dp_packet_delete(packet);
> +
> + return NULL;
> +}
> +
> struct dp_packet *
> netdev_vxlan_pop_header(struct dp_packet *packet)
> {
> diff --git a/lib/netdev-native-tnl.h b/lib/netdev-native-tnl.h
> index 22ae2ce53..b4007e3a2 100644
> --- a/lib/netdev-native-tnl.h
> +++ b/lib/netdev-native-tnl.h
> @@ -65,6 +65,19 @@ netdev_gtpu_build_header(const struct netdev *netdev,
> struct ovs_action_push_tnl *data,
> const struct netdev_tnl_build_header_params *p);
>
> +struct dp_packet *
> +netdev_srv6_pop_header(struct dp_packet *packet);
> +
> +void
> +netdev_srv6_push_header(const struct netdev *netdev,
> + struct dp_packet *packet,
> + const struct ovs_action_push_tnl *data);
> +
> +int
> +netdev_srv6_build_header(const struct netdev *netdev,
> + struct ovs_action_push_tnl *data,
> + const struct netdev_tnl_build_header_params *p);
I know that other prototypes in this file are not following that
guideline, but the type and the name should be on the same line.
Names should start from the new line in actual implementations,
and not prototypes.
> +
> void
> netdev_tnl_push_udp_header(const struct netdev *netdev,
> struct dp_packet *packet,
> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> index 3b3927865..22b159f95 100644
> --- a/lib/netdev-vport.c
> +++ b/lib/netdev-vport.c
> @@ -424,6 +424,34 @@ parse_tunnel_ip(const char *value, bool accept_mcast,
> bool *flow,
> return 0;
> }
>
> +static int
> +parse_srv6_segs(char *s, struct in6_addr *segs, uint8_t *num_segs) {
'{' should be on a separate line.
> + char *save_ptr = NULL;
> + char *token;
> +
> + if (!s) {
> + return EINVAL;
> + }
> +
> + *num_segs = 0;
> +
> + while ((token = strtok_r(s, ",", &save_ptr)) != NULL) {
> + if (*num_segs == SRV6_MAX_SEGS) {
> + return EINVAL;
> + }
> +
> + if (inet_pton(AF_INET6, token, segs) != 1) {
> + return EINVAL;
> + }
> +
> + segs++;
> + (*num_segs)++;
> + s = NULL;
> + }
> +
> + return 0;
> +}
> +
> enum tunnel_layers {
> TNL_L2 = 1 << 0, /* 1 if a tunnel type can carry Ethernet traffic.
> */
> TNL_L3 = 1 << 1 /* 1 if a tunnel type can carry L3 traffic. */
> @@ -443,6 +471,8 @@ tunnel_supported_layers(const char *type,
> return TNL_L3;
> } else if (!strcmp(type, "bareudp")) {
> return TNL_L3;
> + } else if (!strcmp(type, "srv6")) {
> + return TNL_L3;
> } else {
> return TNL_L2;
> }
> @@ -750,6 +780,17 @@ set_tunnel_config(struct netdev *dev_, const struct smap
> *args, char **errp)
> goto out;
> }
> }
> + } else if (!strcmp(node->key, "srv6_segs")) {
> + err = parse_srv6_segs(node->value,
> + tnl_cfg.srv6_segs,
> + &tnl_cfg.srv6_num_segs);
> +
> + switch (err) {
> + case EINVAL:
> + ds_put_format(&errors, "%s: bad %s 'srv6_segs'\n",
> + name, node->value);
> + break;
> + }
> } else if (!strcmp(node->key, "payload_type")) {
> if (!strcmp(node->value, "mpls")) {
> tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS);
> @@ -1290,6 +1331,17 @@ netdev_vport_tunnel_register(void)
> },
> {{NULL, NULL, 0, 0}}
> },
> + { "srv6_sys",
> + {
> + TUNNEL_FUNCTIONS_COMMON,
> + .type = "srv6",
> + .build_header = netdev_srv6_build_header,
> + .push_header = netdev_srv6_push_header,
> + .pop_header = netdev_srv6_pop_header,
> + .get_ifindex = NETDEV_VPORT_GET_IFINDEX,
> + },
> + {{NULL, NULL, 0, 0}}
> + },
>
> };
> static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
> diff --git a/lib/netdev.h b/lib/netdev.h
> index acf174927..ff207f56c 100644
> --- a/lib/netdev.h
> +++ b/lib/netdev.h
> @@ -140,6 +140,10 @@ struct netdev_tunnel_config {
> bool erspan_idx_flow;
> bool erspan_dir_flow;
> bool erspan_hwid_flow;
> +
> + uint8_t srv6_num_segs;
> + #define SRV6_MAX_SEGS 6
> + struct in6_addr srv6_segs[SRV6_MAX_SEGS];
> };
>
> void netdev_run(void);
> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index ba5be4bb3..295fd3508 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
> @@ -714,6 +714,26 @@ format_odp_tnl_push_header(struct ds *ds, struct
> ovs_action_push_tnl *data)
> ds_put_char(ds, ')');
> }
>
> + ds_put_char(ds, ')');
> + } else if (data->tnl_type == OVS_VPORT_TYPE_SRV6) {
> + const struct srv6_base_hdr *srh;
> + struct in6_addr *segs;
This should be ovs_16aligned_ip6_hdr, so you will not need
an aligned cast.
> + int i;
> + int nr_segs;
> +
> + srh = ALIGNED_CAST(struct srv6_base_hdr *,
> + (char *) l3 + IPV6_HEADER_LEN);
> + segs = ALIGNED_CAST(struct in6_addr *,
> + (char *) srh + sizeof(struct srv6_base_hdr));
> + nr_segs = srh->last_entry + 1;
> +
> + ds_put_format(ds, "srv6(");
> + ds_put_format(ds, "segments_left=%d", srh->segments_left);
> + ds_put_format(ds, ",segs=");
> + for (i = 0; i < nr_segs; i++) {
> + ds_put_format(ds, i > 0 ? "," : "");
> + ipv6_format_addr(&segs[nr_segs - i - 1], ds);
> + }
> ds_put_char(ds, ')');
This is a formatting part, but we also need a parsing part in
ovs_parse_tnl_push(). And the unit test in tests/odp.at.
With the unit test you will also notice that we need a python
parsing library updated for this new format string.
See the python/ovs/flow/odp.py and python/ovs/tests/test_odp.py.
The parsing library is relatively new, it was added in 3.0.
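
Just to illustrate the shape of the parsing counterpart, below is a
standalone sketch that reads back the string produced by the formatting
code above. It deliberately uses plain sscanf() and inet_pton() so that it
builds on its own. The real code in ovs_parse_tnl_push() would presumably
follow the conventions already used in that function, and
`sketch_parse_srv6` and `SKETCH_MAX_SEGS` are names I made up.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_SEGS 6   /* Same bound as SRV6_MAX_SEGS in the patch. */

/* Parses "srv6(segments_left=N,segs=A,B,...)".  Segments are stored in
 * 'segs' in the order they are printed.  Returns the number of segments,
 * or -1 if the string is malformed. */
static int
sketch_parse_srv6(const char *s, unsigned int *segments_left,
                  struct in6_addr segs[SKETCH_MAX_SEGS])
{
    char list[276];             /* Room for 6 textual addresses. */
    char *p = list;
    int consumed = 0;
    int n = 0;

    assert(s && segments_left && segs);

    /* %n is stored only if the closing ')' matched as well. */
    if (sscanf(s, "srv6(segments_left=%u,segs=%275[^)])%n",
               segments_left, list, &consumed) != 2
        || !consumed || *segments_left > 255) {
        return -1;
    }

    while (p) {
        char *comma = strchr(p, ',');

        if (comma) {
            *comma = '\0';
        }
        if (n == SKETCH_MAX_SEGS
            || inet_pton(AF_INET6, p, &segs[n]) != 1) {
            return -1;
        }
        n++;
        p = comma ? comma + 1 : NULL;
    }

    return n;
}
```

Since the formatter prints the list in the reverse of the on-wire order,
the real parser has to reverse it again when it writes the header.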
> } else if (data->tnl_type == OVS_VPORT_TYPE_GRE ||
> data->tnl_type == OVS_VPORT_TYPE_IP6GRE) {
> diff --git a/lib/packets.h b/lib/packets.h
> index 5bdf6e4bb..6e48cd9e6 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -710,6 +710,10 @@ char *ip_parse_cidr_len(const char *s, int *n, ovs_be32
> *ip,
> #define IPPROTO_UDPLITE 136
> #endif
>
> +#ifndef IPPROTO_IPIP
> +#define IPPROTO_IPIP 4
> +#endif
> +
> /* TOS fields. */
> #define IP_ECN_NOT_ECT 0x0
> #define IP_ECN_ECT_1 0x01
> @@ -1514,6 +1518,18 @@ BUILD_ASSERT_DECL(sizeof(struct vxlanhdr) == 8);
> #define VXLAN_F_GPE 0x4000
> #define VXLAN_HF_GPE 0x04000000
>
> +/* SRv6 protocol header */
> +#define IPV6_SRCRT_TYPE_4 4
> +struct srv6_base_hdr {
> + uint8_t nexthdr;
> + uint8_t hdrlen;
> + uint8_t type;
> + uint8_t segments_left;
> + uint8_t last_entry;
> + uint8_t flags;
> + ovs_be16 tag;
> +};
This should have a build-time assertion on the structure size
as the other structures.
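
Something along these lines, shown here as a standalone sketch with the
C11 `_Static_assert` and `uint16_t` standing in for `ovs_be16`. In
lib/packets.h it would use BUILD_ASSERT_DECL, the same way it is done for
struct vxlanhdr. The `sketch_` and `SKETCH_` names are only for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SRV6_BASE_HDR_LEN 8

/* Same field layout as 'struct srv6_base_hdr' in the patch. */
struct sketch_srv6_base_hdr {
    uint8_t nexthdr;
    uint8_t hdrlen;
    uint8_t type;
    uint8_t segments_left;
    uint8_t last_entry;
    uint8_t flags;
    uint16_t tag;               /* ovs_be16 in the real structure. */
};

/* Fails the build if padding or a field change alters the size. */
_Static_assert(sizeof(struct sketch_srv6_base_hdr)
               == SKETCH_SRV6_BASE_HDR_LEN,
               "SRH base header must be exactly 8 bytes");
```

Having the length as a named constant also lets the header length
computations use it instead of sizeof.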
> +
> /* Input values for PACKET_TYPE macros have to be in host byte order.
> * The _BE postfix indicates result is in network byte order. Otherwise
> result
> * is in host byte order. */
> diff --git a/lib/tnl-ports.c b/lib/tnl-ports.c
> index 050eafa6b..33313d8fd 100644
> --- a/lib/tnl-ports.c
> +++ b/lib/tnl-ports.c
> @@ -126,7 +126,7 @@ map_insert(odp_port_t port, struct eth_addr mac, struct
> in6_addr *addr,
> /* XXX: No fragments support. */
> match.wc.masks.nw_frag = FLOW_NW_FRAG_MASK;
>
> - /* 'tp_port' is zero for GRE tunnels. In this case it
> + /* 'tp_port' is zero for GRE and SRv6 tunnels. In this case it
> * doesn't make sense to match on UDP port numbers. */
Currently to detect that a tunnel port should receive a packet
we're matching on the protocol and the port number.
GRE is an exception, since it's another IP in IP protocol, but
it has a distinct protocol number and a special header, so it's
simple to detect.
However, with SRv6 we have a generic IPv6 header with generic
IPIP or IPV6 network protocol number. It looks like OVS will
intercept any other IPIP or IPv6-in-IPv6 packets, try to decapsulate
them as if they are SRv6 and drop them if they are not.
This might be an unwanted behavior.
I wonder if we actually need to match on the existence of the
routing extension header. For that we'll need parts of the
following change:
https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
In any case it should be documented, I think, that addition
of the SRv6 port may cause problems for IPIP traffic destined
to the node.
> if (tp_port) {
> match.wc.masks.tp_dst = OVS_BE16_MAX;
> @@ -161,26 +161,31 @@ map_insert_ipdev__(struct ip_device *ip_dev, char
> dev_name[],
> }
> }
>
> -static uint8_t
> -tnl_type_to_nw_proto(const char type[])
> +static void
> +tnl_type_to_nw_proto(const char type[], uint8_t nw_protos[2])
> {
> + nw_protos[1] = 0;
> +
> if (!strcmp(type, "geneve")) {
> - return IPPROTO_UDP;
> + nw_protos[0] = IPPROTO_UDP;
> }
> if (!strcmp(type, "stt")) {
> - return IPPROTO_TCP;
> + nw_protos[0] = IPPROTO_TCP;
> }
> if (!strcmp(type, "gre") || !strcmp(type, "erspan") ||
> !strcmp(type, "ip6erspan") || !strcmp(type, "ip6gre")) {
> - return IPPROTO_GRE;
> + nw_protos[0] = IPPROTO_GRE;
> }
> if (!strcmp(type, "vxlan")) {
> - return IPPROTO_UDP;
> + nw_protos[0] = IPPROTO_UDP;
> }
> if (!strcmp(type, "gtpu")) {
> - return IPPROTO_UDP;
> + nw_protos[0] = IPPROTO_UDP;
> + }
> + if (!strcmp(type, "srv6")) {
> + nw_protos[0] = IPPROTO_IPIP;
> + nw_protos[1] = IPPROTO_IPV6;
> }
> - return 0;
> }
>
> void
> @@ -189,34 +194,40 @@ tnl_port_map_insert(odp_port_t port, ovs_be16 tp_port,
> {
> struct tnl_port *p;
> struct ip_device *ip_dev;
> - uint8_t nw_proto;
> + uint8_t nw_protos[2];
> + int i;
>
> - nw_proto = tnl_type_to_nw_proto(type);
> - if (!nw_proto) {
> - return;
> - }
> + tnl_type_to_nw_proto(type, nw_protos);
>
> ovs_mutex_lock(&mutex);
> - LIST_FOR_EACH(p, node, &port_list) {
> - if (p->port == port && p->nw_proto == nw_proto) {
> - ovs_refcount_ref(&p->ref_cnt);
> - goto out;
> + for (i = 0; i < 2; i++) {
Might make sense to extract the body into a helper function
tnl_port_map_insert__() and call it in the loop.
Same for the deletion.
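
A simplified sketch of the structure I mean is below. The list, the mutex
and the ip_dev handling are replaced with a small static array so that the
example stands on its own, and all `sketch_` names are hypothetical. Only
the split between the per-protocol helper and the loop matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 16

struct sketch_tnl_port {
    uint32_t port;
    uint8_t nw_proto;
    unsigned int ref_cnt;
};

static struct sketch_tnl_port sketch_ports[SKETCH_MAX_PORTS];
static size_t sketch_n_ports;

/* Handles exactly one (port, nw_proto) pair: takes a reference if the
 * entry exists, otherwise creates it.  The real helper would be called
 * with the mutex already held. */
static void
sketch_port_map_insert__(uint32_t port, uint8_t nw_proto)
{
    size_t i;

    for (i = 0; i < sketch_n_ports; i++) {
        if (sketch_ports[i].port == port
            && sketch_ports[i].nw_proto == nw_proto) {
            sketch_ports[i].ref_cnt++;
            return;
        }
    }

    assert(sketch_n_ports < SKETCH_MAX_PORTS);
    sketch_ports[sketch_n_ports].port = port;
    sketch_ports[sketch_n_ports].nw_proto = nw_proto;
    sketch_ports[sketch_n_ports].ref_cnt = 1;
    sketch_n_ports++;
}

/* The caller is reduced to a loop, with no 'next' label or goto. */
static void
sketch_port_map_insert(uint32_t port, const uint8_t nw_protos[2])
{
    int i;

    for (i = 0; i < 2; i++) {
        if (nw_protos[i]) {
            sketch_port_map_insert__(port, nw_protos[i]);
        }
    }
}
```

The early return in the helper takes over the role of the 'goto next' in
the current version of the patch.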
> + if (!nw_protos[i]) {
> + goto next;
> }
> +
> + LIST_FOR_EACH (p, node, &port_list) {
> + if (p->port == port && p->nw_proto == nw_protos[i]) {
> + ovs_refcount_ref(&p->ref_cnt);
> + goto next;
> + }
> + }
> +
> + p = xzalloc(sizeof *p);
> + p->port = port;
> + p->tp_port = tp_port;
> + p->nw_proto = nw_protos[i];
> + ovs_strlcpy(p->dev_name, dev_name, sizeof p->dev_name);
> + ovs_refcount_init(&p->ref_cnt);
> + ovs_list_insert(&port_list, &p->node);
> +
> + LIST_FOR_EACH (ip_dev, node, &addr_list) {
> + map_insert_ipdev__(ip_dev, p->dev_name, p->port, p->nw_proto,
> + p->tp_port);
> + }
> +next:
> + ;
> }
>
> - p = xzalloc(sizeof *p);
> - p->port = port;
> - p->tp_port = tp_port;
> - p->nw_proto = nw_proto;
> - ovs_strlcpy(p->dev_name, dev_name, sizeof p->dev_name);
> - ovs_refcount_init(&p->ref_cnt);
> - ovs_list_insert(&port_list, &p->node);
> -
> - LIST_FOR_EACH(ip_dev, node, &addr_list) {
> - map_insert_ipdev__(ip_dev, p->dev_name, p->port, p->nw_proto,
> p->tp_port);
> - }
> -
> -out:
> ovs_mutex_unlock(&mutex);
> }
>
> @@ -261,20 +272,27 @@ tnl_port_map_delete(odp_port_t port, const char type[])
> {
> struct tnl_port *p;
> struct ip_device *ip_dev;
> - uint8_t nw_proto;
> + uint8_t nw_protos[2];
> + int i;
>
> - nw_proto = tnl_type_to_nw_proto(type);
> + tnl_type_to_nw_proto(type, nw_protos);
>
> ovs_mutex_lock(&mutex);
> - LIST_FOR_EACH_SAFE (p, node, &port_list) {
> - if (p->port == port && p->nw_proto == nw_proto &&
> - ovs_refcount_unref_relaxed(&p->ref_cnt) == 1) {
> - ovs_list_remove(&p->node);
> - LIST_FOR_EACH(ip_dev, node, &addr_list) {
> - ipdev_map_delete(ip_dev, p->tp_port, p->nw_proto);
> + for (i = 0; i < 2; i++) {
> + if (!nw_protos[i]) {
> + continue;
> + }
> +
> + LIST_FOR_EACH_SAFE (p, node, &port_list) {
> + if (p->port == port && p->nw_proto == nw_protos[i] &&
> + ovs_refcount_unref_relaxed(&p->ref_cnt) == 1) {
> + ovs_list_remove(&p->node);
> + LIST_FOR_EACH (ip_dev, node, &addr_list) {
> + ipdev_map_delete(ip_dev, p->tp_port, p->nw_proto);
> + }
> + free(p);
> + break;
> }
> - free(p);
> - break;
> }
> }
> ovs_mutex_unlock(&mutex);
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index ab6f39bb2..af73923da 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3609,20 +3609,25 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx,
> struct eth_addr dmac,
> struct flow *base_flow, *flow;
> flow = &ctx->xin->flow;
> base_flow = &ctx->base_flow;
> - uint8_t nw_proto = 0;
> + uint8_t nw_protos[2] = {0};
> + int i;
>
> switch (tnl_type) {
> case OVS_VPORT_TYPE_GRE:
> case OVS_VPORT_TYPE_ERSPAN:
> case OVS_VPORT_TYPE_IP6ERSPAN:
> case OVS_VPORT_TYPE_IP6GRE:
> - nw_proto = IPPROTO_GRE;
> + nw_protos[0] = IPPROTO_GRE;
> break;
> case OVS_VPORT_TYPE_VXLAN:
> case OVS_VPORT_TYPE_GENEVE:
> case OVS_VPORT_TYPE_GTPU:
> case OVS_VPORT_TYPE_BAREUDP:
> - nw_proto = IPPROTO_UDP;
> + nw_protos[0] = IPPROTO_UDP;
> + break;
> + case OVS_VPORT_TYPE_SRV6:
> + nw_protos[0] = IPPROTO_IPIP;
> + nw_protos[1] = IPPROTO_IPV6;
> break;
> case OVS_VPORT_TYPE_LISP:
> case OVS_VPORT_TYPE_STT:
> @@ -3637,10 +3642,15 @@ propagate_tunnel_data_to_flow(struct xlate_ctx *ctx,
> struct eth_addr dmac,
> * Update base_flow first followed by flow as the dst_flow gets modified
> * in the function.
> */
> - propagate_tunnel_data_to_flow__(base_flow, flow, dmac, smac, s_ip6, s_ip,
> - is_tnl_ipv6, nw_proto);
> - propagate_tunnel_data_to_flow__(flow, flow, dmac, smac, s_ip6, s_ip,
> - is_tnl_ipv6, nw_proto);
> + for (i = 0; i < 2; i++) {
> + if (!nw_protos[i]) {
> + continue;
> + }
> + propagate_tunnel_data_to_flow__(base_flow, flow, dmac, smac, s_ip6,
> + s_ip, is_tnl_ipv6, nw_protos[i]);
> + propagate_tunnel_data_to_flow__(flow, flow, dmac, smac, s_ip6,
> + s_ip, is_tnl_ipv6, nw_protos[i]);
It doesn't seem to make sense to call this function twice with
different protocols as the second call will overwrite everything
that the first one set.
> + }
> }
>
> static int
> diff --git a/tests/tunnel-push-pop-ipv6.at b/tests/tunnel-push-pop-ipv6.at
> index c96b77cd1..249c3bf1e 100644
> --- a/tests/tunnel-push-pop-ipv6.at
> +++ b/tests/tunnel-push-pop-ipv6.at
Besides these "unit" tests it would also be great to have system
tests in tests/system-traffic.at that will show interoperability
between our and native kernel's implementation.
> @@ -202,6 +202,8 @@ AT_CHECK([ovs-vsctl add-port int-br t2 -- set Interface
> t2 type=vxlan \
> options:remote_ip=flow options:key=123
> ofport_request=5\
> -- add-port int-br t5 -- set Interface t5 type=gre \
> options:remote_ip=2001:cafe::92 options:key=455
> options:packet_type=legacy_l3 ofport_request=6\
> + -- add-port int-br t6 -- set Interface t6 type=srv6 \
> + options:remote_ip=2001:cafe::92 ofport_request=7\
> ], [0])
>
> AT_CHECK([ovs-appctl dpif/show], [0], [dnl
> @@ -216,12 +218,15 @@ dummy@ovs-dummy: hit:0 missed:0
> t3 4/4789: (vxlan: csum=true, out_key=flow, remote_ip=2001:cafe::93)
> t4 5/6081: (geneve: key=123, remote_ip=flow)
> t5 6/3: (gre: key=455, packet_type=legacy_l3, remote_ip=2001:cafe::92)
> + t6 7/6: (srv6: remote_ip=2001:cafe::92)
> ])
>
> AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl
> Listening ports:
> genev_sys_6081 (6081) ref_cnt=1
> gre_sys (3) ref_cnt=2
> +srv6_sys (6) ref_cnt=1
> +srv6_sys (6) ref_cnt=1
> vxlan_sys_4789 (4789) ref_cnt=2
> ])
>
> @@ -363,6 +368,8 @@ AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl
> Listening ports:
> genev_sys_6081 (6081) ref_cnt=1
> gre_sys (3) ref_cnt=2
> +srv6_sys (6) ref_cnt=1
> +srv6_sys (6) ref_cnt=1
> vxlan_sys_4789 (4789) ref_cnt=2
> ])
>
> @@ -384,6 +391,12 @@ AT_CHECK([tail -1 stdout], [0],
> [Datapath actions: tnl_pop(6081)
> ])
>
> +dnl Check SRv6 tunnel pop
> +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy
> 'in_port(1),eth(src=f8:bc:12:44:34:b6,dst=aa:55:aa:55:00:00),eth_type(0x86dd),ipv6(src=2001:cafe::92,dst=2001:cafe::88,label=0,proto=4,tclass=0x0,hlimit=64)'],
> [0], [stdout])
> +AT_CHECK([tail -1 stdout], [0],
> + [Datapath actions: tnl_pop(6)
> +])
> +
> dnl Check VXLAN tunnel push
> AT_CHECK([ovs-ofctl add-flow int-br action=2])
> AT_CHECK([ovs-appctl ofproto/trace ovs-dummy
> 'in_port(2),eth(src=f8:bc:12:44:34:b6,dst=aa:55:aa:55:00:01),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=47,tos=0,ttl=64,frag=no)'],
> [0], [stdout])
> @@ -405,6 +418,13 @@ AT_CHECK([tail -1 stdout], [0],
> [Datapath actions:
> tnl_push(tnl_port(3),header(size=62,type=109,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x86dd),ipv6(src=2001:cafe::88,dst=2001:cafe::92,label=0,proto=47,tclass=0x0,hlimit=64),gre((flags=0x2000,proto=0x6558),key=0x1c8)),out_port(100)),1
> ])
>
> +dnl Check SRv6 tunnel push
> +AT_CHECK([ovs-ofctl add-flow int-br action=7])
> +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy
> 'in_port(2),eth(src=f8:bc:12:44:34:b6,dst=aa:55:aa:55:00:01),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=47,tos=0,ttl=64,frag=no)'],
> [0], [stdout])
> +AT_CHECK([tail -1 stdout], [0],
> + [Datapath actions:
> pop_eth,tnl_push(tnl_port(6),header(size=78,type=112,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x86dd),ipv6(src=2001:cafe::88,dst=2001:cafe::92,label=0,proto=43,tclass=0x0,hlimit=64),srv6(segments_left=0,segs=2001:cafe::92)),out_port(100)),1
> +])
> +
> dnl Check Geneve tunnel push
> AT_CHECK([ovs-ofctl add-flow int-br
> "actions=set_field:2001:cafe::92->tun_ipv6_dst,5"])
> AT_CHECK([ovs-appctl ofproto/trace ovs-dummy
> 'in_port(2),eth(src=f8:bc:12:44:34:b6,dst=aa:55:aa:55:00:01),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=47,tos=0,ttl=64,frag=no)'],
> [0], [stdout])
> @@ -510,6 +530,8 @@ AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl
> Listening ports:
> genev_sys_6081 (6081) ref_cnt=1
> gre_sys (3) ref_cnt=1
> +srv6_sys (6) ref_cnt=1
> +srv6_sys (6) ref_cnt=1
> vxlan_sys_4789 (4789) ref_cnt=1
> vxlan_sys_4790 (4790) ref_cnt=1
> ])
> @@ -518,6 +540,7 @@ AT_CHECK([ovs-vsctl del-port int-br t1 \
> -- del-port int-br t2 \
> -- del-port int-br t4 \
> -- del-port int-br t5 \
> + -- del-port int-br t6 \
> ], [0])
>
> dnl Check tunnel lookup entries after deleting all remaining tunnel ports
> diff --git a/tests/tunnel.at b/tests/tunnel.at
> index 037b4c390..4964987f5 100644
> --- a/tests/tunnel.at
> +++ b/tests/tunnel.at
> @@ -1223,3 +1223,59 @@ AT_CHECK([ovs-vsctl add-port br0 p1 -- set int p1
> type=dummy])
> OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
> OVS_APP_EXIT_AND_WAIT([ovsdb-server])]
> AT_CLEANUP
> +
> +AT_SETUP([tunnel - SRV6 basic])
> +OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=dummy \
> + ofport_request=1 \
> + -- add-port br0 p2 -- set Interface p2 type=srv6 \
> + options:remote_ip=flow \
> + ofport_request=2])
> +OVS_VSWITCHD_DISABLE_TUNNEL_PUSH_POP
> +
> +dnl First setup dummy interface IP address, then add the route
> +dnl so that tnl-port table can get valid IP address for the device.
> +AT_CHECK([ovs-appctl netdev-dummy/ip6addr br0 fc00::1/64], [0], [OK
> +])
> +AT_CHECK([ovs-appctl ovs/route/add fc00::0/64 br0], [0], [OK
> +])
> +AT_CHECK([ovs-appctl ovs/route/show], [0], [dnl
> +Route Table:
> +User: fc00::/64 dev br0 SRC fc00::1
> +])
> +
> +AT_DATA([flows.txt], [dnl
> +in_port=1,actions=set_field:fc00::2->tun_ipv6_dst,output:2
> +in_port=2,actions=1
> +])
> +AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
> +
> +AT_CHECK([ovs-appctl dpif/show | tail -n +3], [0], [dnl
> + br0 65534/100: (dummy-internal)
> + p1 1/1: (dummy)
> + p2 2/6: (srv6: remote_ip=flow)
> +])
> +
> +AT_CHECK([ovs-appctl tnl/ports/show |sort], [0], [dnl
> +Listening ports:
> +srv6_sys (6) ref_cnt=1
> +srv6_sys (6) ref_cnt=1
> +])
> +
> +AT_CHECK([ovs-appctl ofproto/list-tunnels], [0], [dnl
> +port 6: p2 (srv6: ::->flow, key=0, legacy_l3, dp port=6, ttl=64)
> +])
> +
> +dnl Encap: ipv4 inner packet
> +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy
> 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=4,ttl=128,frag=no),tcp(src=8,dst=9)'],
> [0], [stdout])
> +AT_CHECK([tail -1 stdout], [0],
> + [Datapath actions: set(tunnel(ipv6_dst=fc00::2,ttl=64,flags(df))),pop_eth,6
> +])
> +
> +dnl Encap: ipv6 inner packet
> +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy
> 'in_port(1),eth(src=50:54:00:00:00:05,dst=50:54:00:00:00:07),eth_type(0x86dd),ipv6(src=2001:cafe::92,dst=2001:cafe::88,label=0,proto=47,tclass=0x0,hlimit=64)'],
> [0], [stdout])
> +AT_CHECK([tail -1 stdout], [0],
> + [Datapath actions: set(tunnel(ipv6_dst=fc00::2,ttl=64,flags(df))),pop_eth,6
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP