Re: [ovs-dev] [PATCH v3 6/6] netdev-dpdk: Refactor tunnel checksum offloading.
On Wed, May 15, 2024 at 2:11 PM Kevin Traynor wrote:
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 1dad2ef833..31dd6f1d5a 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2584,6 +2584,9 @@ static bool
> >  netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >  {
> >      struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
> > +    void *l2;
> > +    void *l3;
> > +    void *l4;
> >
> >      const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
> >                                           RTE_MBUF_F_TX_L4_MASK |
> > @@ -2613,11 +2616,6 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >          return true;
> >      }
> >
> > -    ovs_assert(dp_packet_l4(pkt));
> > -
> > -    /* If packet is vxlan or geneve tunnel packet, calculate outer
> > -     * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
> > -     * before. */
> >      const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
> >      if (OVS_UNLIKELY(tunnel_type &&
> >                       tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE &&
> > @@ -2635,6 +2633,11 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >                                   (char *) dp_packet_eth(pkt);
> >              mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
> >                                   (char *) dp_packet_l3(pkt);
> > +
> > +            /* Inner L2 length must account for the tunnel header length. */
> > +            l2 = dp_packet_l4(pkt);
>
> Code looks ok to me, but it's tricky and the L2 settings with inner
> requests are a bit unintuitive without a notepad and thinking from the
> driver perspective backwards. Not sure there is much can be done to
> mitigate that here, other than the comment you added.

Unfortunately, I don't have a better idea.
It was already unintuitive before this patch, but to make it worse, the
logic was split across lib/netdev-dpdk.c and lib/netdev-native-tnl.c.

For example, this comment in dp_packet_tnl_ol_process() is DPDK specific:

    /* Attention please, tunnel inner l2 len is consist of udp header
     * len and tunnel header len and inner l2 len. */

>
> Did you manage to test to confirm they're working as expected ?

In general, I tested the series with CX6, E810 and ixgbe, with ipv4
traffic, ipv4 traffic tunneled in ipv4/ipv6 vxlan and ipv4 traffic
tunneled in ipv4/ipv6 geneve.
But I am not sure I covered every possible combination.

Specifically for the case you point at (outer + inner offloads), I tested
CX6 with IPv4/IPv6 VxLAN and Geneve (for which I have traces in my bash
history).
With E810, I remember testing the same with the DPDK fixes, but I don't
have a trace of it.
I'll double check before sending the next revision.

>
> > +            l3 = dp_packet_inner_l3(pkt);
> > +            l4 = dp_packet_inner_l4(pkt);
>
> see below
>
> >          } else {
> >              /* If no outer offloading is requested, clear outer marks. */
> >              mbuf->ol_flags &= ~all_outer_marks;
> > @@ -2642,8 +2645,9 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >              mbuf->outer_l3_len = 0;
> >
> >              /* Skip outer headers. */
> > -            mbuf->l2_len += (char *) dp_packet_l4(pkt) -
> > -                            (char *) dp_packet_eth(pkt);
> > +            l2 = dp_packet_eth(pkt);
> > +            l3 = dp_packet_inner_l3(pkt);
> > +            l4 = dp_packet_inner_l4(pkt);
>
> You could move these outside the inner (pardon the pun) if else, but I
> could understand if you prefer to set l2/l3/l4 together for better
> readability ?

Well, as you noted, this code is not trivial.
I preferred to have all 3 pointers grouped, with a comment relating to
the group.

-- 
David Marchand

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
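The l2/l3/l4 pointer selection discussed in this thread can be checked on paper with a small standalone sketch. Everything below is a hypothetical stand-in and not OVS or DPDK code: struct pkt_offsets mimics the offsets tracked in struct dp_packet (counted from the start of the Ethernet header), struct hdr_lens mimics the mbuf length fields, and compute_hdr_lens() is an illustrative name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the offsets tracked in struct dp_packet,
 * counted from the start of the (outer) Ethernet header. */
struct pkt_offsets {
    uint16_t l3_ofs;        /* Start of the (outer) L3 header. */
    uint16_t l4_ofs;        /* Start of the (outer) L4 header. */
    uint16_t inner_l3_ofs;  /* Start of the inner L3 header. */
    uint16_t inner_l4_ofs;  /* Start of the inner L4 header. */
};

/* Hypothetical stand-in for the length fields of struct rte_mbuf. */
struct hdr_lens {
    uint16_t outer_l2_len;
    uint16_t outer_l3_len;
    uint16_t l2_len;
    uint16_t l3_len;
};

/* Pick the l2/l3/l4 positions per case, then derive the lengths from them,
 * following the grouping used in the patch. */
static struct hdr_lens
compute_hdr_lens(const struct pkt_offsets *o, bool tunnel_inner_offload,
                 bool outer_offload)
{
    struct hdr_lens h = { 0, 0, 0, 0 };
    uint16_t l2, l3, l4;

    if (tunnel_inner_offload && outer_offload) {
        h.outer_l2_len = o->l3_ofs;
        h.outer_l3_len = o->l4_ofs - o->l3_ofs;
        /* "Inner L2" starts at the outer L4 header, so it covers the UDP
         * header, the tunnel header and the inner Ethernet header. */
        l2 = o->l4_ofs;
        l3 = o->inner_l3_ofs;
        l4 = o->inner_l4_ofs;
    } else if (tunnel_inner_offload) {
        /* No outer offload: all outer headers are folded into l2_len. */
        l2 = 0;
        l3 = o->inner_l3_ofs;
        l4 = o->inner_l4_ofs;
    } else {
        /* Plain (non tunnel) offload. */
        l2 = 0;
        l3 = o->l3_ofs;
        l4 = o->l4_ofs;
    }

    h.l2_len = l3 - l2;
    h.l3_len = l4 - l3;
    return h;
}
```

For an IPv4 VXLAN packet without VLAN (Ethernet 14, IPv4 20, UDP 8, VXLAN 8, inner Ethernet 14, inner IPv4 20), the offsets are 14, 34, 64 and 84. With outer offloads this gives outer_l2_len=14, outer_l3_len=20, l2_len=30 and l3_len=20; without outer offloads l2_len becomes 64.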
Re: [ovs-dev] [PATCH v3 5/6] netdev-dpdk: Use guest TSO segmentation size hint.
On Wed, May 15, 2024 at 2:09 PM Kevin Traynor wrote:
>
> On 19/04/2024 15:06, David Marchand wrote:
> > In a typical setup like:
> > guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B
> >
> > TSO packets from guest A are segmented against the OVS A physical port
> > mtu adjusted by the vxlan tunnel header size, regardless of guest A
> > interface mtu.
> >
> > As an example, let's say guest A and guest B mtu are set to 1500 bytes.
> > OVS A and OVS B physical ports mtu are set to 1600 bytes.
> > Guest A will request TCP segmentation for 1448 bytes segments.
> > On the other hand, OVS A will request 1498 bytes segments to the HW.
> > This results in OVS B dropping packets because decapsulated packets
> > are larger than the vhost-user port (serving guest B) mtu.
> >
> > 2024-04-17T14:13:01.239Z|2|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
> > Too big size 1564 max_packet_len 1518
> >
> > vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
> > Use it as a hint.
> >
> > This may result in segments (on the wire) slightly shorter than the
> > optimal size.
> >
> > Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
> > Signed-off-by: David Marchand
> > ---
> > Note:
> > As we trust the guest with this change, should we put a lower limit on
> > mbuf->tso_segsz?
> >
>
> There are some checks I looked at (e.g [0]), but it could be checked
> here for an earlier drop i suppose...additional comment below
>
> [0]
> https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3754

I guess you meant
https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3818

And same in v23.11, there are checks at the prepare stage:
https://git.dpdk.org/dpdk-stable/tree/drivers/net/ice/ice_rxtx.c?h=23.11#n3681

I had forgotten about those checks.
There is no limit exposed per driver from DPDK, so the simplest option
for OVS is to trust them.

>
> > ---
> >  lib/netdev-dpdk.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 661269e4b6..1dad2ef833 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >
> >      if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
> >          struct tcp_header *th = dp_packet_l4(pkt);
> > +        uint16_t link_tso_segsz;
> >          int hdr_len;
> >
> >          if (tunnel_type) {
> > -            mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > -                              mbuf->l4_len - mbuf->outer_l3_len;
> > +            link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > +                             mbuf->l4_len - mbuf->outer_l3_len;
> >          } else {
> >              mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
> > -            mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +            link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +        }
> > +
> > +        if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {
>
> It seems like something is not right if the flag is set but tso_segsz is
> 0. It is set by vhost lib from gso_size, but I don't see a validation
> there either.

At the time I added a check on the 0 value, I thought there was a case
where RTE_MBUF_F_TX_TCP_SEG could be set with no segsz value.
But as you mention, all setters of this flag (either in vhost or in OVS)
set a segsz too.

So with segsz always set, combined with the drivers' checks, OVS probably
does not need any check on tso_segsz.
I intend to remove this check in the next revision.

-- 
David Marchand
Re: [ovs-dev] [PATCH v3 4/6] netdev-dpdk: Refactor TSO request code.
Hello Kevin,

Thanks for reviewing.

On Fri, May 10, 2024 at 11:50 PM Kevin Traynor wrote:
>
> On 19/04/2024 15:06, David Marchand wrote:
> > Replace check on th == NULL with an assert() because dp_packet_l4(pkt)
> > is priorly used to compute (outer) L3 length.
> >
> > Besides, filling l4_len and tso_segsz only matters to TSO, so there is
> > no need to check for other L4 checksum offloading requests.
> >
> > Signed-off-by: David Marchand
> > ---
> >  lib/netdev-dpdk.c | 36 +++++++++++-------------------------
> >  1 file changed, 11 insertions(+), 25 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 8b6a3ed189..661269e4b6 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2584,7 +2584,6 @@ static bool
> >  netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >  {
> >      struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
> > -    struct tcp_header *th;
> >
> >      const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
> >                                           RTE_MBUF_F_TX_L4_MASK |
> > @@ -2614,6 +2613,8 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >          return true;
> >      }
> >
> > +    ovs_assert(dp_packet_l4(pkt));
>
> I'm not clear why you want to change this from a warning/return
> fail/drop to an assert ?

From this point in the function, there is at least one request for
checksum offloading pending.
Any L3 (or higher) checksum requested by OVS means that the packet has
been parsed/composed as either IP or IPv6 and packet->l4_ofs was set to
point after the l3 header (with miniflow_extract / *_compose() helpers).

So getting a NULL pointer for l4 here indicates a bug in OVS.
An assert seems better than a warn/return that probably nobody notice(d).

Did I miss a case where l4_ofs can be unset?

>
> Nit: should this be in the previous patch instead ? and I see it is
> removed in a later patch.

It is not supposed to be removed in the series.
The last patch moves it later in the function.

-- 
David Marchand
[ovs-dev] [PATCH] ci: Set platform parameter when building DPDK.
This change has no impact, since -Dmachine=default gets converted by DPDK
into -Dplatform=generic (since v21.08, see the link to DPDK commit below).

Yet, switch to explicitly setting -Dplatform and avoid the following
warning:

2024-04-18T14:50:16.8001092Z config/meson.build:113: WARNING: The
"machine" option is deprecated. Please use "cpu_instruction_set" instead.

While at it, solve another warning and call explicitly meson setup.

2024-04-18T14:50:17.0770596Z WARNING: Running the setup command as
`meson [options]` instead of `meson setup [options]` is ambiguous and
deprecated.

Link: https://git.dpdk.org/dpdk/commit/?id=bf66003b51ec
Signed-off-by: David Marchand
---
 .ci/dpdk-build.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index 23f3166a54..e1b8e3ccbb 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -25,9 +25,9 @@ function build_dpdk()
         pushd dpdk-src
     fi
-    # Switching to 'default' machine to make the dpdk cache usable on
+    # Switching to 'generic' platform to make the dpdk cache usable on
     # different CPUs. We can't be sure that all CI machines are exactly same.
-    DPDK_OPTS="$DPDK_OPTS -Dmachine=default"
+    DPDK_OPTS="$DPDK_OPTS -Dplatform=generic"
     # Disable building DPDK unit tests. Not needed for OVS build or tests.
     DPDK_OPTS="$DPDK_OPTS -Dtests=false"
@@ -49,7 +49,7 @@ function build_dpdk()
     # Install DPDK using prefix.
     DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR"
-    meson $DPDK_OPTS build
+    meson setup $DPDK_OPTS build
     ninja -C build
     ninja -C build install
     popd
-- 
2.44.0
[ovs-dev] [PATCH v3 6/6] netdev-dpdk: Refactor tunnel checksum offloading.
All information required for checksum offloading can be deduced from the
already tracked dp_packet l3_ofs, l4_ofs, inner_l3_ofs and inner_l4_ofs
fields.
Remove DPDK specific l[2-4]_len from generic OVS code.

netdev-dpdk code then fills mbuf specifics step by step:
- outer_l2_len and outer_l3_len are needed for tunneling (and below
  features),
- l2_len and l3_len are needed for IP and L4 checksum (and below features),
- l4_len and tso_segsz are needed when doing TSO.

Signed-off-by: David Marchand
---
 lib/dp-packet.h | 37 --
 lib/netdev-dpdk.c | 35 ++---
 lib/netdev-native-tnl.c | 50 +
 3 files changed, 27 insertions(+), 95 deletions(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 3622764c47..a75b1c5cdb 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -604,25 +604,6 @@ dp_packet_get_nd_payload(const struct dp_packet *b)
 }
 #ifdef DPDK_NETDEV
-static inline void
-dp_packet_set_l2_len(struct dp_packet *b, size_t l2_len)
-{
-    b->mbuf.l2_len = l2_len;
-}
-
-static inline void
-dp_packet_set_l3_len(struct dp_packet *b, size_t l3_len)
-{
-    b->mbuf.l3_len = l3_len;
-}
-
-static inline void
-dp_packet_set_l4_len(struct dp_packet *b, size_t l4_len)
-{
-    b->mbuf.l4_len = l4_len;
-}
-
-
 static inline uint64_t *
 dp_packet_ol_flags_ptr(const struct dp_packet *b)
 {
@@ -642,24 +623,6 @@ dp_packet_flow_mark_ptr(const struct dp_packet *b)
 }
 #else
-static inline void
-dp_packet_set_l2_len(struct dp_packet *b OVS_UNUSED, size_t l2_len OVS_UNUSED)
-{
-    /* There is no implementation. */
-}
-
-static inline void
-dp_packet_set_l3_len(struct dp_packet *b OVS_UNUSED, size_t l3_len OVS_UNUSED)
-{
-    /* There is no implementation. */
-}
-
-static inline void
-dp_packet_set_l4_len(struct dp_packet *b OVS_UNUSED, size_t l4_len OVS_UNUSED)
-{
-    /* There is no implementation. */
-}
-
 static inline uint32_t *
 dp_packet_ol_flags_ptr(const struct dp_packet *b)
 {
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 1dad2ef833..31dd6f1d5a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2584,6 +2584,9 @@ static bool
 netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 {
     struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
+    void *l2;
+    void *l3;
+    void *l4;

     const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
                                          RTE_MBUF_F_TX_L4_MASK |
@@ -2613,11 +2616,6 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
         return true;
     }

-    ovs_assert(dp_packet_l4(pkt));
-
-    /* If packet is vxlan or geneve tunnel packet, calculate outer
-     * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
-     * before. */
     const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
     if (OVS_UNLIKELY(tunnel_type &&
                      tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE &&
@@ -2635,6 +2633,11 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
                                  (char *) dp_packet_eth(pkt);
             mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
                                  (char *) dp_packet_l3(pkt);
+
+            /* Inner L2 length must account for the tunnel header length. */
+            l2 = dp_packet_l4(pkt);
+            l3 = dp_packet_inner_l3(pkt);
+            l4 = dp_packet_inner_l4(pkt);
         } else {
             /* If no outer offloading is requested, clear outer marks. */
             mbuf->ol_flags &= ~all_outer_marks;
@@ -2642,8 +2645,9 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
             mbuf->outer_l3_len = 0;

             /* Skip outer headers. */
-            mbuf->l2_len += (char *) dp_packet_l4(pkt) -
-                            (char *) dp_packet_eth(pkt);
+            l2 = dp_packet_eth(pkt);
+            l3 = dp_packet_inner_l3(pkt);
+            l4 = dp_packet_inner_l4(pkt);
         }
     } else {
         if (tunnel_type) {
@@ -2663,22 +2667,27 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
         }
         mbuf->outer_l2_len = 0;
         mbuf->outer_l3_len = 0;
-        mbuf->l2_len = (char *) dp_packet_l3(pkt) -
-                       (char *) dp_packet_eth(pkt);
-        mbuf->l3_len = (char *) dp_packet_l4(pkt) -
-                       (char *) dp_packet_l3(pkt);
+
+        l2 = dp_packet_eth(pkt);
+        l3 = dp_packet_l3(pkt);
+        l4 = dp_packet_l4(pkt);
     }

+    ovs_assert(l4);
+
+    mbuf->l2_len = (char *) l3 - (char *) l2;
+    mbuf->l3_len = (char *) l4 - (char *) l3;
+
     if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
-        struct tcp_header *th = dp_packet_l4(pkt);
+        struct tcp_header *th = l4;
         uint16_t link_tso_segsz;
[ovs-dev] [PATCH v3 5/6] netdev-dpdk: Use guest TSO segmentation size hint.
In a typical setup like:
guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B

TSO packets from guest A are segmented against the OVS A physical port
mtu adjusted by the vxlan tunnel header size, regardless of guest A
interface mtu.

As an example, let's say guest A and guest B mtu are set to 1500 bytes.
OVS A and OVS B physical ports mtu are set to 1600 bytes.
Guest A will request TCP segmentation for 1448 bytes segments.
On the other hand, OVS A will request 1498 bytes segments to the HW.
This results in OVS B dropping packets because decapsulated packets
are larger than the vhost-user port (serving guest B) mtu.

2024-04-17T14:13:01.239Z|2|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
Too big size 1564 max_packet_len 1518

vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
Use it as a hint.

This may result in segments (on the wire) slightly shorter than the
optimal size.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
Signed-off-by: David Marchand
---
Note:
As we trust the guest with this change, should we put a lower limit on
mbuf->tso_segsz?

---
 lib/netdev-dpdk.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 661269e4b6..1dad2ef833 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)

     if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
         struct tcp_header *th = dp_packet_l4(pkt);
+        uint16_t link_tso_segsz;
         int hdr_len;

         if (tunnel_type) {
-            mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
-                              mbuf->l4_len - mbuf->outer_l3_len;
+            link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
+                             mbuf->l4_len - mbuf->outer_l3_len;
         } else {
             mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
-            mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
+            link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
+        }
+
+        if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {
+            mbuf->tso_segsz = link_tso_segsz;
         }

         hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
-- 
2.44.0
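The selection between the guest hint and the link-derived size can be looked at in isolation. The sketch below is a standalone illustration, not the OVS code: pick_tso_segsz() is a hypothetical helper name, guest_segsz stands for the value found in mbuf->tso_segsz and link_segsz for the value computed from the port mtu. It mirrors the condition used in this v3 patch, including the check on a zero hint.

```c
#include <stdint.h>

/* Hypothetical helper: 'guest_segsz' stands for the hint carried in
 * mbuf->tso_segsz, 'link_segsz' for the size derived from the port mtu. */
static uint16_t
pick_tso_segsz(uint16_t guest_segsz, uint16_t link_segsz)
{
    /* Fall back to the link value when there is no hint, or when the hint
     * would produce segments too large for the link. */
    if (!guest_segsz || guest_segsz > link_segsz) {
        return link_segsz;
    }
    return guest_segsz;
}
```

With the numbers from the commit log, pick_tso_segsz(1448, 1498) keeps the guest value of 1448, so decapsulated segments still fit the mtu of guest B.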
[ovs-dev] [PATCH v3 4/6] netdev-dpdk: Refactor TSO request code.
Replace check on th == NULL with an assert() because dp_packet_l4(pkt)
is used earlier to compute (outer) L3 length.

Besides, filling l4_len and tso_segsz only matters to TSO, so there is
no need to check for other L4 checksum offloading requests.

Signed-off-by: David Marchand
---
 lib/netdev-dpdk.c | 36 +++++++++++-------------------------
 1 file changed, 11 insertions(+), 25 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8b6a3ed189..661269e4b6 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2584,7 +2584,6 @@ static bool
 netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 {
     struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
-    struct tcp_header *th;

     const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
                                          RTE_MBUF_F_TX_L4_MASK |
@@ -2614,6 +2613,8 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
         return true;
     }

+    ovs_assert(dp_packet_l4(pkt));
+
     /* If packet is vxlan or geneve tunnel packet, calculate outer
      * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
      * before. */
@@ -2667,22 +2668,10 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
         mbuf->l3_len = (char *) dp_packet_l4(pkt) -
                        (char *) dp_packet_l3(pkt);
     }
-    th = dp_packet_l4(pkt);

     if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
-        if (!th) {
-            VLOG_WARN_RL(&rl, "%s: TCP Segmentation without L4 header"
-                         " pkt len: %"PRIu32"", dev->up.name, mbuf->pkt_len);
-            return false;
-        }
-    }
-
-    if ((mbuf->ol_flags & RTE_MBUF_F_TX_L4_MASK) == RTE_MBUF_F_TX_TCP_CKSUM) {
-        if (!th) {
-            VLOG_WARN_RL(&rl, "%s: TCP offloading without L4 header"
-                         " pkt len: %"PRIu32"", dev->up.name, mbuf->pkt_len);
-            return false;
-        }
+        struct tcp_header *th = dp_packet_l4(pkt);
+        int hdr_len;

         if (tunnel_type) {
             mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
@@ -2692,16 +2681,13 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
             mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
         }

-        if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
-            int hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
-            if (OVS_UNLIKELY((hdr_len +
-                              mbuf->tso_segsz) > dev->max_packet_len)) {
-                VLOG_WARN_RL(&rl, "%s: Oversized TSO packet. hdr: %"PRIu32", "
-                             "gso: %"PRIu32", max len: %"PRIu32"",
-                             dev->up.name, hdr_len, mbuf->tso_segsz,
-                             dev->max_packet_len);
-                return false;
-            }
+        hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
+        if (OVS_UNLIKELY((hdr_len + mbuf->tso_segsz) > dev->max_packet_len)) {
+            VLOG_WARN_RL(&rl, "%s: Oversized TSO packet. hdr: %"PRIu32", "
+                         "gso: %"PRIu32", max len: %"PRIu32"",
+                         dev->up.name, hdr_len, mbuf->tso_segsz,
+                         dev->max_packet_len);
+            return false;
         }
     }
-- 
2.44.0
[ovs-dev] [PATCH v3 3/6] netdev-dpdk: Fix inner checksum when outer is not supported.
If outer checksum is not supported and OVS already set L3/L4 outer
checksums in the packet, no outer mark should be left in ol_flags
(as it confuses some drivers, like net/ixgbe).

l2_len must be adjusted to account for the tunnel header.

Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: David Marchand
---
 lib/netdev-dpdk.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f732716141..8b6a3ed189 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2629,10 +2629,21 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
     }

     if (tunnel_type && (mbuf->ol_flags & all_inner_requests)) {
-        mbuf->outer_l2_len = (char *) dp_packet_l3(pkt) -
-                             (char *) dp_packet_eth(pkt);
-        mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
-                             (char *) dp_packet_l3(pkt);
+        if (mbuf->ol_flags & all_outer_requests) {
+            mbuf->outer_l2_len = (char *) dp_packet_l3(pkt) -
+                                 (char *) dp_packet_eth(pkt);
+            mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
+                                 (char *) dp_packet_l3(pkt);
+        } else {
+            /* If no outer offloading is requested, clear outer marks. */
+            mbuf->ol_flags &= ~all_outer_marks;
+            mbuf->outer_l2_len = 0;
+            mbuf->outer_l3_len = 0;
+
+            /* Skip outer headers. */
+            mbuf->l2_len += (char *) dp_packet_l4(pkt) -
+                            (char *) dp_packet_eth(pkt);
+        }
     } else {
         if (tunnel_type) {
             /* No inner offload is requested, fallback to non tunnel
-- 
2.44.0
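The flag handling in this fix can be reduced to a few bit operations. The sketch below is only an illustration: the X_* macros are placeholder bits invented for this example (they are not the RTE_MBUF_F_TX_* values from DPDK), and drop_unused_outer_marks() is a hypothetical name. It shows the rule stated in the commit log: for a tunneled packet with inner requests but no outer request, no outer mark is left in the flags.

```c
#include <stdint.h>

/* Placeholder bits for illustration only; real code would use the
 * RTE_MBUF_F_TX_* flags provided by DPDK. */
#define X_INNER_REQ   UINT64_C(0x01)  /* Any inner checksum/TSO request. */
#define X_OUTER_REQ   UINT64_C(0x02)  /* Any outer checksum request. */
#define X_INNER_MARK  UINT64_C(0x04)  /* Inner IPv4/IPv6 mark. */
#define X_OUTER_MARK  UINT64_C(0x08)  /* Outer IPv4/IPv6 mark. */
#define X_TUNNEL      UINT64_C(0x10)  /* Tunnel type is set. */

#define X_ALL_OUTER_MARKS (X_OUTER_MARK | X_TUNNEL)

/* Hypothetical helper: returns the flags with outer marks removed when the
 * packet is tunneled, has an inner request and has no outer request. */
static uint64_t
drop_unused_outer_marks(uint64_t ol_flags)
{
    if ((ol_flags & X_TUNNEL) && (ol_flags & X_INNER_REQ)
        && !(ol_flags & X_OUTER_REQ)) {
        ol_flags &= ~X_ALL_OUTER_MARKS;
    }
    return ol_flags;
}
```

Once the outer marks are gone, the driver sees a plain packet, which is why the patch also folds the outer headers into l2_len.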
[ovs-dev] [PATCH v3 1/6] netdev-dpdk: Fallback to non tunnel checksum offloading.
The outer checksum offloading API in DPDK is ambiguous and was implemented
by Intel folks in their drivers with the assumption that any outer
offloading always goes with an inner offloading request.

With net/i40e and net/ice drivers, in the case of encapsulating an ARP
packet in a vxlan tunnel (which results in requesting outer ip checksum
with a tunnel context but no inner offloading request), a Tx failure is
triggered, associated with a port MDD event.

2024-03-27T16:02:07.084Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR:
MDD event

To avoid this situation, if no checksum or segmentation offloading is
requested on the inner part of a packet, fall back to a "normal" (non
outer) offloading request.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Fixes: f81d782c1906 ("netdev-native-tnl: Mark all vxlan/geneve packets as tunneled.")
Signed-off-by: David Marchand
---
Changes since v2:
- kept offloads disabled for net/i40e and net/ice as this patch does not
  fix outer udp checksum (a DPDK fix is required),
- updated commitlog with details to reproduce the issue,
- adjusted indent,

Changes since v1:
- reset inner marks before converting outer requests,
- fixed some coding style,

---
 lib/netdev-dpdk.c | 71 +++
 1 file changed, 41 insertions(+), 30 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2111f77681..7e109903c0 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2584,16 +2584,18 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
     struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
     struct tcp_header *th;

-    const uint64_t all_requests = (RTE_MBUF_F_TX_IP_CKSUM |
-                                   RTE_MBUF_F_TX_L4_MASK |
-                                   RTE_MBUF_F_TX_OUTER_IP_CKSUM |
-                                   RTE_MBUF_F_TX_OUTER_UDP_CKSUM |
-                                   RTE_MBUF_F_TX_TCP_SEG);
-    const uint64_t all_marks = (RTE_MBUF_F_TX_IPV4 |
-                                RTE_MBUF_F_TX_IPV6 |
-                                RTE_MBUF_F_TX_OUTER_IPV4 |
-                                RTE_MBUF_F_TX_OUTER_IPV6 |
-                                RTE_MBUF_F_TX_TUNNEL_MASK);
+    const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
+                                         RTE_MBUF_F_TX_L4_MASK |
+                                         RTE_MBUF_F_TX_TCP_SEG);
+    const uint64_t all_outer_requests = (RTE_MBUF_F_TX_OUTER_IP_CKSUM |
+                                         RTE_MBUF_F_TX_OUTER_UDP_CKSUM);
+    const uint64_t all_requests = all_inner_requests | all_outer_requests;
+    const uint64_t all_inner_marks = (RTE_MBUF_F_TX_IPV4 |
+                                      RTE_MBUF_F_TX_IPV6);
+    const uint64_t all_outer_marks = (RTE_MBUF_F_TX_OUTER_IPV4 |
+                                      RTE_MBUF_F_TX_OUTER_IPV6 |
+                                      RTE_MBUF_F_TX_TUNNEL_MASK);
+    const uint64_t all_marks = all_inner_marks | all_outer_marks;

     if (!(mbuf->ol_flags & all_requests)) {
         /* No offloads requested, no marks should be set. */
@@ -2614,34 +2616,43 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
      * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
      * before. */
     const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
-    if (tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
-        tunnel_type == RTE_MBUF_F_TX_TUNNEL_VXLAN) {
-        mbuf->outer_l2_len = (char *) dp_packet_l3(pkt) -
-                             (char *) dp_packet_eth(pkt);
-        mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
-                             (char *) dp_packet_l3(pkt);
-
-        /* If neither inner checksums nor TSO is requested, inner marks
-         * should not be set. */
-        if (!(mbuf->ol_flags & (RTE_MBUF_F_TX_IP_CKSUM |
-                                RTE_MBUF_F_TX_L4_MASK |
-                                RTE_MBUF_F_TX_TCP_SEG))) {
-            mbuf->ol_flags &= ~(RTE_MBUF_F_TX_IPV4 |
-                                RTE_MBUF_F_TX_IPV6);
-        }
-    } else if (OVS_UNLIKELY(tunnel_type)) {
+    if (OVS_UNLIKELY(tunnel_type &&
+                     tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE &&
+                     tunnel_type != RTE_MBUF_F_TX_TUNNEL_VXLAN)) {
         VLOG_WARN_RL(&rl, "%s: Unexpected tunnel type: %#"PRIx64,
                      netdev_get_name(&dev->up), tunnel_type);
         netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
                               "Packet with unexpected tunnel type", mbuf);
         return false;
+    }
+
+    if (tunnel_type && (mbuf->ol_flags & all_inner_requests)) {
+        mbuf->outer_l2_len =
[ovs-dev] [PATCH v3 2/6] netdev-dpdk: Disable outer UDP checksum for net/iavf.
Same as the commit 6f93d8e62f13 ("netdev-dpdk: Disable outer UDP checksum
offload for ice/i40e driver."), disable outer UDP checksum and related
offloads for net/iavf.

Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: David Marchand
---
Note:
- DPDK (in progress) fixes can be found at:
  https://patchwork.dpdk.org/project/dpdk/list/?series=31780&state=*

---
 lib/netdev-dpdk.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 7e109903c0..f732716141 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1355,12 +1355,14 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
     }

     if (!strcmp(info.driver_name, "net_ice")
-        || !strcmp(info.driver_name, "net_i40e")) {
+        || !strcmp(info.driver_name, "net_i40e")
+        || !strcmp(info.driver_name, "net_iavf")) {
         /* FIXME: Driver advertises the capability but doesn't seem
          * to actually support it correctly. Can remove this once
          * the driver is fixed on DPDK side. */
         VLOG_INFO("%s: disabled Tx outer udp checksum offloads for a "
-                  "net/ice or net/i40e port.", netdev_get_name(&dev->up));
+                  "net/ice, net/i40e or net/iavf port.",
+                  netdev_get_name(&dev->up));
         info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM;
         info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO;
         info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
-- 
2.44.0
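As the list of affected drivers grows, the chain of strcmp() calls could also be expressed as a table lookup. This is only a sketch of an alternative and not what the patch does: driver_needs_outer_udp_workaround() is a hypothetical helper, and the table holds the three driver names that appear in the diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tells whether outer UDP checksum (and the related
 * tunnel TSO capabilities) should be masked for a given driver name. */
static bool
driver_needs_outer_udp_workaround(const char *driver_name)
{
    /* Driver names taken from the patch. */
    static const char *const affected[] = {
        "net_ice", "net_i40e", "net_iavf",
    };

    for (size_t i = 0; i < sizeof affected / sizeof affected[0]; i++) {
        if (!strcmp(driver_name, affected[i])) {
            return true;
        }
    }
    return false;
}
```

The caller would then clear the three tx_offload_capa bits shown in the diff when the helper returns true.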
Re: [ovs-dev] [PATCH v1] nedev-dpdk: Fix config with dpdk net_bonding offloads.
Hello,

On Fri, Apr 12, 2024 at 8:30 AM Jun Wang wrote:
>
> If it's a DPDK net_bonding, it may cause
> offload-related configurations to take effect,
> leading to offload failure.

I have not looked at the patch yet.
What is the benefit of using a net/bonding DPDK port when there is
native support of bonding in OVS?

I am not familiar with OVN setups so maybe I am missing something on
this side.

-- 
David Marchand
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Fallback to non tunnel offloading API.
On Fri, Apr 5, 2024 at 3:00 PM Ilya Maximets wrote:
> >>
> >>> Basically, resolving a neighbor with ARP inside tunnels is broken on
> >>> Intel nics (even without re-enabling outer udp checksum).
> >>> This can be observed with the following debug patch at the end of
> >>> netdev_dpdk_prep_hwol_packet():
> >>>
> >>> +char buf[256];
> >>> +if (rte_get_tx_ol_flag_list(mbuf->ol_flags, buf, sizeof(buf)) != 0)
> >>> +    buf[0] = '\0';
> >>> +VLOG_WARN("len=%u, ol_flags=%s, outer_l2_len=%u, outer_l3_len=%u,
> >>> l2_len=%u, l3_len=%u, l4_len=%u, tso_segsz=%u", mbuf->pkt_len, buf,
> >>> mbuf->outer_l2_len, mbuf->outer_l3_len, mbuf->l2_len, mbuf->l3_len,
> >>> mbuf->l4_len, mbuf->tso_segsz);
> >>>
> >>> Then doing a "arping" inside the tunnel triggers:
> >>> 2024-04-03T16:05:40.920Z|00014|netdev_dpdk(pmd-c03/id:8)|WARN|len=96,
> >>> ol_flags=RTE_MBUF_F_TX_L4_NO_CKSUM RTE_MBUF_F_TX_OUTER_IP_CKSUM
> >>> RTE_MBUF_F_TX_OUTER_IPV4 RTE_MBUF_F_TX_TUNNEL_VXLAN , outer_l2_len=18,
> >>> outer_l3_len=20, l2_len=0, l3_len=0, l4_len=0, tso_segsz=0
>
> The fact that l2_len and l3_len are not set here looks like an OVS
> bug though, as AFAIU, these should always be set if any Tx offload
> is requested.

The commit that introduces such Tx offloads requests is:
f81d782c19 - netdev-native-tnl: Mark all vxlan/geneve packets as
tunneled. (7 weeks ago)

-- 
David Marchand
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Fallback to non tunnel offloading API.
On Wed, Apr 3, 2024 at 8:13 PM Ilya Maximets wrote:
> > - This patch fixes some misusage of the DPDK API.
>
> Hmm, I understand that the driver does something funny when it gets
> outer flags set without any inner flags, but how is that a misuse
> of the DPDK API? Could you point me to the API docs that say that
> inner flags must always be set in this case or that setting only
> outer offloads is not allowed?

Setting the tunnel type (which is set along outer checksum in OVS) is
described as:

/**
 * Bits 45:48 used for the tunnel type.
 * The tunnel type must be specified for TSO or checksum on the inner part
 * of tunnel packets.
 * These flags can be used with RTE_MBUF_F_TX_TCP_SEG for TSO, or
 * RTE_MBUF_F_TX_xxx_CKSUM.
 * The mbuf fields for inner and outer header lengths are required:
 * outer_l2_len, outer_l3_len, l2_len, l3_len, l4_len and tso_segsz for TSO.
 */
#define RTE_MBUF_F_TX_TUNNEL_VXLAN   (0x1ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_GRE     (0x2ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_IPIP    (0x3ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_GENEVE  (0x4ULL << 45)
/** TX packet with MPLS-in-UDP RFC 7510 header. */
#define RTE_MBUF_F_TX_TUNNEL_MPLSINUDP (0x5ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_VXLAN_GPE (0x6ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_GTP       (0x7ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_ESP       (0x8ULL << 45)

It is not specified what to expect if neither TSO nor inner checksum is
requested.
In the same way, it is not described what to expect if the outer API is
called with no inner offload.

Adding Ferruh and Thomas who may have an opinion.

>
> I agree that it seems safer to just downgrade all outer flags to
> inner ones on OVS side, when no inner offloads are requested, I'm
> just not sure I agree that it's an API misuse. Especially since
> non-Intel cards seem to work fine.

I suppose you mean mlx5.
Has it been tested on other nics?

>
> > Basically, resolving a neighbor with ARP inside tunnels is broken on
> > Intel nics (even without re-enabling outer udp checksum).
> > This can be observed with the following debug patch at the end of
> > netdev_dpdk_prep_hwol_packet():
> >
> > +char buf[256];
> > +if (rte_get_tx_ol_flag_list(mbuf->ol_flags, buf, sizeof(buf)) != 0)
> > +    buf[0] = '\0';
> > +VLOG_WARN("len=%u, ol_flags=%s, outer_l2_len=%u, outer_l3_len=%u,
> > l2_len=%u, l3_len=%u, l4_len=%u, tso_segsz=%u", mbuf->pkt_len, buf,
> > mbuf->outer_l2_len, mbuf->outer_l3_len, mbuf->l2_len, mbuf->l3_len,
> > mbuf->l4_len, mbuf->tso_segsz);
> >
> > Then doing a "arping" inside the tunnel triggers:
> > 2024-04-03T16:05:40.920Z|00014|netdev_dpdk(pmd-c03/id:8)|WARN|len=96,
> > ol_flags=RTE_MBUF_F_TX_L4_NO_CKSUM RTE_MBUF_F_TX_OUTER_IP_CKSUM
> > RTE_MBUF_F_TX_OUTER_IPV4 RTE_MBUF_F_TX_TUNNEL_VXLAN , outer_l2_len=18,
> > outer_l3_len=20, l2_len=0, l3_len=0, l4_len=0, tso_segsz=0
> > 2024-04-03T16:05:40.920Z|00012|dpdk|WARN|ice_interrupt_handler():
> > OICR: MDD event
> >
> > We need this fix in OVS regardless of the outer udp checksum issue.
> > I'll respin this fix in a new series, without touching UDP checksum capa.
> >
> >
> > - It does seem that X710 nics have no support for outer udp checksum
> > (according to its datasheet). Some X722 version may have support for
> > it, but net/i40e does not configure the Tx descriptor accordingly.
> > On the other hand, E810 ones seem fine (according to its datasheet).
> >
> > After more debugging, I managed to get outer udp checksum working.
> > I understand the DPDK rte_net_intel_cksum_flags_prepare() helper does
> > not set the pseudo header checksum in the outer udp header.
> > I proposed a fix in the dpdk bz.
> >
> > Waiting for the fix on DPDK side... it is still possible to add the
> > missing bits in OVS (see the branch I pointed at in the OVS issue).
>
> Since this feature never worked with ice in OVS and it is experimental,
> I tend to think that we should just disable it for ice as well until
> DPDK is fixed.
>
> A little too many fixes for that thing we have already and this one will
> involve some extra driver-specific logic that we don't have any automated
> tests for.

I don't mind waiting for the DPDK fix before re-enabling outer udp and
other offloads.

-- 
David Marchand
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Fallback to non tunnel offloading API.
On Thu, Mar 28, 2024 at 10:16 AM David Marchand wrote:
>
> The outer checksum offloading API in DPDK is ambiguous and was
> added by Intel folks with the assumption that any outer offloading
> always goes with an inner offloading request.
>
> With net/i40e and net/ice drivers, requesting outer ip checksum with a
> tunnel context but no inner offloading request triggers a Tx failure
> associated with a port MDD event.
> 2024-03-27T16:02:07.084Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR:
> MDD event
>
> To avoid this situation, if no checksum or segmentation offloading is
> requested on the inner part of a packet, fallback to "normal" (non outer)
> offloading request.
> And outer offloading can be re-enabled for net/i40e and net/ice.
>
> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
> Signed-off-by: David Marchand
> ---
> Changes since v1:
> - reset inner marks before converting outer requests,
> - fixed some coding style,
>
> ---
> lib/netdev-dpdk.c | 83 ---
> 1 file changed, 43 insertions(+), 40 deletions(-)
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 2111f77681..ae43594a3d 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1354,18 +1354,6 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>          info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
>      }
>
> -    if (!strcmp(info.driver_name, "net_ice")
> -        || !strcmp(info.driver_name, "net_i40e")) {
> -        /* FIXME: Driver advertises the capability but doesn't seem
> -         * to actually support it correctly. Can remove this once
> -         * the driver is fixed on DPDK side. */
> -        VLOG_INFO("%s: disabled Tx outer udp checksum offloads for a "
> -                  "net/ice or net/i40e port.", netdev_get_name(&dev->up));
> -        info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM;
> -        info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO;
> -        info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
> -    }
> -

A few comments after spending some time on the topic.

- This patch fixes some misuse of the DPDK API.
  Basically, resolving a neighbor with ARP inside tunnels is broken on
  Intel nics (even without re-enabling outer udp checksum).
  This can be observed with the following debug patch at the end of
  netdev_dpdk_prep_hwol_packet():

  +char buf[256];
  +if (rte_get_tx_ol_flag_list(mbuf->ol_flags, buf, sizeof(buf)) != 0)
  +    buf[0] = '\0';
  +VLOG_WARN("len=%u, ol_flags=%s, outer_l2_len=%u, outer_l3_len=%u, l2_len=%u, l3_len=%u, l4_len=%u, tso_segsz=%u", mbuf->pkt_len, buf, mbuf->outer_l2_len, mbuf->outer_l3_len, mbuf->l2_len, mbuf->l3_len, mbuf->l4_len, mbuf->tso_segsz);

  Then doing an "arping" inside the tunnel triggers:
  2024-04-03T16:05:40.920Z|00014|netdev_dpdk(pmd-c03/id:8)|WARN|len=96, ol_flags=RTE_MBUF_F_TX_L4_NO_CKSUM RTE_MBUF_F_TX_OUTER_IP_CKSUM RTE_MBUF_F_TX_OUTER_IPV4 RTE_MBUF_F_TX_TUNNEL_VXLAN , outer_l2_len=18, outer_l3_len=20, l2_len=0, l3_len=0, l4_len=0, tso_segsz=0
  2024-04-03T16:05:40.920Z|00012|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event

  We need this fix in OVS regardless of the outer udp checksum issue.
  I'll respin this fix in a new series, without touching UDP checksum capa.

- It does seem that X710 nics have no support for outer udp checksum
  (according to its datasheet). Some X722 versions may have support for
  it, but net/i40e does not configure the Tx descriptor accordingly.
  On the other hand, E810 ones seem fine (according to its datasheet).

  After more debugging, I managed to get outer udp checksum working.
  I understand the DPDK rte_net_intel_cksum_flags_prepare() helper does
  not set the pseudo header checksum in the outer udp header.
  I proposed a fix in the dpdk bz.

  While waiting for the fix on DPDK side, it is still possible to add the
  missing bits in OVS (see the branch I pointed at in the OVS issue).

- About the workaround (disabling outer udp checksum for net/ice and
  net/i40e), the net/iavf driver is subject to the same bugs.
  So we should disable outer udp checksum for this driver too.
  However, I am not sure the iavf driver differentiates (or even can
  differentiate) which PF / hw is used underneath.
  So we may have no solution but to always disable this type of
  offloading in OVS for net/iavf.

--
David Marchand
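As background for the pseudo header remark above, here is a minimal C sketch of an IPv4 pseudo-header sum for UDP. It assumes the common convention where software seeds the L4 checksum field with this non-complemented sum and the hardware completes the checksum over the UDP header and payload. What a given Intel NIC or the DPDK helper actually expects is not shown in this thread. The sketch_* names are made up for this illustration and are not DPDK or OVS APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Folds a 32-bit accumulator into a 16-bit one's-complement sum. */
uint16_t
sketch_fold16(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}

/* Sums the IPv4 pseudo header: source address, destination address,
 * protocol and L4 length.  All arguments and the result are in host byte
 * order, and the result is NOT complemented (hypothetical helper, written
 * for this example only). */
uint16_t
sketch_ipv4_phdr_sum(uint32_t src, uint32_t dst, uint8_t proto,
                     uint16_t l4_len)
{
    uint32_t sum = 0;

    sum += src >> 16;
    sum += src & 0xffff;
    sum += dst >> 16;
    sum += dst & 0xffff;
    sum += proto;
    sum += l4_len;

    return sketch_fold16(sum);
}

/* Usage example: 172.31.3.1 -> 172.31.3.100, UDP (17), 114 bytes of L4. */
void
sketch_phdr_selftest(void)
{
    assert(sketch_ipv4_phdr_sum(0xac1f0301, 0xac1f0364, 17, 114) == 0x5f27);
}
```

The addresses and length in the self-test are arbitrary example values; on the wire the seed would still need to be stored in network byte order.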
Re: [ovs-dev] Request for Source File Location: 'ovs-vswitchd.conf.db' MAN Page
Hello,

On Thu, Mar 28, 2024 at 11:13 AM Farhan Tariq wrote:
>
> I'm adding a new feature to OvS and need to update the 'ovs-vswitchd.conf.db'
> MAN page. However, I'm unable to locate the source file for this MAN page to
> add the necessary information about the new feature.
>
> Could you please provide the exact file path or location for the
> 'ovs-vswitchd.conf.db' MAN page?

You are probably looking for vswitchd/vswitch.xml.
The ovs-vswitchd.conf.db man page is generated from the content of vswitchd/vswitch.xml (and vswitchd/vswitch.ovsschema).

--
David Marchand
[ovs-dev] [PATCH v2] netdev-dpdk: Fallback to non tunnel offloading API.
The outer checksum offloading API in DPDK is ambiguous and was
added by Intel folks with the assumption that any outer offloading
always goes with an inner offloading request.

With net/i40e and net/ice drivers, requesting outer ip checksum with a
tunnel context but no inner offloading request triggers a Tx failure
associated with a port MDD event.
2024-03-27T16:02:07.084Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event

To avoid this situation, if no checksum or segmentation offloading is
requested on the inner part of a packet, fallback to "normal" (non outer)
offloading request.
And outer offloading can be re-enabled for net/i40e and net/ice.

Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: David Marchand
---
Changes since v1:
- reset inner marks before converting outer requests,
- fixed some coding style,

---
 lib/netdev-dpdk.c | 83 ---
 1 file changed, 43 insertions(+), 40 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2111f77681..ae43594a3d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1354,18 +1354,6 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
         info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
     }
 
-    if (!strcmp(info.driver_name, "net_ice")
-        || !strcmp(info.driver_name, "net_i40e")) {
-        /* FIXME: Driver advertises the capability but doesn't seem
-         * to actually support it correctly. Can remove this once
-         * the driver is fixed on DPDK side. */
-        VLOG_INFO("%s: disabled Tx outer udp checksum offloads for a "
-                  "net/ice or net/i40e port.", netdev_get_name(&dev->up));
-        info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM;
-        info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO;
-        info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
-    }
-
     if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_IPV4_CKSUM) {
         dev->hw_ol_features |= NETDEV_TX_IPV4_CKSUM_OFFLOAD;
     } else {
@@ -2584,16 +2572,18 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
     struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
     struct tcp_header *th;
 
-    const uint64_t all_requests = (RTE_MBUF_F_TX_IP_CKSUM |
-                                   RTE_MBUF_F_TX_L4_MASK |
-                                   RTE_MBUF_F_TX_OUTER_IP_CKSUM |
-                                   RTE_MBUF_F_TX_OUTER_UDP_CKSUM |
-                                   RTE_MBUF_F_TX_TCP_SEG);
-    const uint64_t all_marks = (RTE_MBUF_F_TX_IPV4 |
-                                RTE_MBUF_F_TX_IPV6 |
-                                RTE_MBUF_F_TX_OUTER_IPV4 |
-                                RTE_MBUF_F_TX_OUTER_IPV6 |
-                                RTE_MBUF_F_TX_TUNNEL_MASK);
+    const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
+                                         RTE_MBUF_F_TX_L4_MASK |
+                                         RTE_MBUF_F_TX_TCP_SEG);
+    const uint64_t all_outer_requests = (RTE_MBUF_F_TX_OUTER_IP_CKSUM |
+                                         RTE_MBUF_F_TX_OUTER_UDP_CKSUM);
+    const uint64_t all_requests = all_inner_requests | all_outer_requests;
+    const uint64_t all_inner_marks = (RTE_MBUF_F_TX_IPV4 |
+                                      RTE_MBUF_F_TX_IPV6);
+    const uint64_t all_outer_marks = (RTE_MBUF_F_TX_OUTER_IPV4 |
+                                      RTE_MBUF_F_TX_OUTER_IPV6 |
+                                      RTE_MBUF_F_TX_TUNNEL_MASK);
+    const uint64_t all_marks = all_inner_marks | all_outer_marks;
 
     if (!(mbuf->ol_flags & all_requests)) {
         /* No offloads requested, no marks should be set. */
@@ -2610,32 +2600,45 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
         return true;
     }
 
+    const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
+    if (OVS_UNLIKELY(tunnel_type
+                     && tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE
+                     && tunnel_type != RTE_MBUF_F_TX_TUNNEL_VXLAN)) {
+        VLOG_WARN_RL(&rl, "%s: Unexpected tunnel type: %#"PRIx64,
+                     netdev_get_name(&dev->up), tunnel_type);
+        netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
+                              "Packet with unexpected tunnel type", mbuf);
+        return false;
+    }
+
     /* If packet is vxlan or geneve tunnel packet, calculate outer
      * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
      * before. */
-    const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
-    if (tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
-        tunnel_type == RTE_MBUF_F_TX_TUNNEL_VXLAN) {
+    if ((tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENE
Re: [ovs-dev] [PATCH] netdev-dpdk: Fallback to non tunnel offloading API.
On Thu, Mar 28, 2024 at 5:40 AM junwan...@cestc.cn wrote:
>
> I validated this modification on my x710 network card, but I found that
> the outer UDP checksum of the transmitted packets is incorrect, leading
> to communication abnormalities. I think it's necessary to disable the outer
> UDP checksum because although the capability reported by DPDK
> indicates support, in reality, the hardware doesn't actually support
> offloading,
> resulting in outer UDP checksum errors.
>
> tx_geneve_tso_offload="false", tx_ip_csum_offload="true",
> tx_out_ip_csum_offload="true",
> tx_out_udp_csum_offload="true", tx_sctp_csum_offload="true",
> tx_tcp_csum_offload="true",
> tx_tcp_seg_offload="false", tx_udp_csum_offload="true",
> tx_vxlan_tso_offload="false"

Well, good timing, thanks for the report.

I was testing ipv6 in ipv4 (which seemed to work) and I realised
something is wrong in the outer -> inner conversion by looking at the
ol_flags in my debug prints.
Now, trying the opposite (ipv4 in ipv6), I think I can reproduce your
issue with an E810 nic:

04:50:46.211854 50:7c:6f:3c:0c:26 > 50:7c:6f:3c:10:5a, ethertype IPv6 (0x86dd), length 168: (hlim 64, next-header UDP (17) payload length: 114) 2001:4e48::2.39854 > 2001::1.geneve: [bad udp cksum 0x89ad -> 0xfadd!] Geneve, Flags [none], vni 0x0, proto TEB (0x6558)
    52:54:00:00:11:01 > 4e:a9:1d:ce:85:4a, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 9408, offset 0, flags [DF], proto ICMP (1), length 84)
    172.31.22.2 > 172.31.22.1: ICMP echo request, id 1442, seq 9, length 64

Please give v2 a try (I'll post it soon).
If it still fails, can you provide a reproducer (ideally without OVN to
reduce the scope)?

--
David Marchand
Re: [ovs-dev] [PATCH] netdev-dpdk: Fallback to non tunnel offloading API.
On Wed, Mar 27, 2024 at 5:51 PM David Marchand wrote:
>      /* If packet is vxlan or geneve tunnel packet, calculate outer
>       * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
>       * before. */
> -    const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
> -    if (tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
> -        tunnel_type == RTE_MBUF_F_TX_TUNNEL_VXLAN) {
> +    if ((tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
> +         tunnel_type == RTE_MBUF_F_TX_TUNNEL_VXLAN) &&
> +        mbuf->ol_flags & all_inner_requests) {
> +
>          mbuf->outer_l2_len = (char *) dp_packet_l3(pkt) -
>                               (char *) dp_packet_eth(pkt);
>          mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
>                               (char *) dp_packet_l3(pkt);
> +    } else {
> +        if (OVS_UNLIKELY(!(mbuf->ol_flags & all_inner_requests))) {
> +            /* If no inner offloading is requesting, fallback to non tunneling
> +             * checksum offloads. */

Inner marks must be reset before converting outer marks.
Otherwise, with IPv4 traffic encapsulated in an IPv6 geneve tunnel, this
results in an ol_flags combination such as:
RTE_MBUF_F_RX_RSS_HASH RTE_MBUF_F_TX_UDP_CKSUM RTE_MBUF_F_TX_IP_CKSUM RTE_MBUF_F_TX_IPV4 RTE_MBUF_F_TX_IPV6

v2 in preparation.

>
> -        /* If neither inner checksums nor TSO is requested, inner marks
> -         * should not be set. */
> -        if (!(mbuf->ol_flags & (RTE_MBUF_F_TX_IP_CKSUM |
> -                                RTE_MBUF_F_TX_L4_MASK |
> -                                RTE_MBUF_F_TX_TCP_SEG))) {
> -            mbuf->ol_flags &= ~(RTE_MBUF_F_TX_IPV4 |
> -                                RTE_MBUF_F_TX_IPV6);
> +            if (mbuf->ol_flags & RTE_MBUF_F_TX_OUTER_IP_CKSUM) {
> +                mbuf->ol_flags |= RTE_MBUF_F_TX_IP_CKSUM;
> +                mbuf->ol_flags |= RTE_MBUF_F_TX_IPV4;
> +            }
> +            if (mbuf->ol_flags & RTE_MBUF_F_TX_OUTER_UDP_CKSUM) {
> +                mbuf->ol_flags |= RTE_MBUF_F_TX_UDP_CKSUM;
> +                mbuf->ol_flags |= (mbuf->ol_flags & RTE_MBUF_F_TX_OUTER_IPV4) ?
> +                                  RTE_MBUF_F_TX_IPV4 : RTE_MBUF_F_TX_IPV6;
> +            }
> +            mbuf->ol_flags &= ~(all_outer_requests | all_outer_marks);

--
David Marchand
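The ordering issue described in this reply can be sketched in standalone C. The SK_* bits below are stand-ins chosen for this example and are NOT the DPDK flag values (in DPDK the L4 checksum request is a two-bit field, simplified here to a single UDP bit). The function name is hypothetical and only mirrors the logic quoted from the patch, with the inner marks cleared first.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag bits, for illustration only (not the DPDK values). */
#define SK_TX_IP_CKSUM        (UINT64_C(1) << 0)
#define SK_TX_UDP_CKSUM       (UINT64_C(1) << 1)
#define SK_TX_TCP_SEG         (UINT64_C(1) << 2)
#define SK_TX_IPV4            (UINT64_C(1) << 3)
#define SK_TX_IPV6            (UINT64_C(1) << 4)
#define SK_TX_OUTER_IP_CKSUM  (UINT64_C(1) << 5)
#define SK_TX_OUTER_UDP_CKSUM (UINT64_C(1) << 6)
#define SK_TX_OUTER_IPV4      (UINT64_C(1) << 7)
#define SK_TX_OUTER_IPV6      (UINT64_C(1) << 8)
#define SK_TX_TUNNEL_VXLAN    (UINT64_C(1) << 9)
#define SK_TX_TUNNEL_GENEVE   (UINT64_C(1) << 10)

#define SK_INNER_REQUESTS (SK_TX_IP_CKSUM | SK_TX_UDP_CKSUM | SK_TX_TCP_SEG)
#define SK_OUTER_REQUESTS (SK_TX_OUTER_IP_CKSUM | SK_TX_OUTER_UDP_CKSUM)
#define SK_INNER_MARKS    (SK_TX_IPV4 | SK_TX_IPV6)
#define SK_OUTER_MARKS    (SK_TX_OUTER_IPV4 | SK_TX_OUTER_IPV6 \
                           | SK_TX_TUNNEL_VXLAN | SK_TX_TUNNEL_GENEVE)

/* When no inner offload is requested, turns outer requests into plain
 * (non tunnel) requests.  Inner marks are cleared first so that a stale
 * inner IPv4 mark cannot end up next to an IPv6 mark derived from the
 * outer header. */
uint64_t
sk_fallback_to_non_tunnel(uint64_t ol_flags)
{
    if (ol_flags & SK_INNER_REQUESTS) {
        return ol_flags;        /* Genuine tunnel offload, keep as is. */
    }

    ol_flags &= ~SK_INNER_MARKS;

    if (ol_flags & SK_TX_OUTER_IP_CKSUM) {
        ol_flags |= SK_TX_IP_CKSUM | SK_TX_IPV4;
    }
    if (ol_flags & SK_TX_OUTER_UDP_CKSUM) {
        ol_flags |= SK_TX_UDP_CKSUM;
        ol_flags |= (ol_flags & SK_TX_OUTER_IPV4) ? SK_TX_IPV4 : SK_TX_IPV6;
    }

    return ol_flags & ~(SK_OUTER_REQUESTS | SK_OUTER_MARKS);
}

/* Usage example: IPv4 inner packet in an IPv6 geneve tunnel, with only
 * the outer UDP checksum requested. */
void
sk_fallback_selftest(void)
{
    uint64_t flags = SK_TX_IPV4 | SK_TX_OUTER_UDP_CKSUM
                     | SK_TX_OUTER_IPV6 | SK_TX_TUNNEL_GENEVE;

    assert(sk_fallback_to_non_tunnel(flags)
           == (SK_TX_UDP_CKSUM | SK_TX_IPV6));
}
```

Dropping the `ol_flags &= ~SK_INNER_MARKS` line from this sketch leaves both the IPv4 and the IPv6 marks set for the same input, which is the kind of inconsistent combination reported above.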
[ovs-dev] [PATCH] netdev-dpdk: Fallback to non tunnel offloading API.
The outer checksum offloading API in DPDK is ambiguous and was
added by Intel folks with the assumption that any outer offloading
always goes with an inner offloading request.

With net/i40e and net/ice drivers, requesting outer ip checksum with a
tunnel context but no inner offloading request triggers a Tx failure
associated with a port MDD event.
2024-03-27T16:02:07.084Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event

To avoid this situation, if no checksum or segmentation offloading is
requested on the inner part of a packet, fallback to "normal" (non outer)
offloading request.
And outer offloading can be re-enabled for net/i40e and net/ice.

Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: David Marchand
---
 lib/netdev-dpdk.c | 84 +++
 1 file changed, 42 insertions(+), 42 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2111f77681..939817474c 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1354,18 +1354,6 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
         info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
     }
 
-    if (!strcmp(info.driver_name, "net_ice")
-        || !strcmp(info.driver_name, "net_i40e")) {
-        /* FIXME: Driver advertises the capability but doesn't seem
-         * to actually support it correctly. Can remove this once
-         * the driver is fixed on DPDK side. */
-        VLOG_INFO("%s: disabled Tx outer udp checksum offloads for a "
-                  "net/ice or net/i40e port.", netdev_get_name(&dev->up));
-        info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM;
-        info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO;
-        info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
-    }
-
     if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_IPV4_CKSUM) {
         dev->hw_ol_features |= NETDEV_TX_IPV4_CKSUM_OFFLOAD;
     } else {
@@ -2584,20 +2572,20 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
     struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
     struct tcp_header *th;
 
-    const uint64_t all_requests = (RTE_MBUF_F_TX_IP_CKSUM |
-                                   RTE_MBUF_F_TX_L4_MASK |
-                                   RTE_MBUF_F_TX_OUTER_IP_CKSUM |
-                                   RTE_MBUF_F_TX_OUTER_UDP_CKSUM |
-                                   RTE_MBUF_F_TX_TCP_SEG);
-    const uint64_t all_marks = (RTE_MBUF_F_TX_IPV4 |
-                                RTE_MBUF_F_TX_IPV6 |
-                                RTE_MBUF_F_TX_OUTER_IPV4 |
-                                RTE_MBUF_F_TX_OUTER_IPV6 |
-                                RTE_MBUF_F_TX_TUNNEL_MASK);
-
-    if (!(mbuf->ol_flags & all_requests)) {
+    const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
+                                         RTE_MBUF_F_TX_L4_MASK |
+                                         RTE_MBUF_F_TX_TCP_SEG);
+    const uint64_t all_outer_requests = (RTE_MBUF_F_TX_OUTER_IP_CKSUM |
+                                         RTE_MBUF_F_TX_OUTER_UDP_CKSUM);
+    const uint64_t all_inner_marks = (RTE_MBUF_F_TX_IPV4 |
+                                      RTE_MBUF_F_TX_IPV6);
+    const uint64_t all_outer_marks = (RTE_MBUF_F_TX_OUTER_IPV4 |
+                                      RTE_MBUF_F_TX_OUTER_IPV6 |
+                                      RTE_MBUF_F_TX_TUNNEL_MASK);
+
+    if (!(mbuf->ol_flags & (all_inner_requests | all_outer_requests))) {
         /* No offloads requested, no marks should be set. */
-        mbuf->ol_flags &= ~all_marks;
+        mbuf->ol_flags &= ~(all_inner_marks | all_outer_marks);
 
         uint64_t unexpected = mbuf->ol_flags & RTE_MBUF_F_TX_OFFLOAD_MASK;
         if (OVS_UNLIKELY(unexpected)) {
@@ -2610,32 +2598,44 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
         return true;
     }
 
+    const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
+    if (OVS_UNLIKELY(tunnel_type
+                     && tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE
+                     && tunnel_type != RTE_MBUF_F_TX_TUNNEL_VXLAN)) {
+        VLOG_WARN_RL(&rl, "%s: Unexpected tunnel type: %#"PRIx64,
+                     netdev_get_name(&dev->up), tunnel_type);
+        netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
+                              "Packet with unexpected tunnel type", mbuf);
+        return false;
+    }
+
     /* If packet is vxlan or geneve tunnel packet, calculate outer
      * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
      * before. */
-    const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
-    if (tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
-        tun
Re: [ovs-dev] [PATCH v2 2/4] netdev-linux: Favour inner packet for multi-encapsulated TSO.
On Thu, Feb 15, 2024 at 7:03 AM Mike Pattrick wrote:
> I've made a branch where we properly account for outer and inner
> checksums, and it passes the tests mostly, except for afxdp.
>
> For afxdp we crash in dp_packet_prealloc_headroom(). netdev-afxdp has
> a hardcoded OVS_XDP_HEADROOM=128 bytes and the multiple layers of
> tunneling exceeds that. I ran a test where I set this to 256 and the
> test passes, but that seems like a non-ideal solution. We probably
> shouldn't abort() in dp_packet_resize(), as it could be possible to
> accidentally run into this.

This is exactly the point I wanted to stress with DPDK dp-packets.
The reason behind it was to check this old patch of mine:
https://patchwork.ozlabs.org/project/openvswitch/patch/20220318153339.31083-1-david.march...@redhat.com/

DPDK dp-packets data are supposed to be located at
RTE_PKTMBUF_HEADROOM == 128 bytes, on rx.
But I uncovered recently that we won't hit this headroom limit with
net/af_xdp backing netdev-dpdk ports...
The net/af_xdp driver tries to be smart and avoid copies by using the
unaligned chunk af_xdp feature.
https://git.dpdk.org/dpdk/commit/?id=d8a210774e1d4c295fd93b983538da0d15312edd

A consequence is that this driver places received data with a 384 bytes
headroom (RTE_PKTMBUF_HEADROOM + XDP_PACKET_HEADROOM).
Which then defeats my unit test...

This placement of data looks incorrect to me, from the DPDK mbuf API
"spirit".
Applications expect a RTE_PKTMBUF_HEADROOM headroom, and they size
their buffers accordingly.
This extra headroom would mean applications need to account for this
peculiarity when using this driver...
I will need to spend more time on this, but not now.

>
> Dropping the packet is probably preferable IMO, but that is also a
> very large change, as none of the calling functions have return codes
> themselves and some of the 2rd degree call backs don't either, so many
> functions will need to change.

Or extend dp_packet_resize() for af_xdp dp-packets.
The tricky part is that the dp-packet is part of a umem buffer.
If we make an af_xdp dp-packet point at a different malloc'd data
buffer, we need to distinguish this case when freeing the dp-packet.

I can put this on my todolist.

>
> You can see the branch here: https://github.com/mkp-rh/ovs/tree/multitun
> And the test run here: https://github.com/mkp-rh/ovs/actions/runs/7911539363
>
> I'll clean up this a bit and address some of the other things
> mentioned, like the incorrect Fixes tag.

We don't need to fix all issues, the main point is the inner checksum
issue, as it is something that got broken in 3.3.
If we strip the 3rd layer of tunnel from my unit test, it would be
enough to reproduce without hitting the af_xdp headroom limit.
Or do you think we can extend an existing test?

At least, fixes should be isolated from new features like the one
introduced in patch 1 of this series.

--
David Marchand
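The headroom figures in this exchange can be checked with a little arithmetic. The sketch below assumes each tunnel layer is VXLAN over IPv4 with an untagged Ethernet header and no IP options, and that every layer is pushed into the headroom of the same buffer. The HR_* names and helpers are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Header sizes for one VXLAN-over-IPv4 encapsulation (assumed layout:
 * untagged Ethernet, IPv4 without options, UDP, VXLAN). */
enum {
    HR_ETH_HDR_LEN = 14,
    HR_IPV4_HDR_LEN = 20,
    HR_UDP_HDR_LEN = 8,
    HR_VXLAN_HDR_LEN = 8,
    HR_VXLAN_OVERHEAD = HR_ETH_HDR_LEN + HR_IPV4_HDR_LEN
                        + HR_UDP_HDR_LEN + HR_VXLAN_HDR_LEN
};

/* Bytes of headroom consumed by 'n_layers' stacked encapsulations. */
size_t
hr_encap_bytes(unsigned int n_layers)
{
    return (size_t) n_layers * HR_VXLAN_OVERHEAD;
}

/* Whether 'n_layers' encapsulations fit in 'headroom' bytes. */
bool
hr_fits(size_t headroom, unsigned int n_layers)
{
    return hr_encap_bytes(n_layers) <= headroom;
}

/* Usage example with the values mentioned in the thread. */
void
hr_selftest(void)
{
    assert(hr_encap_bytes(3) == 150);
    assert(!hr_fits(128, 3));           /* 128 bytes of headroom. */
    assert(hr_fits(256, 3));            /* Headroom raised to 256. */
    assert(hr_fits(128 + 256, 3));      /* 384 bytes of headroom. */
}
```

Under these assumptions three layers need 150 bytes, which is consistent with the test overflowing a 128 byte headroom and passing with 256 or 384 bytes, while two layers (100 bytes) would still fit in 128.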
Re: [ovs-dev] [PATCH v2 2/4] netdev-linux: Favour inner packet for multi-encapsulated TSO.
Hello Mike,

On Mon, Feb 12, 2024 at 8:50 PM Mike Pattrick wrote:
>
> Previously if an OVS configuration nested multiple layers of UDP tunnels
> like VXLAN or GENEVE on top of each other through netdev-linux
> interfaces, the vnet header would be incorrectly set to the outermost
> UDP tunnel layer instead of the intermediary tunnel layer.
>
> This resulted in the middle UDP tunnel not checksum offloading properly.
>
> Fixes: 3337e6d91c5b ("userspace: Enable L4 checksum offloading by default.")
> Reported-by: David Marchand
> Signed-off-by: Mike Pattrick

I have some trouble relating this patch to the issue I faced :-).
Could you detail a test that shows the issue you fix here?

After applying (only this patch), I still reproduce an issue with inner
checksums.
As I reported this issue to you offlist, let me put the details in
public here.

I wrote a system-traffic.at unit test that stacks 3 vxlan tunnels
(separate topic, but for the context, my goal was to stress DPDK
dp-packets wrt headroom).
If I try this unit test before commit 084c8087292c ("userspace:
Support VXLAN and GENEVE TSO."), I have no issue.

The topology is as follows:

######################################################################
#
#    at_ns0                       .    init_net
#                                 .
#    at_vxlan1 (10.1.1.1/24)      .    br0 (10.1.1.100/24)
#      (remote 172.31.1.100)      .      |
#                                 .    at_vxlan0
#                                 .      (remote 172.31.1.1)
#                                 .
#    at_vxlan3 (172.31.1.1/24)    .    br-underlay0 (172.31.1.100/24)
#      (remote 172.31.2.100)      .      |
#                                 .    at_vxlan2
#                                 .      (remote 172.31.2.1)
#                                 .
#    at_vxlan5 (172.31.2.1/24)    .    br-underlay1 (172.31.2.100/24)
#      (remote 172.31.3.100)      .      |
#                                 .    at_vxlan4
#                                 .      (remote 172.31.3.1)
#                                 .
#    p0 (172.31.3.1/24)           .    br-underlay2 (172.31.3.100/24)
#      |                          .      |
#      \--------------------------.----ovs-p0
#
######################################################################

(gmail will probably bust this copy/paste, so putting a link to the actual test:
https://github.com/david-marchand/ovs/commit/manyvxlan~2#diff-45a77f85f9679bc66ac97300392c0d5d9f5c53264fa8a82d735a553246e71faeR400)

With this setup, I try to ping, from at_ns0 netns, the ip address of
the br tap iface plugged with the other side of each tunnel:

- Outermost level, no encapsulation, all good:
16:24:51.590966 a6:0a:bf:e2:f3:f2 > 82:cf:78:de:ed:46, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 63550, offset 0, flags [DF], proto ICMP (1), length 84)
    172.31.3.1 > 172.31.3.100: ICMP echo request, id 26707, seq 1, length 64
16:24:51.591084 82:cf:78:de:ed:46 > a6:0a:bf:e2:f3:f2, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 28720, offset 0, flags [none], proto ICMP (1), length 84)
    172.31.3.100 > 172.31.3.1: ICMP echo reply, id 26707, seq 1, length 64

- One tunnel encap, all good:
16:24:54.140629 a6:0a:bf:e2:f3:f2 > 82:cf:78:de:ed:46, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 61052, offset 0, flags [none], proto UDP (17), length 134)
    172.31.3.1.36831 > 172.31.3.100.vxlan: [udp sum ok] VXLAN, flags [I] (0x08), vni 0
    1e:db:ec:e5:28:6d > 9a:39:be:e8:18:4b, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 54399, offset 0, flags [DF], proto ICMP (1), length 84)
    172.31.2.1 > 172.31.2.100: ICMP echo request, id 51488, seq 1, length 64
16:24:54.140772 82:cf:78:de:ed:46 > a6:0a:bf:e2:f3:f2, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 134)
    172.31.3.100.39912 > 172.31.3.1.vxlan: [no cksum] VXLAN, flags [I] (0x08), vni 0
    9a:39:be:e8:18:4b > 1e:db:ec:e5:28:6d, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 29701, offset 0, flags [none], proto ICMP (1), length 84)
    172.31.2.100 > 172.31.2.1: ICMP echo reply, id 51488, seq 1, length 64

- Two tunnels encap:
16:24:58.578900 a6:0a:bf:e2:f3:f2 > 82:cf:78:de:ed:46, ethertype IPv4 (0x0800), length 142: (tos 0x0, ttl 64, id 61719, offset 0, flags [none], proto UDP (17), length 128)
    172.31.3.1.50673 > 172.31.3.100.vxlan: [udp sum ok] VXLAN, flags [I] (0x08), vni 0
    1e:db:ec:e5:28:6d > 9a:39:be:e8:18:4b, ethertype IPv4 (0x0800), length 92: (tos 0x0, ttl 64, id 35175, offset 0, flags [none], proto UDP (17), length 78)
    172.31.2.1.44060 > 172.31.2.100.vxlan: [udp sum ok] VXLAN, flags [I] (0x08), vni 1
    62:53:3f:82:da:56 > Broadcast, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 172.31.1.100 tell 172.31.1.1, length 28
16:24:58.579021 82:cf:78:de:ed:46 > a6:0a:bf:e2:f3:f2, ethertype IPv4 (0x0800), length 142: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 128)
    172.31.3.100.56325 > 172.31.3.1.vxlan: [n
Re: [ovs-dev] [PATCH v4 3/4] dp-packet: Include inner offsets in adjustments and checks.
On Mon, Feb 12, 2024 at 7:54 AM Mike Pattrick wrote:
>
> Include inner offsets in functions where l3 and l4 offsets are either
> modified or checked.
>
> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
> Signed-off-by: Mike Pattrick

Reviewed-by: David Marchand

--
David Marchand
Re: [ovs-dev] [PATCH v4 2/4] bfd: Set proper offsets and flags in BFD packets.
On Mon, Feb 12, 2024 at 7:53 AM Mike Pattrick wrote:
>
> Previously the BFD packet creation code did not appropriately set
> offsets or flags. This contributed to issues involving encapsulation and
> the TSO code.
>
> The transition to using standard functions also means some other
> metadata like packet_type are set appropriately.
>
> Fixes: ccc096898c46 ("bfd: Implement Bidirectional Forwarding Detection.")
> Signed-off-by: Mike Pattrick

Reviewed-by: David Marchand

--
David Marchand
Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Do not create handler threads.
On Tue, Feb 6, 2024 at 3:47 PM Eelco Chaudron wrote:
> On 6 Feb 2024, at 15:17, David Marchand wrote:
>
> > On Tue, Feb 6, 2024 at 2:31 PM Eelco Chaudron wrote:
> >>
> >> Avoid unnecessary thread creation as no upcalls are generated,
> >> resulting in idle threads waiting for process termination.
> >>
> >> This optimization significantly reduces memory usage, cutting it
> >> by half on a 128 CPU/thread system during testing, with the number
> >> of threads reduced from 95 to 0.
> >>
> >> Signed-off-by: Eelco Chaudron
> >
> > I find it weird that the dpif layer reports an information on how the
> > ofproto-dpif layer behaves.
> > The handler threads are something ofproto-dpif is responsible for.
> > The upcall receiving loop is something the ofproto-dpif owns.
> > Why should the dpif layer tells how many handlers are needed?
> >
> >
> > I would have seen a different change, where the dpif layer exports a
> > capability, like dpif_can_recv() { return !!dpif->dpif_class->recv; }.
> > ofproto-dpif would then deduce there is no handler to start at all.
>
> That was my first idea also, but then I found there is already an API call to
> the dpif layer where it can tell the user (ofproto in this case) how many
> threads it needs to function correctly. Here is the API definition:
>
>     /* Queries 'dpif' to see if a certain number of handlers are required by
>      * the implementation.
>      *
>      * If a certain number of handlers are required, returns 'true' and sets
>      * 'n_handlers' to that number of handler threads.
>      *
>      * If not, returns 'false'.
>      */
>     bool (*number_handlers_required)(struct dpif *dpif, uint32_t *n_handlers);
>
> I guess the ‘If a certain number of handlers are required, returns 'true’’
> part fits here, as we need 0.

The fact that it exists does not convince me of its validity :-).
I must be missing something.

--
David Marchand
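To make the quoted callback semantics concrete, here is a minimal sketch of how a caller could interpret number_handlers_required() when a datapath answers "true" with a count of 0. Only the callback signature comes from the quoted API definition. The hsketch_* structs and functions are trimmed-down stand-ins written for this example and are not the OVS implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct hsketch_dpif;

/* Hypothetical, trimmed-down class holding only the quoted callback. */
struct hsketch_dpif_class {
    bool (*number_handlers_required)(struct hsketch_dpif *dpif,
                                     uint32_t *n_handlers);
};

struct hsketch_dpif {
    const struct hsketch_dpif_class *dpif_class;
};

/* A datapath that never generates upcalls could require exactly 0. */
bool
hsketch_no_upcalls_handlers_required(struct hsketch_dpif *dpif,
                                     uint32_t *n_handlers)
{
    (void) dpif;
    *n_handlers = 0;
    return true;
}

/* Caller side: honor the datapath's answer when it gives one, otherwise
 * keep the configured number of handler threads. */
uint32_t
hsketch_pick_n_handlers(struct hsketch_dpif *dpif, uint32_t configured)
{
    uint32_t required;

    if (dpif->dpif_class->number_handlers_required
        && dpif->dpif_class->number_handlers_required(dpif, &required)) {
        return required;
    }
    return configured;
}

/* Usage example: 95 configured handlers become 0 for this datapath. */
void
hsketch_selftest(void)
{
    struct hsketch_dpif_class cls = { hsketch_no_upcalls_handlers_required };
    struct hsketch_dpif dpif = { &cls };

    assert(hsketch_pick_n_handlers(&dpif, 95) == 0);
}
```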
Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Do not create handler threads.
On Tue, Feb 6, 2024 at 2:31 PM Eelco Chaudron wrote:
>
> Avoid unnecessary thread creation as no upcalls are generated,
> resulting in idle threads waiting for process termination.
>
> This optimization significantly reduces memory usage, cutting it
> by half on a 128 CPU/thread system during testing, with the number
> of threads reduced from 95 to 0.
>
> Signed-off-by: Eelco Chaudron

I find it weird that the dpif layer reports information on how the
ofproto-dpif layer behaves.
The handler threads are something ofproto-dpif is responsible for.
The upcall receiving loop is something ofproto-dpif owns.
Why should the dpif layer tell how many handlers are needed?

I would have expected a different change, where the dpif layer exports a
capability, like dpif_can_recv() { return !!dpif->dpif_class->recv; }.
ofproto-dpif would then deduce there is no handler to start at all.

--
David Marchand
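The capability-style alternative suggested above could look roughly like the following sketch. The csketch_* structs are stand-ins written for this example (the real dpif and dpif_class structures have many more members), and csketch_n_handlers_to_start() is a hypothetical caller, not existing ofproto-dpif code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct csketch_dpif;

/* Stand-in class: 'recv' is NULL when the datapath never queues upcalls
 * for handler threads. */
struct csketch_dpif_class {
    int (*recv)(struct csketch_dpif *dpif);
};

struct csketch_dpif {
    const struct csketch_dpif_class *dpif_class;
};

/* The capability check from the suggestion above. */
bool
csketch_dpif_can_recv(const struct csketch_dpif *dpif)
{
    return !!dpif->dpif_class->recv;
}

/* Hypothetical caller: the upper layer decides by itself that no handler
 * thread is needed when nothing can ever be received. */
size_t
csketch_n_handlers_to_start(const struct csketch_dpif *dpif, size_t wanted)
{
    return csketch_dpif_can_recv(dpif) ? wanted : 0;
}

/* Dummy receive callback used only by the usage example. */
int
csketch_dummy_recv(struct csketch_dpif *dpif)
{
    (void) dpif;
    return 0;
}

/* Usage example. */
void
csketch_selftest(void)
{
    struct csketch_dpif_class with_recv = { csketch_dummy_recv };
    struct csketch_dpif_class without_recv = { NULL };
    struct csketch_dpif a = { &with_recv };
    struct csketch_dpif b = { &without_recv };

    assert(csketch_n_handlers_to_start(&a, 95) == 95);
    assert(csketch_n_handlers_to_start(&b, 95) == 0);
}
```

With this shape the dpif layer only states what it can do, and the decision about handler threads stays in the layer that owns them.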
Re: [ovs-dev] [PATCH v2 4/4] ofproto-dpif-monitor: Remove unneeded calls to clear packets.
On Tue, Jan 30, 2024 at 11:15 PM Mike Pattrick wrote:
>
> Currently the monitor will call dp_packet_clear() on the dp_packet that
> is shared amongst BFD, LLDP, and CFM. However, all of these packets are
> created with eth_compose(), which already calls dp_packet_clear().
>
> Signed-off-by: Mike Pattrick

Reviewed-by: David Marchand

--
David Marchand
Re: [ovs-dev] [PATCH v2 3/4] dp-packet: Include inner offsets in adjustments and checks.
ruct dp_packet, l4_ofs)); > > +BUILD_ASSERT_DECL(offsetof(struct dp_packet, l4_ofs) + > + MEMBER_SIZEOF(struct dp_packet, l4_ofs) == > + offsetof(struct dp_packet, inner_l3_ofs)); > + > +BUILD_ASSERT_DECL(offsetof(struct dp_packet, inner_l3_ofs) + > + MEMBER_SIZEOF(struct dp_packet, inner_l3_ofs) == > + offsetof(struct dp_packet, inner_l4_ofs)); > + > /* The below build assert makes sure it's safe to read/write 128-bits > starting > * at the l2_pad_size location. */ > BUILD_ASSERT_DECL(sizeof(struct dp_packet) - > @@ -125,7 +134,7 @@ avx512_dp_packet_resize_l2(struct dp_packet *b, int > resize_by_bytes) > /* Each lane represents 16 bits in a 12-bit register. In this case the > * first three 16-bit values, which will map to the l2_5_ofs, l3_ofs and > * l4_ofs fields. */ > -const uint8_t k_lanes = 0b1110; > +const uint8_t k_lanes = 0b10; > > /* Set all 16-bit words in the 128-bits v_offset register to the value we > * need to add/substract from the l2_5_ofs, l3_ofs, and l4_ofs fields. */ Touching this part scares me. I think some comments are wrong, and otherwise I hope Intel CI will be enough to check nothing gets broken here :-). -- David Marchand ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 2/4] bfd: Set proper offsets and flags in BFD packets.
On Tue, Jan 30, 2024 at 11:15 PM Mike Pattrick wrote:
>
> Previously the BFD packet creation code did not appropriately set
> offsets or flags. This contributed to issues involving encapsulation and
> the TSO code.

I noted that apart from fixing the offsets / flags used for checksum
offloading, this patch also fixes the packet_type used by other
dp_packet helpers.
I see nothing fixed on that latter topic though.

>
> Fixes: ccc096898c46 ("bfd: Implement Bidirectional Forwarding Detection.")
> Signed-off-by: Mike Pattrick

Reviewed-by: David Marchand

--
David Marchand
Re: [ovs-dev] [PATCH v2 1/4] dp-packet: Validate correct offset for L4 inner size.
On Tue, Jan 30, 2024 at 11:15 PM Mike Pattrick wrote:
>
> This patch fixes the correctness of dp_packet_inner_l4_size() when
> checking for the existence of an inner L4 header. Previously it checked
> for the outer L4 header.
>
> This function is currently only used when a packet is already flagged
> for tunneling, so an incorrect determination isn't possible as long as
> the flags of the packet are correct.
>
> Fixes: 85bcbbed839a ("userspace: Enable tunnel tests with TSO.")
> Signed-off-by: Mike Pattrick

Reviewed-by: David Marchand

--
David Marchand
Re: [ovs-dev] [PATCH] system-common-macros: Fix conntrack matching.
On Fri, Jan 19, 2024 at 1:49 PM Ilya Maximets wrote:
>
> On 1/18/24 14:00, David Marchand wrote:
> > Seen in GHA recently.
> > Unit tests are checking conntracks relating to a destination ip address
> > but the FORMAT_CT macro is not strict enough and would match unrelated
> > conntracks too.
> >
> > Example:
> > 148. system-traffic.at:6432: testing conntrack - DNAT with
> > additional SNAT ...
> > [...]
> > ./system-traffic.at:6460: ovs-appctl dpctl/dump-conntrack |
> > grep "dst=10.1.1.1" |
> > sed -e 's/port=[0-9]*/port=/g'
> > -e 's/id=[0-9]*/id=/g'
> > -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq
> > [...]
> > @@ -1,2 +1,7 @@
> > tcp,orig=(src=10.1.1.1,dst=172.1.1.2,sport=,...
> > +tcp,...,reply=(src=13.107.42.16,dst=10.1.1.10,sport=,...
> > +tcp,...,reply=(src=168.63.129.16,dst=10.1.1.10,sport=,...
> > +tcp,...,reply=(src=20.242.161.191,dst=10.1.1.10,sport=,...
> > +tcp,orig=(src=13.107.42.16,dst=10.1.1.10,sport=,...
> > +tcp,orig=(src=20.242.161.191,dst=10.1.1.10,sport=,...
> >
> > Fixes: 07659514c3c1 ("Add support for connection tracking.")
> > Signed-off-by: David Marchand
> > ---
> > tests/system-common-macros.at | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
> > index 01ebe364ee..07be29f673 100644
> > --- a/tests/system-common-macros.at
> > +++ b/tests/system-common-macros.at
> > @@ -256,7 +256,7 @@ m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed
> > 's/csum:.*/csum: /'])
> > # and limit the output to the rows containing 'ip-addr'.
> > #
> > m4_define([FORMAT_CT],
> > -[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e
> > 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' |
> > sort | uniq]])
> > +[[grep "dst=$1\>" | sed -e 's/port=[0-9]*/port=/g' -e
> > 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' |
> > sort | uniq]])
> >
> > # NETNS_DAEMONIZE([namespace], [command], [pidfile])
> > #
>
> I remembered why the macro is loose. We wanted to be able
> to match on "subnets" by supplying only part of the address.
>
> There was at least one test that used this functionality.
> Eelco removed it though here:
> https://github.com/openvswitch/ovs/commit/a80883f7682158c7a6955360ee852e8279f748e9
>
> Did you check if we have any more instances of such tests?

I did not.

> They can be tricky to find, as we can supply 10.1.1.2 in order
> to match 10.1.1.240, for example.

Ok, you can discard my patch.
Thanks.

-- 
David Marchand
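The difference between the loose match and the strict one proposed in the
patch can be sketched in C. This is only an illustration under assumptions:
match_loose() and match_strict() are hypothetical helpers, not part of OVS,
and the strict variant approximates the end-of-word test that \> performs in
grep by checking that the character following the address is not a letter,
a digit or an underscore.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Loose match: plain substring search, similar to grep "dst=ADDR". */
static bool
match_loose(const char *line, const char *addr)
{
    char needle[64];
    int len = snprintf(needle, sizeof needle, "dst=%s", addr);

    if (len < 0 || (size_t) len >= sizeof needle) {
        return false;   /* Address too long for this sketch. */
    }
    return strstr(line, needle) != NULL;
}

/* Strict match: the address must not be followed by another word
 * character, which approximates grep "dst=ADDR\>". */
static bool
match_strict(const char *line, const char *addr)
{
    char needle[64];
    int len = snprintf(needle, sizeof needle, "dst=%s", addr);
    const char *p = line;

    if (len < 0 || (size_t) len >= sizeof needle) {
        return false;
    }
    while ((p = strstr(p, needle)) != NULL) {
        unsigned char next = (unsigned char) p[len];

        if (!isalnum(next) && next != '_') {
            return true;    /* End of word (or end of line) reached. */
        }
        p++;                /* Keep looking for a later occurrence. */
    }
    return false;
}
```

With the loose variant, supplying 10.1.1.2 also selects a line containing
dst=10.1.1.240, which is the subnet-style matching mentioned above. The strict
variant rejects that line, and it also rejects dst=10.1.1.10 when asked for
10.1.1.1.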
Re: [ovs-dev] [PATCH] system-common-macros: Fix conntrack matching.
On Fri, Jan 19, 2024 at 1:20 PM Simon Horman wrote:
>
> On Thu, Jan 18, 2024 at 02:00:18PM +0100, David Marchand wrote:
> > Seen in GHA recently.
> > Unit tests are checking conntracks relating to a destination ip address
> > but the FORMAT_CT macro is not strict enough and would match unrelated
> > conntracks too.
> >
> > Example:
> > 148. system-traffic.at:6432: testing conntrack - DNAT with
> > additional SNAT ...
> > [...]
> > ./system-traffic.at:6460: ovs-appctl dpctl/dump-conntrack |
> > grep "dst=10.1.1.1" |
> > sed -e 's/port=[0-9]*/port=/g'
> > -e 's/id=[0-9]*/id=/g'
> > -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq
> > [...]
> > @@ -1,2 +1,7 @@
> > tcp,orig=(src=10.1.1.1,dst=172.1.1.2,sport=,...
> > +tcp,...,reply=(src=13.107.42.16,dst=10.1.1.10,sport=,...
> > +tcp,...,reply=(src=168.63.129.16,dst=10.1.1.10,sport=,...
> > +tcp,...,reply=(src=20.242.161.191,dst=10.1.1.10,sport=,...
> > +tcp,orig=(src=13.107.42.16,dst=10.1.1.10,sport=,...
> > +tcp,orig=(src=20.242.161.191,dst=10.1.1.10,sport=,...
> >
> > Fixes: 07659514c3c1 ("Add support for connection tracking.")
> > Signed-off-by: David Marchand
> > ---
> > tests/system-common-macros.at | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
> > index 01ebe364ee..07be29f673 100644
> > --- a/tests/system-common-macros.at
> > +++ b/tests/system-common-macros.at
> > @@ -256,7 +256,7 @@ m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed
> > 's/csum:.*/csum: /'])
> > # and limit the output to the rows containing 'ip-addr'.
> > #
> > m4_define([FORMAT_CT],
> > -[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e
> > 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' |
> > sort | uniq]])
> > +[[grep "dst=$1\>" | sed -e 's/port=[0-9]*/port=/g' -e
> > 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' |
> > sort | uniq]])
> >
> > # NETNS_DAEMONIZE([namespace], [command], [pidfile])
> > #
>
> Sorry, I feel I must be missing something very obvious, but
> I'm unsure why the match is on "dst=$1\>". I would have thought
> the match would be "dst=$1," instead.

\> matches the end of a word.
Using , as a delimiter works too in this case.

-- 
David Marchand
[ovs-dev] [PATCH v3] netdev-dpdk: Trigger port reconfiguration in main thread for resets.
When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
mutex lock.
As part of this configure operation, the net/iavf driver (used with i40e
VF devices) triggers a queue count change. The PF entity (serviced by a
kernel PF driver for example) handles this change and requests back that
the VF driver resets the VF device. The driver then completes the VF reset
operation on its side and waits for completion of the iavf-event thread
responsible for handling various VF device events.

On the other hand, handling of the VF reset request in this iavf-event
thread results in notifying the application with a port reset request
(RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take hold
of the same netdev_dpdk mutex and blocks the iavf-event thread.

As a result, the net/iavf driver (still running on OVS main thread) is
unable to complete as it is waiting for iavf-event to complete.

To break from this situation, the OVS reset callback now won't take a
netdev_dpdk mutex. Instead, the port reset request is stored in a simple
RTE_MAX_ETHPORTS array associated to a seq object.
This is enough to let the VF driver complete this port initialisation.
The OVS main thread later handles the port reset request.

More details in the DPDK upstream bz as this issue appeared following a
change in DPDK.

Link: https://bugs.dpdk.org/show_bug.cgi?id=1337
Signed-off-by: David Marchand
---
Changes since v2:
- fixed build with clang,
- fixed indentation,
- updated NEWS,

Changes since v1:
- converted to atomic accesses on netdev_dpdk_pending_reset[],

---
 NEWS | 7 -
 lib/netdev-dpdk.c | 76 +--
 2 files changed, 61 insertions(+), 22 deletions(-)

diff --git a/NEWS b/NEWS
index 2153b48053..a6617546c6 100644
--- a/NEWS
+++ b/NEWS
@@ -54,13 +54,6 @@ v3.3.0 - xx xxx
 - Support for multicast snooping to show the protocol responsible for
 adding/updating the entry.

-Known issues:
- - DPDK: v23.11 has a change in behavior in handling i40e VF devices.
This - may block and prevent OVS from adding such devices as ports in a netdev - datapath bridge. - For the details, see https://bugs.dpdk.org/show_bug.cgi?id=1337 which - describes the issue first detected in the 21.11 LTS branch. - v3.2.0 - 17 Aug 2023 diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index fb26825ff8..45f61930d4 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -58,6 +58,7 @@ #include "openvswitch/match.h" #include "openvswitch/ofp-parse.h" #include "openvswitch/ofp-print.h" +#include "openvswitch/poll-loop.h" #include "openvswitch/shash.h" #include "openvswitch/vlog.h" #include "ovs-numa.h" @@ -2101,32 +2102,73 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev, return new_port_id; } +static struct seq *netdev_dpdk_reset_seq; +static uint64_t netdev_dpdk_last_reset_seq; +static atomic_bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS]; + +static void +netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED) +{ +uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq); + +if (netdev_dpdk_last_reset_seq == last_reset_seq) { +seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq); +} else { +poll_immediate_wake(); +} +} + +static void +netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED) +{ +uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq); + +if (reset_seq != netdev_dpdk_last_reset_seq) { +dpdk_port_t port_id; + +netdev_dpdk_last_reset_seq = reset_seq; + +for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) { +struct netdev_dpdk *dev; +bool pending_reset; + +atomic_read_relaxed(_dpdk_pending_reset[port_id], +_reset); +if (!pending_reset) { +continue; +} +atomic_store_relaxed(_dpdk_pending_reset[port_id], false); + +ovs_mutex_lock(_mutex); +dev = netdev_dpdk_lookup_by_port_id(port_id); +if (dev) { +ovs_mutex_lock(>mutex); +dev->reset_needed = true; +netdev_request_reconfigure(>up); +VLOG_DBG_RL(, "%s: Device reset requested.", +netdev_get_name(>up)); +ovs_mutex_unlock(>mutex); +} 
+ovs_mutex_unlock(_mutex); +} +} +} + static int dpdk_eth_event_callback(dpdk_port_t port_id, enum rte_eth_event_type type, void *param OVS_UNUSED, void *ret_param OVS_UNUSED) { -struct netdev_dpdk *dev; - switch ((int) type) { case RTE_ETH_EVENT_INTR_RESET: -ovs_mutex_lock(_mutex); -dev = netdev_dpdk_lookup_by_port_id(port_id); -if (dev) { -ovs_mute
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Trigger port reconfiguration in main thread for resets.
On Thu, Jan 18, 2024 at 4:43 PM Ilya Maximets wrote:
>
> On 1/18/24 15:41, David Marchand wrote:
> > When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
> > mutex lock.
> > As part of this configure operation, the net/iavf driver (used with i40e
> > VF devices) triggers a queue count change. The PF entity (serviced by a
> > kernel PF driver for example) handles this change and requests back that
> > the VF driver resets the VF device. The driver then completes the VF reset
> > operation on its side and waits for completion of the iavf-event thread
> > responsible for handling various VF device events.
> >
> > On the other hand, handling of the VF reset request in this iavf-event
> > thread results in notifying the application with a port reset request
> > (RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take hold
> > of the same netdev_dpdk mutex and blocks the iavf-event thread.
> >
> > As a result, the net/iavf driver (still running on OVS main thread) is
> > unable to complete as it is waiting for iavf-event to complete.
> >
> > To break from this situation, the OVS reset callback now won't take a
> > netdev_dpdk mutex. Instead, the port reset request is stored in a simple
> > RTE_MAX_ETHPORTS array associated to a seq object.
> > This is enough to let the VF driver complete this port initialisation.
> > The OVS main thread later handles the port reset request.
> >
> > More details in the DPDK upstream bz as this issue appeared following a
> > change in DPDK.
> > > > Link: https://bugs.dpdk.org/show_bug.cgi?id=1337 > > Signed-off-by: David Marchand > > --- > > Changes since v1: > > - converted to atomic accesses on netdev_dpdk_pending_reset[], > > > > > > --- > > lib/netdev-dpdk.c | 76 +-- > > 1 file changed, 61 insertions(+), 15 deletions(-) > > > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > > index fb26825ff8..6b15e4c03a 100644 > > --- a/lib/netdev-dpdk.c > > +++ b/lib/netdev-dpdk.c > > @@ -58,6 +58,7 @@ > > #include "openvswitch/match.h" > > #include "openvswitch/ofp-parse.h" > > #include "openvswitch/ofp-print.h" > > +#include "openvswitch/poll-loop.h" > > #include "openvswitch/shash.h" > > #include "openvswitch/vlog.h" > > #include "ovs-numa.h" > > @@ -2101,32 +2102,73 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev, > > return new_port_id; > > } > > > > +static struct seq *netdev_dpdk_reset_seq; > > +static uint64_t netdev_dpdk_last_reset_seq; > > +static bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS]; > > Should be an atomic_bool, I suppose. Yes, already fixed. > > > + > > +static void > > +netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED) > > +{ > > +uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq); > > + > > +if (netdev_dpdk_last_reset_seq == last_reset_seq) { > > +seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq); > > +} else { > > +poll_immediate_wake(); > > +} > > +} > > + > > +static void > > +netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED) > > +{ > > +uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq); > > + > > +if (reset_seq != netdev_dpdk_last_reset_seq) { > > +dpdk_port_t port_id; > > + > > +netdev_dpdk_last_reset_seq = reset_seq; > > + > > +for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) { > > +struct netdev_dpdk *dev; > > +bool pending_reset; > > + > > +atomic_read_relaxed(_dpdk_pending_reset[port_id], > > + _reset); > > Indentation. Indeed. 
> > > +if (!pending_reset) { > > +continue; > > +} > > +atomic_store_relaxed(_dpdk_pending_reset[port_id], > > false); > > + > > +ovs_mutex_lock(_mutex); > > +dev = netdev_dpdk_lookup_by_port_id(port_id); > > +if (dev) { > > +ovs_mutex_lock(>mutex); > > +dev->reset_needed = true; > > +netdev_request_reconfigure(>up); > > +VLOG_DBG_RL(, "%s: Device reset requested.", > > +netdev_get_name(>up)); > > +ovs_mutex_unloc
Re: [ovs-dev] [PATCH v2] netdev-dpdk: Trigger port reconfiguration in main thread for resets.
On Thu, Jan 18, 2024 at 3:41 PM David Marchand wrote:
> Changes since v1:
> - converted to atomic accesses on netdev_dpdk_pending_reset[],

I did not wait for the clang results during my tests... cooking v3.

-- 
David Marchand
[ovs-dev] [PATCH v2] netdev-dpdk: Trigger port reconfiguration in main thread for resets.
When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
mutex lock.
As part of this configure operation, the net/iavf driver (used with i40e
VF devices) triggers a queue count change. The PF entity (serviced by a
kernel PF driver for example) handles this change and requests back that
the VF driver resets the VF device. The driver then completes the VF reset
operation on its side and waits for completion of the iavf-event thread
responsible for handling various VF device events.

On the other hand, handling of the VF reset request in this iavf-event
thread results in notifying the application with a port reset request
(RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take hold
of the same netdev_dpdk mutex and blocks the iavf-event thread.

As a result, the net/iavf driver (still running on OVS main thread) is
unable to complete as it is waiting for iavf-event to complete.

To break from this situation, the OVS reset callback now won't take a
netdev_dpdk mutex. Instead, the port reset request is stored in a simple
RTE_MAX_ETHPORTS array associated to a seq object.
This is enough to let the VF driver complete this port initialisation.
The OVS main thread later handles the port reset request.

More details in the DPDK upstream bz as this issue appeared following a
change in DPDK.
Link: https://bugs.dpdk.org/show_bug.cgi?id=1337 Signed-off-by: David Marchand --- Changes since v1: - converted to atomic accesses on netdev_dpdk_pending_reset[], --- lib/netdev-dpdk.c | 76 +-- 1 file changed, 61 insertions(+), 15 deletions(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index fb26825ff8..6b15e4c03a 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -58,6 +58,7 @@ #include "openvswitch/match.h" #include "openvswitch/ofp-parse.h" #include "openvswitch/ofp-print.h" +#include "openvswitch/poll-loop.h" #include "openvswitch/shash.h" #include "openvswitch/vlog.h" #include "ovs-numa.h" @@ -2101,32 +2102,73 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev, return new_port_id; } +static struct seq *netdev_dpdk_reset_seq; +static uint64_t netdev_dpdk_last_reset_seq; +static bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS]; + +static void +netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED) +{ +uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq); + +if (netdev_dpdk_last_reset_seq == last_reset_seq) { +seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq); +} else { +poll_immediate_wake(); +} +} + +static void +netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED) +{ +uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq); + +if (reset_seq != netdev_dpdk_last_reset_seq) { +dpdk_port_t port_id; + +netdev_dpdk_last_reset_seq = reset_seq; + +for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) { +struct netdev_dpdk *dev; +bool pending_reset; + +atomic_read_relaxed(_dpdk_pending_reset[port_id], + _reset); +if (!pending_reset) { +continue; +} +atomic_store_relaxed(_dpdk_pending_reset[port_id], false); + +ovs_mutex_lock(_mutex); +dev = netdev_dpdk_lookup_by_port_id(port_id); +if (dev) { +ovs_mutex_lock(>mutex); +dev->reset_needed = true; +netdev_request_reconfigure(>up); +VLOG_DBG_RL(, "%s: Device reset requested.", +netdev_get_name(>up)); +ovs_mutex_unlock(>mutex); +} +ovs_mutex_unlock(_mutex); +} +} 
+} + static int dpdk_eth_event_callback(dpdk_port_t port_id, enum rte_eth_event_type type, void *param OVS_UNUSED, void *ret_param OVS_UNUSED) { -struct netdev_dpdk *dev; - switch ((int) type) { case RTE_ETH_EVENT_INTR_RESET: -ovs_mutex_lock(_mutex); -dev = netdev_dpdk_lookup_by_port_id(port_id); -if (dev) { -ovs_mutex_lock(>mutex); -dev->reset_needed = true; -netdev_request_reconfigure(>up); -VLOG_DBG_RL(, "%s: Device reset requested.", -netdev_get_name(>up)); -ovs_mutex_unlock(>mutex); -} -ovs_mutex_unlock(_mutex); +atomic_store_relaxed(_dpdk_pending_reset[port_id], true); +seq_change(netdev_dpdk_reset_seq); break; default: /* Ignore all other types. */ break; - } - return 0; +} +return 0; } static void @@ -5001,6 +5043,8 @@ netdev_dpdk_class_init(void) "[netdev]", 0, 1, netdev_dpdk_get_mempo
Re: [ovs-dev] [PATCH] netdev-dpdk: Trigger port reconfiguration in main thread for resets.
On Thu, Jan 18, 2024 at 2:23 PM Ilya Maximets wrote:
>
> On 1/18/24 14:16, David Marchand wrote:
> > When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
> > mutex lock.
> > As part of this configure operation, the net/iavf driver (used with i40e
> > VF devices) triggers a queue count change. The PF entity (serviced by a
> > kernel PF driver for example) handles this change and requests back that
> > the VF driver resets the VF device. The driver then completes the VF reset
> > operation on its side and waits for completion of the iavf-event thread
> > responsible for handling various VF device events.
> >
> > On the other hand, handling of the VF reset request in this iavf-event
> > thread results in notifying the application with a port reset request
> > (RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take hold
> > of the same netdev_dpdk mutex and blocks the iavf-event thread.
> >
> > As a result, the net/iavf driver (still running on OVS main thread) is
> > unable to complete as it is waiting for iavf-event to complete.
> >
> > To break from this situation, the OVS reset callback now won't take a
> > netdev_dpdk mutex. Instead, the port reset request is stored in a simple
> > RTE_MAX_ETHPORTS array associated to a seq object.
> > This is enough to let the VF driver complete this port initialisation.
> > The OVS main thread later handles the port reset request.
> > > > Link: https://bugs.dpdk.org/show_bug.cgi?id=1337 > > Signed-off-by: David Marchand > > --- > > lib/netdev-dpdk.c | 73 +-- > > 1 file changed, 58 insertions(+), 15 deletions(-) > > > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > > index fb26825ff8..528850971a 100644 > > --- a/lib/netdev-dpdk.c > > +++ b/lib/netdev-dpdk.c > > @@ -58,6 +58,7 @@ > > #include "openvswitch/match.h" > > #include "openvswitch/ofp-parse.h" > > #include "openvswitch/ofp-print.h" > > +#include "openvswitch/poll-loop.h" > > #include "openvswitch/shash.h" > > #include "openvswitch/vlog.h" > > #include "ovs-numa.h" > > @@ -2101,32 +2102,70 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev, > > return new_port_id; > > } > > > > +static struct seq *netdev_dpdk_reset_seq; > > +static uint64_t netdev_dpdk_last_reset_seq; > > +static bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS]; > > + > > +static void > > +netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED) > > +{ > > +uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq); > > + > > +if (netdev_dpdk_last_reset_seq == last_reset_seq) { > > +seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq); > > +} else { > > +poll_immediate_wake(); > > +} > > +} > > + > > +static void > > +netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED) > > +{ > > +uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq); > > + > > +if (reset_seq != netdev_dpdk_last_reset_seq) { > > +dpdk_port_t port_id; > > + > > +netdev_dpdk_last_reset_seq = reset_seq; > > + > > +for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) { > > +struct netdev_dpdk *dev; > > + > > +if (!netdev_dpdk_pending_reset[port_id]) { > > +continue; > > +} > > +netdev_dpdk_pending_reset[port_id] = false; > > + > > +ovs_mutex_lock(_mutex); > > +dev = netdev_dpdk_lookup_by_port_id(port_id); > > +if (dev) { > > +ovs_mutex_lock(>mutex); > > +dev->reset_needed = true; > > +netdev_request_reconfigure(>up); > > +VLOG_DBG_RL(, "%s: Device reset 
requested.",
> > +netdev_get_name(&dev->up));
> > +ovs_mutex_unlock(&dev->mutex);
> > +}
> > +ovs_mutex_unlock(&dpdk_mutex);
> > +}
> > +}
> > +}
> > +
> > static int
> > dpdk_eth_event_callback(dpdk_port_t port_id, enum rte_eth_event_type type,
> > void *param OVS_UNUSED, void *ret_param OVS_UNUSED)
>
> Can we rely on this callback to always be called from a main thread?
> Otherwise, we should use atomics for an array or a separate lock.

I assumed it was not necessary after reading seq.h.

 * Thread-safety
 * =============
 *
 * Fully thread safe. seq_change() synchronizes with seq_read() and
 * seq_wait() on the same variable in release-acquire fashion. That
 * is, all effects of the memory accesses performed by a thread prior
 * to seq_change() are visible to the threads returning from
 * seq_read() or seq_wait() observing that change.

-- 
David Marchand
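The handoff discussed in this exchange can be sketched with plain C11 atomics.
This is a minimal sketch under assumptions, not the OVS implementation: the
seq object is modelled by an atomic counter bumped with release ordering and
read with acquire ordering, MAX_PORTS stands in for RTE_MAX_ETHPORTS, and
post_reset() and collect_resets() are hypothetical names. The flags are kept
atomic (with relaxed ordering) because the consumer clears a slot that the
event thread may set again concurrently. The sketch uses an atomic exchange to
test and clear a flag in one step, whereas the patch reads and then stores.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PORTS 32    /* Hypothetical stand-in for RTE_MAX_ETHPORTS. */

static atomic_bool pending_reset[MAX_PORTS];
static atomic_uint_fast64_t reset_seq;  /* Stand-in for the seq object. */
static uint64_t last_seen_seq;          /* Only used by the main thread. */

/* Event-thread side: record the request without taking any mutex. */
static void
post_reset(unsigned int port)
{
    atomic_store_explicit(&pending_reset[port], true, memory_order_relaxed);
    /* Release: the store above is visible to a thread that observes
     * the new counter value with an acquire load. */
    atomic_fetch_add_explicit(&reset_seq, 1, memory_order_release);
}

/* Main-thread side: fills 'ports' with the ports that asked for a reset
 * and returns how many were found. */
static int
collect_resets(unsigned int ports[MAX_PORTS])
{
    uint64_t seq = atomic_load_explicit(&reset_seq, memory_order_acquire);
    int n = 0;

    if (seq == last_seen_seq) {
        return 0;   /* Nothing changed since the last call. */
    }
    last_seen_seq = seq;

    for (unsigned int port = 0; port < MAX_PORTS; port++) {
        /* Test and clear the flag in a single atomic step. */
        if (atomic_exchange_explicit(&pending_reset[port], false,
                                     memory_order_relaxed)) {
            ports[n++] = port;
        }
    }
    return n;
}
```

In this sketch the main thread would take the device mutexes only after
collect_resets() returns, so the event thread never blocks on them.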
[ovs-dev] [PATCH] netdev-dpdk: Trigger port reconfiguration in main thread for resets.
When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
mutex lock.
As part of this configure operation, the net/iavf driver (used with i40e
VF devices) triggers a queue count change. The PF entity (serviced by a
kernel PF driver for example) handles this change and requests back that
the VF driver resets the VF device. The driver then completes the VF reset
operation on its side and waits for completion of the iavf-event thread
responsible for handling various VF device events.

On the other hand, handling of the VF reset request in this iavf-event
thread results in notifying the application with a port reset request
(RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take hold
of the same netdev_dpdk mutex and blocks the iavf-event thread.

As a result, the net/iavf driver (still running on OVS main thread) is
unable to complete as it is waiting for iavf-event to complete.

To break from this situation, the OVS reset callback now won't take a
netdev_dpdk mutex. Instead, the port reset request is stored in a simple
RTE_MAX_ETHPORTS array associated to a seq object.
This is enough to let the VF driver complete this port initialisation.
The OVS main thread later handles the port reset request.
Link: https://bugs.dpdk.org/show_bug.cgi?id=1337 Signed-off-by: David Marchand --- lib/netdev-dpdk.c | 73 +-- 1 file changed, 58 insertions(+), 15 deletions(-) diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index fb26825ff8..528850971a 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -58,6 +58,7 @@ #include "openvswitch/match.h" #include "openvswitch/ofp-parse.h" #include "openvswitch/ofp-print.h" +#include "openvswitch/poll-loop.h" #include "openvswitch/shash.h" #include "openvswitch/vlog.h" #include "ovs-numa.h" @@ -2101,32 +2102,70 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev, return new_port_id; } +static struct seq *netdev_dpdk_reset_seq; +static uint64_t netdev_dpdk_last_reset_seq; +static bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS]; + +static void +netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED) +{ +uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq); + +if (netdev_dpdk_last_reset_seq == last_reset_seq) { +seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq); +} else { +poll_immediate_wake(); +} +} + +static void +netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED) +{ +uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq); + +if (reset_seq != netdev_dpdk_last_reset_seq) { +dpdk_port_t port_id; + +netdev_dpdk_last_reset_seq = reset_seq; + +for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) { +struct netdev_dpdk *dev; + +if (!netdev_dpdk_pending_reset[port_id]) { +continue; +} +netdev_dpdk_pending_reset[port_id] = false; + +ovs_mutex_lock(_mutex); +dev = netdev_dpdk_lookup_by_port_id(port_id); +if (dev) { +ovs_mutex_lock(>mutex); +dev->reset_needed = true; +netdev_request_reconfigure(>up); +VLOG_DBG_RL(, "%s: Device reset requested.", +netdev_get_name(>up)); +ovs_mutex_unlock(>mutex); +} +ovs_mutex_unlock(_mutex); +} +} +} + static int dpdk_eth_event_callback(dpdk_port_t port_id, enum rte_eth_event_type type, void *param OVS_UNUSED, void *ret_param OVS_UNUSED) { -struct 
netdev_dpdk *dev; - switch ((int) type) { case RTE_ETH_EVENT_INTR_RESET: -ovs_mutex_lock(_mutex); -dev = netdev_dpdk_lookup_by_port_id(port_id); -if (dev) { -ovs_mutex_lock(>mutex); -dev->reset_needed = true; -netdev_request_reconfigure(>up); -VLOG_DBG_RL(, "%s: Device reset requested.", -netdev_get_name(>up)); -ovs_mutex_unlock(>mutex); -} -ovs_mutex_unlock(_mutex); +netdev_dpdk_pending_reset[port_id] = true; +seq_change(netdev_dpdk_reset_seq); break; default: /* Ignore all other types. */ break; - } - return 0; +} +return 0; } static void @@ -5001,6 +5040,8 @@ netdev_dpdk_class_init(void) "[netdev]", 0, 1, netdev_dpdk_get_mempool_info, NULL); +netdev_dpdk_reset_seq = seq_create(); +netdev_dpdk_last_reset_seq = seq_read(netdev_dpdk_reset_seq); ret = rte_eth_dev_callback_register(RTE_ETH_ALL, RTE_ETH_EVENT_INTR_RESET, dpdk_eth
[ovs-dev] [PATCH] system-common-macros: Fix conntrack matching.
Seen in GHA recently. Unit tests are checking conntracks relating to a destination ip address but the FORMAT_CT macro is not strict enough and would match unrelated conntracks too. Example: 148. system-traffic.at:6432: testing conntrack - DNAT with additional SNAT ... [...] ./system-traffic.at:6460: ovs-appctl dpctl/dump-conntrack | grep "dst=10.1.1.1" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq [...] @@ -1,2 +1,7 @@ tcp,orig=(src=10.1.1.1,dst=172.1.1.2,sport=,... +tcp,...,reply=(src=13.107.42.16,dst=10.1.1.10,sport=,... +tcp,...,reply=(src=168.63.129.16,dst=10.1.1.10,sport=,... +tcp,...,reply=(src=20.242.161.191,dst=10.1.1.10,sport=,... +tcp,orig=(src=13.107.42.16,dst=10.1.1.10,sport=,... +tcp,orig=(src=20.242.161.191,dst=10.1.1.10,sport=,... Fixes: 07659514c3c1 ("Add support for connection tracking.") Signed-off-by: David Marchand --- tests/system-common-macros.at | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at index 01ebe364ee..07be29f673 100644 --- a/tests/system-common-macros.at +++ b/tests/system-common-macros.at @@ -256,7 +256,7 @@ m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed 's/csum:.*/csum: /']) # and limit the output to the rows containing 'ip-addr'. # m4_define([FORMAT_CT], -[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq]]) +[[grep "dst=$1\>" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq]]) # NETNS_DAEMONIZE([namespace], [command], [pidfile]) # -- 2.43.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] dpdk: Update to use v23.11.
Hello,

On Mon, Jan 15, 2024 at 12:58 PM Ilya Maximets wrote:
> OK. I think today we have no real choice but to go with the option 1.
> We'll need a NEWS entry for that in the patch. I'll make sure to include
> a variant of it in the release announce in February if nothing changes
> until then.
>
> But I think we should still pursue the option 2 in case the solution will
> be found before the final release in February.
>
> Though if there will be no conclusion on the long term problem until autumn,
> we should go with 3 and move 24.11 adoption to summer of 2025. And follow
> that strategy going forward, as the current approach is not sustainable.
>
> >
> > David, let us know if you agree ? If so, maybe you can send a new
> > version of the patch with the added documentation. I can help with docs
> > or discussing further.
>
> David, could you, please, add a note in the NEWS file and send a new version
> of the patch?

I just sent a v2.
I'll continue investigating the iavf issue.

-- 
David Marchand
[ovs-dev] [PATCH v2] dpdk: Update to use v23.11.
This commit adds support for DPDK v23.11. It updates the CI script and documentation and includes the following changes coming from the dpdk-latest branch: - sparse: Add some compiler intrinsics for DPDK build. https://patchwork.ozlabs.org/project/openvswitch/list/?series=371129=* - ci: Cache DPDK installed libraries only. - ci: Reduce optional libraries in DPDK. https://patchwork.ozlabs.org/project/openvswitch/list/?series=383367=* - system-dpdk: Ignore net/ice error log about QinQ offloading. https://patchwork.ozlabs.org/project/openvswitch/list/?series=385259=* There is a known issue with i40e VF devices where OVS main thread may block when adding such devices as dpif-netdev dpdk ports. Signed-off-by: David Marchand --- .ci/dpdk-build.sh| 28 +++- .ci/linux-build.sh | 9 .github/workflows/build-and-test.yml | 4 ++-- Documentation/faq/releases.rst | 2 +- Documentation/intro/install/dpdk.rst | 16 +++--- Documentation/topics/dpdk/phy.rst| 12 +- Documentation/topics/dpdk/vdev.rst | 2 +- Documentation/topics/dpdk/vhost-user.rst | 2 +- Documentation/topics/testing.rst | 2 +- Documentation/topics/userspace-tso.rst | 2 +- NEWS | 9 debian/control.in| 2 +- include/sparse/automake.mk | 1 + include/sparse/ia32intrin.h | 23 +++ rhel/openvswitch-fedora.spec.in | 2 +- tests/system-dpdk-macros.at | 1 + 16 files changed, 80 insertions(+), 37 deletions(-) create mode 100644 include/sparse/ia32intrin.h diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh index d4c178ee0d..23f3166a54 100755 --- a/.ci/dpdk-build.sh +++ b/.ci/dpdk-build.sh @@ -5,25 +5,27 @@ set -x function build_dpdk() { -local VERSION_FILE="dpdk-dir/cached-version" local DPDK_VER=$1 local DPDK_OPTS="" +local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir" +local VERSION_FILE="$DPDK_INSTALL_DIR/cached-version" -rm -rf dpdk-dir +rm -rf dpdk-src +rm -rf $DPDK_INSTALL_DIR if [ "${DPDK_VER##refs/*/}" != "${DPDK_VER}" ]; then -git clone --single-branch $DPDK_GIT dpdk-dir -b "${DPDK_VER##refs/*/}" -pushd dpdk-dir +git clone --single-branch 
$DPDK_GIT dpdk-src -b "${DPDK_VER##refs/*/}" +pushd dpdk-src git log -1 --oneline else wget https://fast.dpdk.org/rel/dpdk-$1.tar.xz tar xvf dpdk-$1.tar.xz > /dev/null DIR_NAME=$(tar -tf dpdk-$1.tar.xz | head -1 | cut -f1 -d"/") -mv ${DIR_NAME} dpdk-dir -pushd dpdk-dir +mv ${DIR_NAME} dpdk-src +pushd dpdk-src fi -# Switching to 'default' machine to make dpdk-dir cache usable on +# Switching to 'default' machine to make the dpdk cache usable on # different CPUs. We can't be sure that all CI machines are exactly same. DPDK_OPTS="$DPDK_OPTS -Dmachine=default" @@ -40,16 +42,22 @@ function build_dpdk() DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd" enable_drivers="net/null,net/af_xdp,net/tap,net/virtio,net/pcap" DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers" +# OVS depends on the vhost library (and its dependencies). +# net/tap depends on the gso library. +DPDK_OPTS="$DPDK_OPTS -Denable_libs=cryptodev,dmadev,gso,vhost" # Install DPDK using prefix. -DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build" +DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR" meson $DPDK_OPTS build ninja -C build ninja -C build install - -echo "Installed DPDK in $(pwd)" popd + +# Remove examples sources. +rm -rf $DPDK_INSTALL_DIR/share/dpdk/examples + +echo "Installed DPDK in $DPDK_INSTALL_DIR" echo "${DPDK_VER}" > ${VERSION_FILE} } diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index 90581c10b7..cf1462a0c4 100755 --- a/.ci/linux-build.sh +++ b/.ci/linux-build.sh @@ -10,8 +10,9 @@ JOBS=${JOBS:-"-j4"} function install_dpdk() { -local VERSION_FILE="dpdk-dir/cached-version" -local DPDK_LIB=$(pwd)/dpdk-dir/build/lib/x86_64-linux-gnu +local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir" +local VERSION_FILE="${DPDK_INSTALL_DIR}/cached-version" +local DPDK_LIB=${DPDK_INSTALL_DIR}/lib/x86_64-linux-gnu if [ "$DPDK_SHARED" ]; then EXTRA_OPTS="$EXTRA_OPTS --with-dpdk=shared" @@ -27,13 +28,13 @@ function install_dpdk() export PATH=$(pwd)/dpdk-dir/build/bin:$PATH if [ ! 
-f "${VERSION_FILE}" ]; then -echo "Could not find DPDK in $(pwd)/dpdk-dir" +echo "Could not find DPDK in $DPDK_INSTA
[ovs-dev] [PATCH v6] system-dpdk: Test with mlx5 devices.
The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded: on systems with only mlx5, this test is always skipped. Besides, the test tries to grab the first device listed by dpdk-devbind.py, regardless of the PCI device status regarding kmod binding. Remove dependency on this DPDK script and use a minimal script that reads PCI sysfs. This script is not perfect, as one can imagine PCI devices bound to vfio-pci for virtual machines. Plus, this script only tries to take over vfio-pci devices. mlx5 devices can't be taken over blindly as it could mean losing connectivity to the machine if the netdev was in use for this system. For those two reasons, add a new environment variable DPDK_PCI_ADDR for testers to select the PCI device of their liking. For consistency and grep, the temporary file PCI_ADDR is renamed to DPDK_PCI_ADDR. Reviewed-by: Maxime Coquelin Acked-by: Eelco Chaudron Signed-off-by: David Marchand --- Changes since v5: - rebased, - moved the script to the python scripts list in automake.mk, - bumped copyright date, Changes since v4: - separated from the original series, - rebased, - dropped mlx5 devices from the discovery script, - documented DPDK_PCI_ADDR env variable, Changes since v3: - fixed nit from Maxime, Changes since v2: - sorted logs alphabetically, --- Documentation/topics/testing.rst | 11 ++--- tests/automake.mk| 1 + tests/system-dpdk-find-device.py | 39 tests/system-dpdk-macros.at | 10 ++-- tests/system-dpdk.at | 14 ++-- 5 files changed, 57 insertions(+), 18 deletions(-) create mode 100755 tests/system-dpdk-find-device.py diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst index 5f6940b84d..fb9b3e77b1 100644 --- a/Documentation/topics/testing.rst +++ b/Documentation/topics/testing.rst @@ -343,15 +343,20 @@ To see a list of all the available tests, run:: These tests support a `DPDK supported NIC`_. The tests operate on a wider set of environments, for instance, when a virtual port is used. 
-They do require proper DPDK variables (``DPDK_DIR`` and ``DPDK_BUILD``). Moreover you need to have root privileges to load the required modules and to bind -the NIC to the DPDK-compatible driver. +a PCI device to the DPDK-compatible driver. .. _DPDK supported NIC: https://core.dpdk.org/supported/#nics +The phy test will skip if no suitable PCI device is found. +It is possible to select which PCI device is used for this test by setting the +DPDK_PCI_ADDR environment variable, which is especially useful when testing +with a mlx5 device:: + +# DPDK_PCI_ADDR=:82:00.0 make check-dpdk + All tests are skipped if no hugepages are configured. User must look into the DPDK manual to figure out how to `Configure hugepages`_. -The phy test will skip if no compatible physical device is available. .. _Configure hugepages: https://doc.dpdk.org/guides-22.11/linux_gsg/sys_reqs.html diff --git a/tests/automake.mk b/tests/automake.mk index 2ae0aeecaf..10c9fbb01f 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -520,6 +520,7 @@ CHECK_PYFILES = \ tests/flowgen.py \ tests/genpkts.py \ tests/ovsdb-monitor-sort.py \ + tests/system-dpdk-find-device.py \ tests/test-daemon.py \ tests/test-dpparse.py \ tests/test-json.py \ diff --git a/tests/system-dpdk-find-device.py b/tests/system-dpdk-find-device.py new file mode 100755 index 00..ced74e7f31 --- /dev/null +++ b/tests/system-dpdk-find-device.py @@ -0,0 +1,39 @@ +#!/usr/bin/env python3 +# Copyright (c) 2024 Red Hat, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + + +from pathlib import Path +import os +import sys + +# The tester might want to select a PCI device, if so, trust it. +if 'DPDK_PCI_ADDR' in os.environ: +print(os.environ['DPDK_PCI_ADDR']) +sys.exit(0) + +for device in sorted(Path('/sys/bus/pci/devices').iterdir()): +class_path = device / 'class' +# Only consider Network class devices +if class_path.read_text().strip() != '0x02': +continue +kmod_path = device / 'driver' / 'module' +kmod_name = kmod_path.resolve().name +# Only care about devices bound to vfio_pci or igb_uio. +if kmod_name not in ['vfio_pci', 'igb_uio']: +continue +print(device.resolve().name) +sys.exit(0) + +sys.exit(1) diff --git a/tests/system-dpdk-macros.at
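The device selection described in this patch (an explicit DPDK_PCI_ADDR wins, otherwise take the first network-class PCI device bound to vfio_pci or igb_uio) can be sketched as a small function that is testable against a fake sysfs tree. This is only an illustration, not the script from the patch: the function name find_dpdk_device and its sysfs_root/environ parameters are hypothetical, and the class check assumes the sysfs "class" attribute reads like "0x020000" (base class 0x02 for network controllers), hence the prefix comparison.

```python
import os
from pathlib import Path


def find_dpdk_device(sysfs_root="/sys/bus/pci/devices", environ=None):
    """Return a PCI address usable for the phy test, or None.

    Hypothetical helper: the name and parameters are not part of the patch.
    """
    environ = os.environ if environ is None else environ

    # The tester might want to select a PCI device; if so, trust it.
    if "DPDK_PCI_ADDR" in environ:
        return environ["DPDK_PCI_ADDR"]

    root = Path(sysfs_root)
    if not root.is_dir():
        return None

    for device in sorted(root.iterdir()):
        class_path = device / "class"
        if not class_path.is_file():
            continue
        # Assumption: the attribute looks like "0x020000"; only the base
        # class byte (0x02, network controller) matters here.
        if not class_path.read_text().strip().startswith("0x02"):
            continue
        module_link = device / "driver" / "module"
        if not module_link.exists():
            # No kernel driver bound to this device.
            continue
        # Only care about devices bound to vfio_pci or igb_uio.
        if module_link.resolve().name not in ("vfio_pci", "igb_uio"):
            continue
        return device.name

    return None
```

Passing the sysfs root and the environment as arguments keeps the selection logic free of global state, so it can be exercised without root privileges or real hardware.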
[ovs-dev] [PATCH v5 2/2] tests: Move MFEX tests to dpif-netdev.
The MFEX code and tests do not depend on DPDK anymore. We can move the unit tests to dpif-netdev. Reviewed-by: Maxime Coquelin Acked-by: Eelco Chaudron Acked-by: Kumar Amber Signed-off-by: David Marchand --- Changes since v4: - rebased, - added dummy numa configuration, Changes since v3: - removed documentation update, --- tests/dpif-netdev.at | 165 tests/system-dpdk.at | 197 --- 2 files changed, 165 insertions(+), 197 deletions(-) diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at index d0359b5eab..c9474af0ad 100644 --- a/tests/dpif-netdev.at +++ b/tests/dpif-netdev.at @@ -852,3 +852,168 @@ OVS_VSWITCHD_STOP(["dnl /.*failed to put.*$/d /.*failed to flow_del.*$/d"]) AT_CLEANUP + +AT_SETUP([dpif-netdev - MFEX Autovalidator]) +AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], []) +AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 2000 > packets]) +OVS_VSWITCHD_START( + [add-port br0 p1 \ + -- set Interface p1 type=dummy-pmd], [], [], [--dummy-numa="0,0,0,0,1,1,1,1"]) + +AT_SKIP_IF([! ovs-appctl dpif-netdev/miniflow-parser-get | sed 1,4d | grep "True"], [], [dnl +]) + +AT_CHECK([ovs-appctl dpif-netdev/dpif-impl-set dpif_avx512], [0], [dnl +DPIF implementation set to dpif_avx512. +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set autovalidator], [0], [dnl +Miniflow extract implementation set to autovalidator. +]) + +cat packets | while read line; do + AT_CHECK([ovs-appctl netdev-dummy/receive p1 $line], [0], [ignore]) +done + +OVS_WAIT_UNTIL([test `ovs-vsctl get interface p1 statistics | grep -oP 'rx_packets=\s*\K\d+'` -ge 16000]) + +OVS_VSWITCHD_STOP +AT_CLEANUP + +AT_SETUP([dpif-netdev - MFEX Autovalidator Fuzzy]) +AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], []) +AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 2000 fuzzy > packets]) +OVS_VSWITCHD_START( + [add-port br0 p1 \ + -- set Interface p1 type=dummy-pmd], [], [], [--dummy-numa="0,0,0,0,1,1,1,1"]) + +AT_SKIP_IF([! 
ovs-appctl dpif-netdev/miniflow-parser-get | sed 1,4d | grep "True"], [], [dnl +]) + +AT_CHECK([ovs-appctl dpif-netdev/dpif-impl-set dpif_avx512], [0], [dnl +DPIF implementation set to dpif_avx512. +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set autovalidator], [0], [dnl +Miniflow extract implementation set to autovalidator. +]) + +cat packets | while read line; do + AT_CHECK([ovs-appctl netdev-dummy/receive p1 $line], [0], [ignore]) +done + +OVS_WAIT_UNTIL([test `ovs-vsctl get interface p1 statistics | grep -oP 'rx_packets=\s*\K\d+'` -ge 16000]) + +OVS_VSWITCHD_STOP(["dnl +/upcall: datapath reached the dynamic limit of .* flows./d"]) +AT_CLEANUP + +AT_SETUP([dpif-netdev - MFEX Configuration]) +OVS_VSWITCHD_START( + [set Open_vSwitch . other_config:pmd-cpu-mask=0x1 \ + -- add-port br0 p1 \ + -- set Interface p1 type=dummy-pmd], [], [], [--dummy-numa="0,0,0,0,1,1,1,1"]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set scalar 1], [2], +[], [dnl +Error: unknown argument 1. +ovs-appctl: ovs-vswitchd: server returned an error +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 6 study 300 xyz], [2], +[], [dnl +Error: invalid study_pkt_cnt value: xyz. +ovs-appctl: ovs-vswitchd: server returned an error +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set scalar abcd], [2], +[], [dnl +Error: unknown argument abcd. +ovs-appctl: ovs-vswitchd: server returned an error +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 0 scalar abcd], [2], +[], [dnl +Error: unknown argument abcd. +ovs-appctl: ovs-vswitchd: server returned an error +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd], [2], +[], [dnl +Error: -pmd option requires a thread id argument. +ovs-appctl: ovs-vswitchd: server returned an error +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set tudy abcd], [2], +[], [dnl +Error: unknown argument abcd. 
+ovs-appctl: ovs-vswitchd: server returned an error +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 7 study abcd], [2], +[], [dnl +Error: invalid study_pkt_cnt value: abcd. +ovs-appctl: ovs-vswitchd: server returned an error +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 0 study], [0], [dnl +Miniflow extract implementation set to study, on pmd thread 0, studying 128 packets. +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 0 study 512], [0], [dnl +Miniflow extract implementation set to study, on pmd thread 0, studying 512 packets. +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set study 512], [0], [dnl +Miniflow extract implementation set to study, studying 512 packets. +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set study], [0], [dnl +Miniflow extract implementation set to study, studying 128 packets. +]) + +AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 0 autovalidator], [0], [
[ovs-dev] [PATCH v5 1/2] system-dpdk: Use dummy-pmd port for packet injection.
net_pcap is not always available in DPDK (like, in a dev environment when
you forgot to install the libpcap-devel).
On the other hand, OVS already has its own way to inject packets into a
bridge. Let's make use of it.

The generating script outputs a bulk of 8 packets per line (to save some
cpu spent calling ovs-appctl).

Suggested-by: Ilya Maximets
Reviewed-by: Maxime Coquelin
Acked-by: Eelco Chaudron
Signed-off-by: David Marchand
---
Changes since v4:
- rebased,
- updated genpkts.py so it outputs 8 packets per line and has
  netdev-dummy/receive called for 8 packets at a time (this little
  optimisation seems enough to avoid hitting OVS_CTL_TIMEOUT),
- stopped calling genpkts.py asynchronously,

Changes since v3:
- dropped documentation update following rebase,
- fixed regression in fuzzy packets generation,

Changes since v2:
- updated documentation,
- cleaned tests/automake.mk,
- fixed shebang in python script,
- added missing check for scapy availability,

Changes since v1:
- renamed generator script,
- decreased packet count for fuzzy test,
- simplified wait expression for packet count,

---
 tests/automake.mk | 6 +---
 tests/{mfex_fuzzy.py => genpkts.py} | 56 ++---
 tests/system-dpdk.at | 24 -
 3 files changed, 43 insertions(+), 43 deletions(-)
 rename tests/{mfex_fuzzy.py => genpkts.py} (66%)

diff --git a/tests/automake.mk b/tests/automake.mk index f8a925012d..2ae0aeecaf 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -146,10 +146,6 @@ $(srcdir)/tests/fuzz-regression-list.at: tests/automake.mk echo "TEST_FUZZ_REGRESSION([$$basename])"; \ done > $@.tmp && mv $@.tmp $@ -EXTRA_DIST += $(MFEX_AUTOVALIDATOR_TESTS) -MFEX_AUTOVALIDATOR_TESTS = \ - tests/mfex_fuzzy.py - OVSDB_CLUSTER_TESTSUITE_AT = \ tests/ovsdb-cluster-testsuite.at \ tests/ovsdb-execution.at \ @@ -522,7 +518,7 @@ tests_test_type_props_SOURCES = tests/test-type-props.c CHECK_PYFILES = \ tests/appctl.py \ tests/flowgen.py \ - tests/mfex_fuzzy.py \ + tests/genpkts.py \ tests/ovsdb-monitor-sort.py \
tests/test-daemon.py \ tests/test-dpparse.py \ diff --git a/tests/mfex_fuzzy.py b/tests/genpkts.py similarity index 66% rename from tests/mfex_fuzzy.py rename to tests/genpkts.py index 50b9870641..3354e116d0 100755 --- a/tests/mfex_fuzzy.py +++ b/tests/genpkts.py @@ -17,51 +17,44 @@ except ModuleNotFoundError: from scapy.all import RandMAC, RandIP, PcapWriter, RandIP6, RandShort, fuzz from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP, random -# Path for the pcap file location. -path = str(sys.argv[1]) # The number of packets generated will be size * 8. -size = int(sys.argv[2]) +size = int(sys.argv[1]) # Traffic option is used to choose between fuzzy or simple packet type. -if len(sys.argv) > 3: -traffic_opt = str(sys.argv[3]) +if len(sys.argv) > 2: +traffic_opt = str(sys.argv[2]) else: traffic_opt = "" -pktdump = PcapWriter(path, append=False, sync=True) - -pkt = [] - for i in range(0, size): +pkt = [] + if traffic_opt == "fuzzy": eth = Ether(src=RandMAC(), dst=RandMAC()) vlan = Dot1Q() -udp = UDP(dport=RandShort(), sport=RandShort()) ipv4 = IP(src=RandIP(), dst=RandIP(), len=random.randint(0, 100)) ipv6 = IPv6(src=RandIP6(), dst=RandIP6(), plen=random.randint(0, 100)) +udp = UDP(dport=RandShort(), sport=RandShort()) tcp = TCP(dport=RandShort(), sport=RandShort(), flags='S', dataofs=random.randint(0, 15)) # IPv4 packets with fuzzing -pkt.append(fuzz(eth / ipv4 / udp)) -pkt.append(fuzz(eth / ipv4 / tcp)) -pkt.append(fuzz(eth / vlan / ipv4 / udp)) -pkt.append(fuzz(eth / vlan / ipv4 / tcp)) +pkt.append(fuzz(eth / ipv4 / udp).build().hex()) +pkt.append(fuzz(eth / ipv4 / tcp).build().hex()) +pkt.append(fuzz(eth / vlan / ipv4 / udp).build().hex()) +pkt.append(fuzz(eth / vlan / ipv4 / tcp).build().hex()) # IPv6 packets with fuzzing -pkt.append(fuzz(eth / ipv6 / udp)) -pkt.append(fuzz(eth / ipv6 / tcp)) -pkt.append(fuzz(eth / vlan / ipv6 / udp)) -pkt.append(fuzz(eth / vlan / ipv6 / tcp)) +pkt.append(fuzz(eth / ipv6 / udp).build().hex()) +pkt.append(fuzz(eth / ipv6 / 
tcp).build().hex()) +pkt.append(fuzz(eth / vlan / ipv6 / udp).build().hex()) +pkt.append(fuzz(eth / vlan / ipv6 / tcp).build().hex()) else: mac_addr_src = "52:54:00:FF:FF:{:02X}".format(i % 0xff) mac_addr_dst = "80:FF:FF:FF:FF:{:02X}".format(i % 0xff) -src_port = 200 + (i % 20) -dst_port = 1000 + (i % 20) eth = Ether(src=mac_addr_src, dst=mac_addr_dst) vlan = Dot1Q(vlan=(i % 10)) -udp = UDP(dport=src_port, s
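The "8 packets per line" output described in the commit message can be illustrated without scapy. The sketch below only shows the batching step: it takes already-built packets as bytes and emits space-separated hex strings, a fixed number per line, so that one ovs-appctl netdev-dummy/receive call can be fed a whole line. The function name hex_lines and its per_line parameter are hypothetical and not taken from genpkts.py.

```python
def hex_lines(packets, per_line=8):
    """Yield lines of up to per_line hex-encoded packets, space separated.

    Hypothetical helper: `packets` is any iterable of bytes-like objects.
    """
    batch = []
    for raw in packets:
        batch.append(bytes(raw).hex())
        if len(batch) == per_line:
            yield " ".join(batch)
            batch = []
    # Flush a final, possibly shorter, batch.
    if batch:
        yield " ".join(batch)


if __name__ == "__main__":
    # Ten dummy 4-byte "packets": one full line of 8, then a line of 2.
    for line in hex_lines(bytes([i]) * 4 for i in range(10)):
        print(line)
```

Batching this way trades a slightly longer command line for eight times fewer ovs-appctl invocations, which is the saving the commit message refers to.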
Re: [ovs-dev] [PATCH v3 09/11] ci: Fix dpdk build cache key generation.
On Tue, Dec 5, 2023 at 4:00 PM Eelco Chaudron wrote:
>
> When new drivers are introduced, the cache key is not accurately computed.
>
> Before the commit 1a1b3106d90e ("ci: Separate DPDK from OVS build."), the
> DPDK build process was integrated in .ci/linux-{setup,build}.sh scripts,
> where specific lines were employed to generate the key. Since it is now
> separated in .ci/dpdk-{setup,build}.sh, this patch computes the key based
> on the content of those dedicated scripts.
>
> Fixes: 4e90baca89f0 ("system-dpdk: Run traffic tests.")
> Signed-off-by: Eelco Chaudron

Thanks for the fix.

Reviewed-by: David Marchand

-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
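The idea in the quoted commit message, deriving the cache key from the content of the dedicated DPDK scripts so that any change to them (a new driver, for instance) invalidates the cache, can be sketched as below. This is not how the CI workflow computes its key; the function name cache_key, the "dpdk" prefix and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def cache_key(paths, prefix="dpdk"):
    """Return a key that changes whenever any of the given files changes.

    Hypothetical helper: files are hashed in sorted order so the result
    does not depend on the order in which the paths are passed.
    """
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        # Include the file name so that renaming a script changes the key.
        digest.update(path.name.encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return f"{prefix}-{digest.hexdigest()}"
```

Hashing whole files is coarser than picking specific lines: a comment-only change also invalidates the cache, but a key that is occasionally too fresh is harmless, whereas a stale one hides build changes.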
[ovs-dev] [PATCH] dpdk: Update to use v23.11.
This commit adds support for DPDK v23.11. It updates the CI script and documentation and includes the following changes coming from the dpdk-latest branch: - sparse: Add some compiler intrinsics for DPDK build. https://patchwork.ozlabs.org/project/openvswitch/list/?series=371129=* - ci: Cache DPDK installed libraries only. - ci: Reduce optional libraries in DPDK. https://patchwork.ozlabs.org/project/openvswitch/list/?series=383367=* - system-dpdk: Ignore net/ice error log about QinQ offloading. https://patchwork.ozlabs.org/project/openvswitch/list/?series=385259=* Signed-off-by: David Marchand --- .ci/dpdk-build.sh| 28 +++- .ci/linux-build.sh | 9 .github/workflows/build-and-test.yml | 4 ++-- Documentation/faq/releases.rst | 2 +- Documentation/intro/install/dpdk.rst | 16 +++--- Documentation/topics/dpdk/phy.rst| 12 +- Documentation/topics/dpdk/vdev.rst | 2 +- Documentation/topics/dpdk/vhost-user.rst | 2 +- Documentation/topics/testing.rst | 2 +- Documentation/topics/userspace-tso.rst | 2 +- NEWS | 2 ++ debian/control.in| 2 +- include/sparse/automake.mk | 1 + include/sparse/ia32intrin.h | 23 +++ rhel/openvswitch-fedora.spec.in | 2 +- tests/system-dpdk-macros.at | 1 + 16 files changed, 73 insertions(+), 37 deletions(-) create mode 100644 include/sparse/ia32intrin.h diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh index aa83e44643..698b9e1b14 100755 --- a/.ci/dpdk-build.sh +++ b/.ci/dpdk-build.sh @@ -5,25 +5,27 @@ set -x function build_dpdk() { -local VERSION_FILE="dpdk-dir/cached-version" local DPDK_VER=$1 local DPDK_OPTS="" +local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir" +local VERSION_FILE="$DPDK_INSTALL_DIR/cached-version" -rm -rf dpdk-dir +rm -rf dpdk-src +rm -rf $DPDK_INSTALL_DIR if [ "${DPDK_VER##refs/*/}" != "${DPDK_VER}" ]; then -git clone --single-branch $DPDK_GIT dpdk-dir -b "${DPDK_VER##refs/*/}" -pushd dpdk-dir +git clone --single-branch $DPDK_GIT dpdk-src -b "${DPDK_VER##refs/*/}" +pushd dpdk-src git log -1 --oneline else wget 
https://fast.dpdk.org/rel/dpdk-$1.tar.xz tar xvf dpdk-$1.tar.xz > /dev/null DIR_NAME=$(tar -tf dpdk-$1.tar.xz | head -1 | cut -f1 -d"/") -mv ${DIR_NAME} dpdk-dir -pushd dpdk-dir +mv ${DIR_NAME} dpdk-src +pushd dpdk-src fi -# Switching to 'default' machine to make dpdk-dir cache usable on +# Switching to 'default' machine to make the dpdk cache usable on # different CPUs. We can't be sure that all CI machines are exactly same. DPDK_OPTS="$DPDK_OPTS -Dmachine=default" @@ -40,16 +42,22 @@ function build_dpdk() DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd" enable_drivers="net/null,net/af_xdp,net/tap,net/virtio" DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers" +# OVS depends on the vhost library (and its dependencies). +# net/tap depends on the gso library. +DPDK_OPTS="$DPDK_OPTS -Denable_libs=cryptodev,dmadev,gso,vhost" # Install DPDK using prefix. -DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build" +DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR" meson $DPDK_OPTS build ninja -C build ninja -C build install - -echo "Installed DPDK in $(pwd)" popd + +# Remove examples sources. +rm -rf $DPDK_INSTALL_DIR/share/dpdk/examples + +echo "Installed DPDK in $DPDK_INSTALL_DIR" echo "${DPDK_VER}" > ${VERSION_FILE} } diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index aa2ecc5050..2536d57d0e 100755 --- a/.ci/linux-build.sh +++ b/.ci/linux-build.sh @@ -9,8 +9,9 @@ EXTRA_OPTS="--enable-Werror" function install_dpdk() { -local VERSION_FILE="dpdk-dir/cached-version" -local DPDK_LIB=$(pwd)/dpdk-dir/build/lib/x86_64-linux-gnu +local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir" +local VERSION_FILE="${DPDK_INSTALL_DIR}/cached-version" +local DPDK_LIB=${DPDK_INSTALL_DIR}/lib/x86_64-linux-gnu if [ "$DPDK_SHARED" ]; then EXTRA_OPTS="$EXTRA_OPTS --with-dpdk=shared" @@ -26,13 +27,13 @@ function install_dpdk() export PATH=$(pwd)/dpdk-dir/build/bin:$PATH if [ ! 
-f "${VERSION_FILE}" ]; then -echo "Could not find DPDK in $(pwd)/dpdk-dir" +echo "Could not find DPDK in $DPDK_INSTALL_DIR" return 1 fi # Update the library paths. sudo ldconfig -echo "Found cached DPDK $(cat ${
[ovs-dev] [dpdk-latest] system-dpdk: Ignore net/ice error log about QinQ offloading.
The net/ice DPDK driver complains with an ERROR level log message if the
hw firmware only supports SVM (Single Vlan Mode).
DVM (Dual Vlan mode) seems required when using QinQ offloading.

OVS does not care about this offloading feature and configures nothing
on that topic.
While seeing this error log, some manual tests show that
untagged/tagged/"double" tagged packets (with 0x8100 ethertype) are
still received/transmitted fine.

Ignore this log waiting for a fix on the DPDK side.

Link: https://bugs.dpdk.org/show_bug.cgi?id=1331
Signed-off-by: David Marchand
---
 tests/system-dpdk-macros.at | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index dcdfa55741..c011487541 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -86,6 +86,7 @@ $1";/does not exist. The Open vSwitch kernel module is probably not loaded./d
 /does not support MTU configuration,/d
 /EAL: No \(available\|free\) .*hugepages reported/d
 /Failed to enable flow control/d
+/ice_vsi_config_outer_vlan_stripping(): Single VLAN mode (SVM) does not support qinq/d
 /Rx checksum offload is not supported on/d
 /TELEMETRY: No legacy callbacks, legacy socket not created/d"])
 ])
-- 
2.42.0
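The effect of adding a pattern to this ignore list can be shown with a small filter: log lines matching a known-harmless pattern are dropped before the remaining error lines are counted as failures. This is a sketch of the idea only, not the testsuite's implementation; the function name unexpected_errors is hypothetical, and it assumes that only lines carrying the "|ERR|" level marker are of interest. Note that parentheses must be escaped in Python regular expressions, unlike in the sed expression of the patch.

```python
import re

# Known-harmless messages; the first one mirrors the entry added by the patch.
IGNORED = [
    r"ice_vsi_config_outer_vlan_stripping\(\): "
    r"Single VLAN mode \(SVM\) does not support qinq",
    r"Failed to enable flow control",
]


def unexpected_errors(log_lines, ignored=IGNORED):
    """Return the ERR-level lines that match none of the ignored patterns.

    Hypothetical helper, assuming the OVS log format
    "timestamp|sequence|module|LEVEL|message".
    """
    patterns = [re.compile(p) for p in ignored]
    return [
        line
        for line in log_lines
        if "|ERR|" in line and not any(p.search(line) for p in patterns)
    ]
```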
Re: [ovs-dev] [PATCH v5] system-dpdk: Test with mlx5 devices.
On Tue, Nov 28, 2023 at 9:40 AM David Marchand wrote:
>
> On Wed, Nov 22, 2023 at 5:34 PM David Marchand wrote:
> >
> > The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded:
> > on systems with only mlx5, this test is always skipped.
> >
> > Besides, the test tries to grab the first device listed by dpdk-devbind.py,
> > regardless of the PCI device status regarding kmod binding.
> >
> > Remove dependency on this DPDK script and use a minimal script that
> > reads PCI sysfs.
> >
> > This script is not perfect, as one can imagine PCI devices bound to
> > vfio-pci for virtual machines.
> > Plus, this script only tries to take over vfio-pci devices. mlx5 devices
> > can't be taken over blindly as it could mean losing connectivity to the
> > machine if the netdev was in use for this system.
> >
> > For those two reasons, add a new environment variable DPDK_PCI_ADDR for
> > testers to select the PCI device of their liking.
> > For consistency and grep, the temporary file PCI_ADDR is renamed
> > to DPDK_PCI_ADDR.
> >
> > Reviewed-by: Maxime Coquelin
> > Acked-by: Eelco Chaudron
> > Signed-off-by: David Marchand
>
> This patch can't be merged as is.
> I am preparing some fixes for the system-dpdk MTU tests that got
> merged since my v4.

Coming back to this.
I sent a fix:
https://patchwork.ozlabs.org/project/openvswitch/patch/20231201142931.1782046-1-david.march...@redhat.com/

The current patch, which makes it possible to select the PCI device to
test with, is largely orthogonal to this fix.
Yet, people will likely want to test with both applied.

-- 
David Marchand
[ovs-dev] [PATCH v2] system-dpdk: Wait for MTU changes to be applied.
Because a DPDK backed netdev configuration is done in an asynchronous way, and a MTU change requires a reconfiguration, directly checking ovs-vswitchd logs or querying ovsdb for the interface current MTU value is racy. Add synchronisation points on the interface MTU value in ovsdb as it ensures that a netdev (re)configuration did happen. With those synchronisation points in place, error messages may be checked in logs afterward. Fixes: bf47829116a8 ("tests: Add OVS-DPDK MTU unit tests.") Signed-off-by: David Marchand --- Changes since v1: - dropped test output, --- tests/system-dpdk.at | 42 -- 1 file changed, 12 insertions(+), 30 deletions(-) diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index 17742d20a0..af092a2000 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -511,15 +511,13 @@ dnl Add userspace bridge and attach it to OVS with default MTU value AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr]) AT_CHECK([ovs-vsctl show], [], [stdout]) -sleep 2 dnl Check default MTU value in the datapath -AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl -1500 -]) +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [1500]) dnl Increase MTU value and check in the datapath AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9000]) +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9000]) dnl Fail if MTU is not supported AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitchd.log], [], [stdout]) @@ -527,10 +525,6 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitch dnl Fail if error is encountered during MTU setup AT_FAIL_IF([grep "Interface phy0 MTU (9000) setup error" ovs-vswitchd.log], [], [stdout]) -AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl -9000 -]) - dnl Clean up AT_CHECK([ovs-vsctl del-port br10 phy0], [], 
[stdout], [stderr]) @@ -555,7 +549,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr]) AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9000]) AT_CHECK([ovs-vsctl show], [], [stdout]) -sleep 2 + +dnl Check MTU value in the datapath +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9000]) dnl Fail if MTU is not supported AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitchd.log], [], [stdout]) @@ -563,17 +559,9 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitch dnl Fail if error is encountered during MTU setup AT_FAIL_IF([grep "Interface phy0 MTU (9000) setup error" ovs-vswitchd.log], [], [stdout]) -dnl Check MTU value in the datapath -AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl -9000 -]) - dnl Decrease MTU value and check in the datapath AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=2000]) - -AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl -2000 -]) +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [2000]) dnl Clean up @@ -680,7 +668,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr]) AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9702]) AT_CHECK([ovs-vsctl show], [], [stdout]) -sleep 2 + +dnl Check MTU value in the datapath +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9702]) dnl Fail if MTU is not supported AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitchd.log], [], [stdout]) @@ -688,11 +678,6 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitch dnl Fail if error is encountered during MTU setup AT_FAIL_IF([grep "Interface phy0 MTU (9702) setup error" 
ovs-vswitchd.log], [], [stdout]) -dnl Check MTU value in the datapath -AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl -9702 -]) - dnl Set MTU value above upper bound and check for error AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9711]) AT_CHECK([grep "phy0: unsupported MTU 9711" ovs-vswitchd.log], [], [stdout]) @@ -721,7 +706,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr]) AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=68]) AT_CHECK([ovs-vsctl show], [], [stdout]) -sleep 2 + +dnl Check MTU value in the datapath +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [68]) dnl Fail i
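The change from "sleep 2 then check once" to a synchronisation point follows a common pattern: poll the observed value until it equals the expected one or a deadline passes. The sketch below illustrates that pattern only; it does not reproduce the OVS_WAIT_UNTIL_EQUAL macro, and the function name wait_until_equal, its defaults and its injectable sleep/clock parameters are assumptions.

```python
import time


def wait_until_equal(read_value, expected, timeout=30.0, interval=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll read_value() until it returns expected; False on timeout.

    Hypothetical helper: read_value would typically wrap something like
    "ovs-vsctl get Interface phy0 mtu".
    """
    deadline = clock() + timeout
    while True:
        if read_value() == expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulate an MTU that is still 9000 for two reads, then becomes 2000.
    observed = iter([9000, 9000, 2000])
    print(wait_until_equal(lambda: next(observed), 2000, interval=0))
```

Unlike a fixed sleep, the poll returns as soon as the reconfiguration is visible and only fails after the full timeout, so it is both faster in the common case and robust when the main thread is slow.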
[ovs-dev] [PATCH] system-dpdk: Wait for MTU changes to be applied.
Because a DPDK backed netdev configuration is done in an asynchronous way, and a MTU change requires a reconfiguration, directly checking ovs-vswitchd logs or querying ovsdb for the interface current MTU value is racy. $ DPDK_DIR=/root/ovs/dpdk-dir/v22.11 make -C build check-dpdk \ TESTSUITEFLAGS="-d 15" ... 15: OVS-DPDK - MTU decrease phy portFAILED (system-dpdk.at:574) Test log: ... ../../tests/system-dpdk.at:572: ovs-vsctl set Interface phy0 mtu_request=2000 ../../tests/system-dpdk.at:574: ovs-vsctl get Interface phy0 mtu --- - 2023-12-01 08:55:46.896454338 -0500 +++ .../tests/system-dpdk-testsuite.dir/at-groups/15/stdout 2023-12-01 08:55:46.894532711 -0500 @@ -1,2 +1,2 @@ -2000 +9000 ovs-vswitchd log: 2023-12-01T13:55:44.666Z|00098|netdev_dpdk|INFO|Port 0: 50:7c:6f:3c:0c:26 2023-12-01T13:55:44.667Z|00099|netdev_dpdk|INFO|phy0: rx-steering: default rss ... 2023-12-01T13:55:44.686Z|00102|timeval|WARN|Unreasonably long 1070ms poll interval (339ms user, 728ms system) 2023-12-01T13:55:44.686Z|00103|timeval|WARN|faults: 1 minor, 0 major 2023-12-01T13:55:44.686Z|00104|timeval|WARN|context switches: 0 voluntary, 9 involuntary ... 2023-12-01T13:55:45.692Z|00150|poll_loop|INFO|wakeup due to [POLLIN] on fd 54 (FIFO pipe:[1781849]) at ../vswitchd/bridge.c:421 (68% CPU usage) 2023-12-01T13:55:46.883Z|00151|netdev_dpdk|INFO|Port 0: 50:7c:6f:3c:0c:26 2023-12-01T13:55:46.884Z|00152|netdev_dpdk|INFO|phy0: rx-steering: default rss Add synchronisation points on the interface MTU value in ovsdb as it ensures that a netdev (re)configuration did happen. With those synchronisation points in place, error messages may be checked in logs afterward. 
Fixes: bf47829116a8 ("tests: Add OVS-DPDK MTU unit tests.") Signed-off-by: David Marchand --- tests/system-dpdk.at | 42 -- 1 file changed, 12 insertions(+), 30 deletions(-) diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index 17742d20a0..af092a2000 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -511,15 +511,13 @@ dnl Add userspace bridge and attach it to OVS with default MTU value AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr]) AT_CHECK([ovs-vsctl show], [], [stdout]) -sleep 2 dnl Check default MTU value in the datapath -AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl -1500 -]) +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [1500]) dnl Increase MTU value and check in the datapath AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9000]) +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9000]) dnl Fail if MTU is not supported AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitchd.log], [], [stdout]) @@ -527,10 +525,6 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitch dnl Fail if error is encountered during MTU setup AT_FAIL_IF([grep "Interface phy0 MTU (9000) setup error" ovs-vswitchd.log], [], [stdout]) -AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl -9000 -]) - dnl Clean up AT_CHECK([ovs-vsctl del-port br10 phy0], [], [stdout], [stderr]) @@ -555,7 +549,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr]) AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9000]) AT_CHECK([ovs-vsctl show], [], [stdout]) -sleep 2 + +dnl Check MTU value in the datapath +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9000]) dnl Fail if MTU 
is not supported AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitchd.log], [], [stdout]) @@ -563,17 +559,9 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" ovs-vswitch dnl Fail if error is encountered during MTU setup AT_FAIL_IF([grep "Interface phy0 MTU (9000) setup error" ovs-vswitchd.log], [], [stdout]) -dnl Check MTU value in the datapath -AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl -9000 -]) - dnl Decrease MTU value and check in the datapath AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=2000]) - -AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl -2000 -]) +OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [2000]) dnl Clean up @@ -680,7 +668,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr]) AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9702]) AT_CHECK([ovs-vsctl show], [],
Re: [ovs-dev] [DPDK 23.11 ? ][ovs-build] |fail| pw1869766 cirrus: Update from FreeBSD 12 to 14.
On Wed, Nov 29, 2023 at 8:06 PM Ilya Maximets wrote: > > On 11/29/23 18:51, ovs_jenk...@intel.com wrote: > > Test-Label: intel-ovs-compilation > > Test-Status: fail > > http://patchwork.ozlabs.org/api/patches/1869766/ > > > > AVX-512_compilation: failed > > DPLCS Test: fail > > DPIF Test: fail > > MFEX Test: fail > > Actions Test: fail > > Errors in DPCLS test: > > make check-dpdk > > make all-am > > make[1]: Entering directory '/root/ovs-dev' > > make[1]: Leaving directory '/root/ovs-dev' > > set /bin/bash './tests/system-dpdk-testsuite' -C tests > > AUTOTEST_PATH='utilities:vswitchd:ovsdb:vtep:tests:ipsec::'; \ > > "$@" -j1 || (test X'' = Xyes && "$@" --recheck) > > > > > ## -- ## > > ## system-dpdk-testsuite.log was created. ## > > ## -- ## > > > > Please send `tests/system-dpdk-testsuite.log' and all information you think > > might help: > > > >To: > >Subject: [openvswitch 3.2.90] system-dpdk-testsuite: 2 6 10 14 15 18 19 > > failed > > > > You may investigate any problem if you feel able to do so, in which > > case the test suite provides a good starting point. Its output may > > be found below `tests/system-dpdk-testsuite.dir'. > > > > make: *** [Makefile:7061: check-dpdk] Error 1 > >> 2023-11-29T17:03:47.661Z|00089|dpdk|INFO|Device with port_id=0 already > >> stopped > >> 2023-11-29T17:03:47.661Z|1|dpdk(pmd-c88/id:103)|INFO|PMD thread uses > >> DPDK lcore 2. > >> 2023-11-29T17:03:47.778Z|00090|dpdk|ERR|ice_vsi_config_outer_vlan_stripping(): > >> Single VLAN mode (SVM) does not support qinq > > Hi, Michael, others. > > This is a strange error in the log. This log message was introduced > in DPDK commit de5da9d16430 ("net/ice: support double VLAN"), which > is only in DPDK 23.07+, while we should still be using 22.11 here. > > Could you, please, check? > > But also, is something needs to be changed for the support of 23.11 > so the tests will not fail once we upgrade? Looks like some driver > configuration error that should not happen. Driver bug? 
> CC: David, Kevin.

As far as I can see, this is "just" an error-level log message with no impact.

My guess is that offloading of vlan is now dependent on some firmware-level feature. Such firmware stuff is probably related to ddp files or some other black box. The net/ice driver spews this warning but does not fail to initialise the port.

OVS does not rely on vlan offloading (and it is a good thing as I think Intel broke it a number of times with the E810 driver...).

I tested 23.11 on dpdk-latest, with untagged packets, single encapsulated vlan packets and double vlan packets, and all of them are received correctly. Sending is also functional.

A fix is needed on the DPDK side. For now, OVS can ignore this stupid (sorry I don't have better words and I am a bit tired of such stuff) log message.

Cc: Intel maintainers.

--
David Marchand
___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5] system-dpdk: Test with mlx5 devices.
On Wed, Nov 22, 2023 at 5:34 PM David Marchand wrote:
>
> The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded:
> on systems with only mlx5, this test is always skipped.
>
> Besides, the test tries to grab the first device listed by dpdk-devbind.py,
> regardless of the PCI device status regarding kmod binding.
>
> Remove dependency on this DPDK script and use a minimal script that
> reads PCI sysfs.
>
> This script is not perfect, as one can imagine PCI devices bound to
> vfio-pci for virtual machines.
> Plus, this script only tries to take over vfio-pci devices. mlx5 devices
> can't be taken over blindly as it could mean losing connectivity to the
> machine if the netdev was in use for this system.
>
> For those two reasons, add a new environment variable DPDK_PCI_ADDR for
> testers to select the PCI device of their liking.
> For consistency and grep, the temporary file PCI_ADDR is renamed
> to DPDK_PCI_ADDR.
>
> Reviewed-by: Maxime Coquelin
> Acked-by: Eelco Chaudron
> Signed-off-by: David Marchand

This patch can't be merged as is.
I am preparing some fixes for the system-dpdk MTU tests that got merged since my v4.

--
David Marchand
Re: [ovs-dev] [PATCH v2 6/9] ci: Fix dpdk build cache key generation.
On Mon, Nov 27, 2023 at 3:26 PM Eelco Chaudron wrote: > On 27 Nov 2023, at 13:53, David Marchand wrote: > > On Mon, Nov 27, 2023 at 1:39 PM Eelco Chaudron wrote: > >> > >> When new drivers are introduced, the cache key is not accurately computed. > >> Previously, the dpdk build process was integrated into the main Linux > >> build script, where specific lines were employed to generate the key. > >> Since it is now separated into two distinct files, this patch will > >> compute the key based on the content of these two files. > > > > I would rephrase this last sentence, as "two distinct files" and > > "these two files" are a bit unclear. > > Afaiu, the former refers to .ci/{linux,dpdk}-build.sh, while the > > latter refers to .ci/dpdk-{setup,build}.sh > > > > Otherwise the fix lgtm. > > So what about changing the commit message as follows: > > > Previously, the dpdk build process was integrated into the > .ci/{linux,dpdk}-build.sh scripts, where specific lines were employed to > generate the key. Since it is now separated into two distinct files, > .ci/dpdk-{setup,build}.sh, this patch will compute the key based on the > content of these two files. """ When new drivers are introduced, the cache key is not accurately computed. Before the commit 1a1b3106d90e ("ci: Separate DPDK from OVS build."), the DPDK build process was integrated in .ci/linux-{setup,build}.sh scripts, where specific lines were employed to generate the key. Since it is now separated in .ci/dpdk-{setup,build}.sh, this patch computes the key based on the content of those dedicated scripts. """ Deal? -- David Marchand ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2 6/9] ci: Fix dpdk build cache key generation.
Hello Eelco, On Mon, Nov 27, 2023 at 1:39 PM Eelco Chaudron wrote: > > When new drivers are introduced, the cache key is not accurately computed. > Previously, the dpdk build process was integrated into the main Linux > build script, where specific lines were employed to generate the key. > Since it is now separated into two distinct files, this patch will > compute the key based on the content of these two files. I would rephrase this last sentence, as "two distinct files" and "these two files" are a bit unclear. Afaiu, the former refers to .ci/{linux,dpdk}-build.sh, while the latter refers to .ci/dpdk-{setup,build}.sh Otherwise the fix lgtm. > > Fixes: 4e90baca89f0 ("system-dpdk: Run traffic tests.") > Signed-off-by: Eelco Chaudron > --- > .github/workflows/build-and-test.yml |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/.github/workflows/build-and-test.yml > b/.github/workflows/build-and-test.yml > index d74668f61..e9a2714fb 100644 > --- a/.github/workflows/build-and-test.yml > +++ b/.github/workflows/build-and-test.yml > @@ -30,7 +30,7 @@ jobs: ># This also allows us to use cache from any branch as long as version ># and a way we're building DPDK stays the same. >run: | > -grep -irE 'RTE_|DPDK|meson|ninja' .ci/dpdk-* > dpdk-ci-signature > +cat .ci/dpdk-* > dpdk-ci-signature > grep -rwE 'DPDK_GIT|DPDK_VER' .github/ >> dpdk-ci-signature > if [ "${DPDK_VER##refs/*/}" != "${DPDK_VER}" ]; then > git ls-remote --heads $DPDK_GIT $DPDK_VER >> dpdk-ci-signature > Reviewed-by: David Marchand -- David Marchand ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
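To illustrate why the keyword-based signature misses some changes, here is a minimal sketch in Python. It is an assumption-laden model, not the workflow itself: `grep_signature`, `cat_signature` and `demo` are hypothetical helpers, the script content is invented for the example, and SHA-256 merely stands in for whatever hashing the CI cache applies to dpdk-ci-signature.

```python
import hashlib
import re
import tempfile
from pathlib import Path

# Same keywords as the grep expression in the workflow (case-insensitive).
KEYWORDS = re.compile(r'RTE_|DPDK|meson|ninja', re.IGNORECASE)


def grep_signature(files):
    """Old approach: only lines matching a keyword feed the signature."""
    lines = []
    for path in sorted(files):
        for line in path.read_text().splitlines():
            if KEYWORDS.search(line):
                lines.append(f'{path.name}:{line}')
    return hashlib.sha256('\n'.join(lines).encode()).hexdigest()


def cat_signature(files):
    """New approach: the full content of every script feeds the signature."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.read_bytes())
    return digest.hexdigest()


def demo():
    """Return (grep, cat) signatures before and after enabling a driver."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / 'dpdk-build.sh'
        script.write_text('meson build\nenable_drivers="net/null"\n')
        before = (grep_signature([script]), cat_signature([script]))
        # The edited line contains none of the keywords.
        script.write_text(
            'meson build\nenable_drivers="net/null,net/af_xdp"\n')
        after = (grep_signature([script]), cat_signature([script]))
    return before, after
```

In this model, adding a driver to the `enable_drivers` line leaves the keyword-based signature unchanged, so a stale cache entry would be reused, while the whole-content signature changes.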
Re: [ovs-dev] [PATCH branch-3.2] dpdk: Use DPDK 22.11.3 release for OVS 3.2.
On Thu, Nov 23, 2023 at 12:51 PM Kevin Traynor wrote:
>
> Update the CI and docs to use DPDK 22.11.3.
>
> Signed-off-by: Kevin Traynor

Reviewed-by: David Marchand

--
David Marchand
Re: [ovs-dev] [PATCH branch-3.1] dpdk: Use DPDK 22.11.3 release for OVS 3.1.
On Thu, Nov 23, 2023 at 12:51 PM Kevin Traynor wrote:
>
> Update the CI and docs to use DPDK 22.11.3.
>
> Signed-off-by: Kevin Traynor

Reviewed-by: David Marchand

--
David Marchand
Re: [ovs-dev] [PATCH branch-3.0] dpdk: Use DPDK 21.11.5 release for OVS 3.0.
On Thu, Nov 23, 2023 at 12:51 PM Kevin Traynor wrote:
>
> Update the CI and docs to use DPDK 21.11.5.
>
> Signed-off-by: Kevin Traynor

Reviewed-by: David Marchand

--
David Marchand
Re: [ovs-dev] [PATCH branch-2.17] dpdk: Use DPDK 21.11.5 release for OVS 2.17.
On Thu, Nov 23, 2023 at 12:50 PM Kevin Traynor wrote: > > Update the CI and docs to use DPDK 21.11.5. > > Signed-off-by: Kevin Traynor > --- > .ci/linux-build.sh | 2 +- > Documentation/faq/releases.rst | 2 +- > Documentation/intro/install/dpdk.rst | 8 > NEWS | 3 +++ > 4 files changed, 9 insertions(+), 6 deletions(-) > > diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh > index f5021e1a8..9464ea49c 100755 > --- a/.ci/linux-build.sh > +++ b/.ci/linux-build.sh > @@ -221,5 +221,5 @@ fi > if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then > if [ -z "$DPDK_VER" ]; then > -DPDK_VER="21.11.2" > +DPDK_VER="21.11.5" > fi > install_dpdk $DPDK_VER > diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst > index 49895c595..0e0c589a3 100644 > --- a/Documentation/faq/releases.rst > +++ b/Documentation/faq/releases.rst > @@ -211,5 +211,5 @@ Q: What DPDK version does each Open vSwitch release work > with? > 2.15.x 20.11.6 > 2.16.x 20.11.6 > -2.17.x 21.11.2 > +2.17.x 21.11.5 > > > diff --git a/Documentation/intro/install/dpdk.rst > b/Documentation/intro/install/dpdk.rst > index a284e6851..559e8eb1f 100644 > --- a/Documentation/intro/install/dpdk.rst > +++ b/Documentation/intro/install/dpdk.rst > @@ -43,5 +43,5 @@ In addition to the requirements described in > :doc:`general`, building Open > vSwitch with DPDK will require the following: > > -- DPDK 21.11.2 > +- DPDK 21.11.5 > > - A `DPDK supported NIC`_ > @@ -74,7 +74,7 @@ Install DPDK > > $ cd /usr/src/ > - $ wget https://fast.dpdk.org/rel/dpdk-21.11.2.tar.xz > - $ tar xf dpdk-21.11.2.tar.xz > - $ export DPDK_DIR=/usr/src/dpdk-stable-21.11.2 > + $ wget https://fast.dpdk.org/rel/dpdk-21.11.5.tar.xz > + $ tar xf dpdk-21.11.5.tar.xz > + $ export DPDK_DIR=/usr/src/dpdk-stable-21.11.5 > $ cd $DPDK_DIR > > diff --git a/NEWS b/NEWS > index 7d4a8c081..642beb45b 100644 > --- a/NEWS > +++ b/NEWS > @@ -1,4 +1,7 @@ > v2.17.9 - xx xxx > - > + - Bug fixes I see in the history that the "Bug fixes" characterization is usually added 
when releasing a version. So I am not sure it should be added in this patch. > + - DPDK: > + * OVS validated with DPDK 21.11.5 > > v2.17.8 - 17 Oct 2023 Otherwise, it lgtm. Reviewed-by: David Marchand -- David Marchand ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v5] system-dpdk: Test with mlx5 devices.
The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded: on systems with only mlx5, this test is always skipped. Besides, the test tries to grab the first device listed by dpdk-devbind.py, regardless of the PCI device status regarding kmod binding. Remove dependency on this DPDK script and use a minimal script that reads PCI sysfs. This script is not perfect, as one can imagine PCI devices bound to vfio-pci for virtual machines. Plus, this script only tries to take over vfio-pci devices. mlx5 devices can't be taken over blindly as it could mean losing connectivity to the machine if the netdev was in use for this system. For those two reasons, add a new environment variable DPDK_PCI_ADDR for testers to select the PCI device of their liking. For consistency and grep, the temporary file PCI_ADDR is renamed to DPDK_PCI_ADDR. Reviewed-by: Maxime Coquelin Acked-by: Eelco Chaudron Signed-off-by: David Marchand --- Changes since v4: - separated from the original series, - rebased, - dropped mlx5 devices from the discovery script, - documented DPDK_PCI_ADDR env variable, Changes since v3: - fixed nit from Maxime, Changes since v2: - sorted logs alphabetically, --- Documentation/topics/testing.rst | 11 ++--- tests/automake.mk| 1 + tests/system-dpdk-find-device.py | 39 tests/system-dpdk-macros.at | 10 ++-- tests/system-dpdk.at | 14 ++-- 5 files changed, 57 insertions(+), 18 deletions(-) create mode 100755 tests/system-dpdk-find-device.py diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst index 5f6940b84d..fb9b3e77b1 100644 --- a/Documentation/topics/testing.rst +++ b/Documentation/topics/testing.rst @@ -343,15 +343,20 @@ To see a list of all the available tests, run:: These tests support a `DPDK supported NIC`_. The tests operate on a wider set of environments, for instance, when a virtual port is used. -They do require proper DPDK variables (``DPDK_DIR`` and ``DPDK_BUILD``). 
Moreover you need to have root privileges to load the required modules and to bind -the NIC to the DPDK-compatible driver. +a PCI device to the DPDK-compatible driver. .. _DPDK supported NIC: https://core.dpdk.org/supported/#nics +The phy test will skip if no suitable PCI device is found. +It is possible to select which PCI device is used for this test by setting the +DPDK_PCI_ADDR environment variable, which is especially useful when testing +with a mlx5 device:: + +# DPDK_PCI_ADDR=:82:00.0 make check-dpdk + All tests are skipped if no hugepages are configured. User must look into the DPDK manual to figure out how to `Configure hugepages`_. -The phy test will skip if no compatible physical device is available. .. _Configure hugepages: https://doc.dpdk.org/guides-22.11/linux_gsg/sys_reqs.html diff --git a/tests/automake.mk b/tests/automake.mk index f8a925012d..69117514cc 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -192,6 +192,7 @@ SYSTEM_OFFLOADS_TESTSUITE_AT = \ SYSTEM_DPDK_TESTSUITE_AT = \ tests/system-common-macros.at \ + tests/system-dpdk-find-device.py \ tests/system-dpdk-macros.at \ tests/system-dpdk-testsuite.at \ tests/system-dpdk.at diff --git a/tests/system-dpdk-find-device.py b/tests/system-dpdk-find-device.py new file mode 100755 index 00..b0997380bc --- /dev/null +++ b/tests/system-dpdk-find-device.py @@ -0,0 +1,39 @@ +#!/usr/bin/env python3 +# Copyright (c) 2021 Red Hat, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +from pathlib import Path +import os +import sys + +# The tester might want to select a PCI device, if so, trust it. +if 'DPDK_PCI_ADDR' in os.environ: +print(os.environ['DPDK_PCI_ADDR']) +sys.exit(0) + +for device in sorted(Path('/sys/bus/pci/devices').iterdir()): +class_path = device / 'class' +# Only consider Network class devices +if class_path.read_text().strip() != '0x02': +continue +kmod_path = device / 'driver' / 'module' +kmod_name = kmod_path.resolve().name +# Only care about devices bound to vfio_pci or igb_uio. +if kmod_name not in ['vfio_pci', 'igb_uio']: +continue +print(device.resolve().name) +sys.exit(0) + +sys.exit(1) diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index dcdfa55741..3b5a3512d4 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/s
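The discovery logic described in this patch can be sketched as a function that takes the sysfs directory as a parameter, which makes it possible to exercise it against a fake tree. This is only an illustration under stated assumptions: `find_device` and `make_fake_sysfs` are hypothetical names, the PCI addresses are made up, and matching the class with `startswith('0x02')` (PCI base class 0x02, network controller) is a choice made for this sketch rather than a copy of the script in the patch.

```python
import os
import tempfile
from pathlib import Path

# Kernel modules whose devices the test is allowed to take over.
DPDK_KMODS = ('vfio_pci', 'igb_uio')


def find_device(sysfs_root, environ=None):
    """Return a PCI address usable by the phy test, or None."""
    environ = os.environ if environ is None else environ
    # The tester might have selected a PCI device; if so, trust it.
    if 'DPDK_PCI_ADDR' in environ:
        return environ['DPDK_PCI_ADDR']
    for device in sorted(Path(sysfs_root).iterdir()):
        class_path = device / 'class'
        if not class_path.is_file():
            continue
        # Assumption of this sketch: the class file reads like 0x020000
        # and base class 0x02 means "network controller".
        if not class_path.read_text().strip().startswith('0x02'):
            continue
        # Devices without a driver resolve to a path named 'module',
        # which is not in DPDK_KMODS, so they are skipped.
        kmod_name = (device / 'driver' / 'module').resolve().name
        if kmod_name in DPDK_KMODS:
            return device.resolve().name
    return None


def make_fake_sysfs(root):
    """Build a tiny stand-in for /sys/bus/pci/devices (hypothetical data)."""
    root = Path(root)
    modules = root / 'module'
    devices = root / 'devices'
    for addr, pci_class, kmod in [
        ('0000:01:00.0', '0x030000', 'vfio_pci'),   # not a NIC: ignored
        ('0000:02:00.0', '0x020000', 'mlx5_core'),  # NIC on a kernel driver
        ('0000:03:00.0', '0x020000', 'vfio_pci'),   # NIC bound for DPDK
    ]:
        dev = devices / addr
        (dev / 'driver').mkdir(parents=True)
        (dev / 'class').write_text(pci_class + '\n')
        (modules / kmod).mkdir(parents=True, exist_ok=True)
        (dev / 'driver' / 'module').symlink_to(modules / kmod)
    return devices
```

With the fake tree above, only the NIC bound to vfio_pci is picked unless DPDK_PCI_ADDR is set, which mirrors the reason given in the commit message for never taking over mlx5 devices automatically.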
[ovs-dev] [dpdk-latest 2/2] ci: Reduce optional libraries in DPDK.
Since DPDK v23.11, it is possible to select more easily which optional libraries are enabled.

OVS needs the vhost library (and its dependencies).
The net/tap DPDK driver needs the gso library.
Other optional libraries can be disabled.

This reduces the cache entry for DPDK from ~7MB to ~4MB.

Signed-off-by: David Marchand
---
 .ci/dpdk-build.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index 8380c48637..698b9e1b14 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -42,6 +42,9 @@ function build_dpdk()
     DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd"
     enable_drivers="net/null,net/af_xdp,net/tap,net/virtio"
     DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers"
+    # OVS depends on the vhost library (and its dependencies).
+    # net/tap depends on the gso library.
+    DPDK_OPTS="$DPDK_OPTS -Denable_libs=cryptodev,dmadev,gso,vhost"
 
     # Install DPDK using prefix.
     DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR"
--
2.41.0
[ovs-dev] [dpdk-latest 1/2] ci: Cache DPDK installed libraries only.
Rather than save the whole DPDK sources and build artefacts, checkout sources in a separate directory and build DPDK there. Only the installed artefacts are then going to the cache. Example sources in the share/dpdk installed directory can be pruned too. This makes a (v23.11-rc3) DPDK cache entry size go from ~120MB to ~7MB. Signed-off-by: David Marchand --- .ci/dpdk-build.sh | 25 +++-- .ci/linux-build.sh | 9 + 2 files changed, 20 insertions(+), 14 deletions(-) diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh index aa83e44643..8380c48637 100755 --- a/.ci/dpdk-build.sh +++ b/.ci/dpdk-build.sh @@ -5,25 +5,27 @@ set -x function build_dpdk() { -local VERSION_FILE="dpdk-dir/cached-version" local DPDK_VER=$1 local DPDK_OPTS="" +local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir" +local VERSION_FILE="$DPDK_INSTALL_DIR/cached-version" -rm -rf dpdk-dir +rm -rf dpdk-src +rm -rf $DPDK_INSTALL_DIR if [ "${DPDK_VER##refs/*/}" != "${DPDK_VER}" ]; then -git clone --single-branch $DPDK_GIT dpdk-dir -b "${DPDK_VER##refs/*/}" -pushd dpdk-dir +git clone --single-branch $DPDK_GIT dpdk-src -b "${DPDK_VER##refs/*/}" +pushd dpdk-src git log -1 --oneline else wget https://fast.dpdk.org/rel/dpdk-$1.tar.xz tar xvf dpdk-$1.tar.xz > /dev/null DIR_NAME=$(tar -tf dpdk-$1.tar.xz | head -1 | cut -f1 -d"/") -mv ${DIR_NAME} dpdk-dir -pushd dpdk-dir +mv ${DIR_NAME} dpdk-src +pushd dpdk-src fi -# Switching to 'default' machine to make dpdk-dir cache usable on +# Switching to 'default' machine to make the dpdk cache usable on # different CPUs. We can't be sure that all CI machines are exactly same. DPDK_OPTS="$DPDK_OPTS -Dmachine=default" @@ -42,14 +44,17 @@ function build_dpdk() DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers" # Install DPDK using prefix. -DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build" +DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR" meson $DPDK_OPTS build ninja -C build ninja -C build install - -echo "Installed DPDK in $(pwd)" popd + +# Remove examples sources. 
+rm -rf $DPDK_INSTALL_DIR/share/dpdk/examples + +echo "Installed DPDK in $DPDK_INSTALL_DIR" echo "${DPDK_VER}" > ${VERSION_FILE} } diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index 170f4730f3..ea1e6599f4 100755 --- a/.ci/linux-build.sh +++ b/.ci/linux-build.sh @@ -12,8 +12,9 @@ EXTRA_OPTS="--enable-Werror" function install_dpdk() { -local VERSION_FILE="dpdk-dir/cached-version" -local DPDK_LIB=$(pwd)/dpdk-dir/build/lib/x86_64-linux-gnu +local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir" +local VERSION_FILE="${DPDK_INSTALL_DIR}/cached-version" +local DPDK_LIB=${DPDK_INSTALL_DIR}/lib/x86_64-linux-gnu if [ "$DPDK_SHARED" ]; then EXTRA_OPTS="$EXTRA_OPTS --with-dpdk=shared" @@ -29,13 +30,13 @@ function install_dpdk() export PATH=$(pwd)/dpdk-dir/build/bin:$PATH if [ ! -f "${VERSION_FILE}" ]; then -echo "Could not find DPDK in $(pwd)/dpdk-dir" +echo "Could not find DPDK in $DPDK_INSTALL_DIR" return 1 fi # Update the library paths. sudo ldconfig -echo "Found cached DPDK $(cat ${VERSION_FILE}) build in $(pwd)/dpdk-dir" +echo "Found cached DPDK $(cat ${VERSION_FILE}) build in $DPDK_INSTALL_DIR" } function configure_ovs() -- 2.41.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v9 11/11] system-dpdk: Run traffic tests.
On Mon, Nov 20, 2023 at 4:58 PM David Marchand wrote:
>
> Integrate system-traffic.at tests as part of check-dpdk.
>
> Some tests that can't work with the userspace datapath are skipped by
> overriding some OVS_CHECK_* macros.
>
> ADD_VETH is implemented using the net/af_xdp DPDK driver.
>
> Signed-off-by: David Marchand
> Acked-by: Eelco Chaudron

I ran this series 10 times with no failure in GHA.
https://github.com/david-marchand/ovs/actions/runs/6882128624
(you may browse through the 10 attempts via the "Latest #10" button on this page, though you need to be logged in)

Intel CI seems happy this time for every patch.
The robot also reported no issue when testing per patch.

SHIP IT!

Notes:
- there is an issue with "recent" glibc which will be looked at separately from this series. For details on this topic: starting with glibc 2.37 (iow Fedora 38 / Ubuntu 23.04), dpdk logs are split at a 128-byte boundary, which breaks log matching in the unit tests.
- this series runs fine on the dpdk-latest branch too.

--
David Marchand
[ovs-dev] [PATCH v9 11/11] system-dpdk: Run traffic tests.
Integrate system-traffic.at tests as part of check-dpdk. Some tests that can't work with the userspace datapath are skipped by overriding some OVS_CHECK_* macros. ADD_VETH is implemented using the net/af_xdp DPDK driver. Signed-off-by: David Marchand Acked-by: Eelco Chaudron --- Changes since v6: - fixed some checkpatch warning, Changes since v4: - switched to net/af_xdp, this removes the tweaking needed for net/tap, and it lets existing tool relying on kernel netdevs. veth offloading still needs some tweaking, Changes since v3: - reverted --dummy-numa and opted for configuring a number of rxqs relevant to the number of NUMA sockets, Changes since v2: - added ADD_VETH_IGNORE_LOGS and moved ignored error logs to OVS_TRAFFIC_VSWITCHD_STOP, - added --no-pci to DPDK options to avoid failing the tests when running in a vm with a virtio-net device, - faked a mono numa/mono core so that OVS requests at max 2 txqs on the net/tap port, --- .ci/dpdk-build.sh| 3 +- .github/workflows/build-and-test.yml | 2 +- tests/system-dpdk-macros.at | 77 tests/system-dpdk-testsuite.at | 2 + tests/system-dpdk.at | 3 -- 5 files changed, 82 insertions(+), 5 deletions(-) diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh index 35540f0694..aa83e44643 100755 --- a/.ci/dpdk-build.sh +++ b/.ci/dpdk-build.sh @@ -38,7 +38,8 @@ function build_dpdk() # any DPDK driver. # check-dpdk unit tests requires testpmd and some net/ driver. DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd" -DPDK_OPTS="$DPDK_OPTS -Denable_drivers=net/null,net/tap,net/virtio" +enable_drivers="net/null,net/af_xdp,net/tap,net/virtio" +DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers" # Install DPDK using prefix. 
DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build" diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml index 4f62efb7c3..09654205e7 100644 --- a/.github/workflows/build-and-test.yml +++ b/.github/workflows/build-and-test.yml @@ -5,7 +5,7 @@ on: [push, pull_request] jobs: build-dpdk: env: - dependencies: gcc libnuma-dev ninja-build + dependencies: gcc libbpf-dev libnuma-dev ninja-build pkgconf CC: gcc DPDK_GIT: https://dpdk.org/git/dpdk-stable DPDK_VER: 22.11.1 diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index 7fedfd6515..dcdfa55741 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/system-dpdk-macros.at @@ -127,3 +127,80 @@ m4_define([OVS_DPDK_STOP_TESTPMD], [AT_CHECK([kill `cat testpmd.pid`]) OVS_WAIT([kill -0 `cat testpmd.pid`], [kill -9 `cat testpmd.pid`]) ]) + + +# OVS_TRAFFIC_VSWITCHD_START([vsctl-args], [vsctl-output], [dbinit-aux-args]) +# +# Creates a database and starts ovsdb-server, starts ovs-vswitchd +# connected to that database, calls ovs-vsctl to create a bridge named +# br0 with predictable settings, passing 'vsctl-args' as additional +# commands to ovs-vsctl. If 'vsctl-args' causes ovs-vsctl to provide +# output (e.g. because it includes "create" commands) then 'vsctl-output' +# specifies the expected output after filtering through uuidfilt. +# 'dbinit-aux-args' are passed as additional commands to 'ovs-vsctl init' +# before starting ovs-vswitchd. +m4_define([OVS_TRAFFIC_VSWITCHD_START], + [ + OVS_DPDK_PRE_CHECK() + OVS_WAIT_WHILE([ip link show ovs-netdev]) + dnl For functional tests, no need for DPDK PCI probing. + OVS_DPDK_START([--no-pci], [--disable-system], [$3]) + dnl Add bridges, ports, etc. 
+ OVS_WAIT_WHILE([ip link show br0]) + AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| uuidfilt])], [0], [$2]) +]) + + +# OVS_TRAFFIC_VSWITCHD_STOP([ALLOWLIST], [extra_cmds]) +# +# Gracefully stops ovs-vswitchd and ovsdb-server, checking their log files +# for messages with severity WARN or higher and signaling an error if any +# is present. The optional ALLOWLIST may contain shell-quoted "sed" +# commands to delete any warnings that are actually expected, e.g.: +# +# OVS_TRAFFIC_VSWITCHD_STOP(["/expected error/d"]) +# +# 'extra_cmds' are shell commands to be executed after OVS_VSWITCHD_STOP() is +# invoked. They can be used to perform additional cleanups such as name space +# removal. +m4_define([OVS_TRAFFIC_VSWITCHD_STOP], + [OVS_DPDK_STOP_VSWITCHD([$1]) + AT_CHECK([:; $2]) +]) + + +# Plug a veth into OVS via DPDK net/af_xdp. +m4_define([ADD_VETH], +[ AT_CHECK([ip link add $1 type veth peer name ovs-$1 || return 77]) + CONFIGURE_VETH_OFFLOADS([$1]) + AT_CHECK([ip link set $1 netns $2]) + AT_CHECK([ip link set dev ovs-$1 up]) + AT_CHECK([ovs-vsctl add-port $3 ovs-$1 -- \ +set interface ovs-$1 external-ids:iface-id="$1" -- \ +set interface ovs-$1 type=dpdk -- \ +set interface ovs-$1 options:dpd
[ovs-dev] [PATCH v9 09/11] system-dpdk: Refactor tests using vhost-user ports.
Introduce macros responsible for adding a vhost-user / vhost-user client port to a userspace datapath bridge and check associated logs. Signed-off-by: David Marchand --- Changes since v8: - added this patch to avoid issues like what is fixed by the previous patch in this series, --- tests/system-dpdk.at | 123 +-- 1 file changed, 36 insertions(+), 87 deletions(-) diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index b7ae4af406..4da2afd683 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -18,6 +18,29 @@ m4_define([CHECK_MEMPOOL_PARAM], [ | grep "User configured shared mempool set for: MTU $1, NUMA $2."]) ]) +dnl ADD_VHOST_USER_CLIENT_PORT([bridge], [port], [socket]) +dnl Add a dpdk vhost-user client port to a bridge and check this port is ready +dnl to be used by looking at the logs. +m4_define([ADD_VHOST_USER_CLIENT_PORT], [ +AT_CHECK([ovs-vsctl add-port $1 $2 -- \ + set Interface $2 type=dpdkvhostuserclient options:vhost-server-path=$3], + [], [stdout], [stderr]) +OVS_WAIT_UNTIL([grep "VHOST_CONFIG: ($3) vhost-user client: socket created" ovs-vswitchd.log]) +OVS_WAIT_UNTIL([grep "vHost User device '$2' created in 'client' mode, using client socket" ovs-vswitchd.log]) +OVS_WAIT_UNTIL([grep "VHOST_CONFIG: ($3) reconnecting..." ovs-vswitchd.log]) +]) + +dnl ADD_VHOST_USER_PORT([bridge], [port], [socket]) +dnl Add a dpdk vhost-user port to a bridge and check this port is ready +dnl to be used by looking at the logs. 
+m4_define([ADD_VHOST_USER_PORT], [ +AT_CHECK([ovs-vsctl add-port $1 $2 -- set Interface $2 type=dpdkvhostuser], [], + [stdout], [stderr]) +OVS_WAIT_UNTIL([grep "VHOST_CONFIG: ($3) vhost-user server: socket created" ovs-vswitchd.log]) +OVS_WAIT_UNTIL([grep "Socket $3 created for vhost-user port $2" ovs-vswitchd.log]) +OVS_WAIT_UNTIL([grep "VHOST_CONFIG: ($3) binding succeeded" ovs-vswitchd.log]) +]) + dnl -- dnl Check if EAL init is successful AT_SETUP([OVS-DPDK - EAL init]) @@ -64,15 +87,10 @@ OVS_DPDK_START([--no-pci]) dnl Add userspace bridge and attach it to OVS AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) -AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient options:vhost-server-path=$OVS_RUNDIR/dpdkvhostclient0], [], [stdout], [stderr]) +ADD_VHOST_USER_CLIENT_PORT([br10], [dpdkvhostuserclient0], [$OVS_RUNDIR/dpdkvhostclient0]) AT_CHECK([ovs-vsctl show], [], [stdout]) sleep 2 -dnl Parse log file -AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user client: socket created" ovs-vswitchd.log], [], [stdout]) -AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' mode, using client socket" ovs-vswitchd.log], [], [stdout]) -AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout]) - dnl Clean up AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr]) OVS_DPDK_STOP_VSWITCHD(["dnl @@ -92,19 +110,9 @@ OVS_DPDK_START([--no-pci]) dnl Add userspace bridge and attach it to OVS AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) -AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuser0 -- set Interface dpdkvhostuser0 \ - type=dpdkvhostuser], [], - [stdout], [stderr]) +ADD_VHOST_USER_PORT([br10], [dpdkvhostuser0], [$OVS_RUNDIR/dpdkvhostuser0]) AT_CHECK([ovs-vsctl show], [], [stdout]) -dnl Parse log file -AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostuser0) vhost-user server: socket created" \ - ovs-vswitchd.log], [], [stdout]) -AT_CHECK([grep "Socket $OVS_RUNDIR/dpdkvhostuser0 created for vhost-user port dpdkvhostuser0" \ - ovs-vswitchd.log], [], [stdout]) -AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostuser0) binding succeeded" ovs-vswitchd.log], [], - [stdout]) - dnl Set up namespaces ADD_NAMESPACES(ns1, ns2) @@ -158,18 +166,9 @@ OVS_DPDK_START([--no-pci]) dnl Add userspace bridge and attach it to OVS AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) -AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuserclient0 -- set Interface \ - dpdkvhostuserclient0 \ - type=dpdkvhostuserclient \ - options:vhost-server-path=$OVS_RUNDIR/dpdkvhostclient0], [], - [stdout], [stderr]) +ADD_VHOST_USER_CLIENT_PORT([br10], [dpdkvhostuserclient0], [$OVS_RUNDIR/dpdkvhostclient0]) AT_CHECK([ovs-vsctl show], [], [stdout]) -dnl Parse log file -AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user client: socket created" ovs-vswitchd.log], [], [stdout]) -AT
[ovs-dev] [PATCH v9 08/11] system-dpdk: Fix race in some vhost-user client MTU test.
Adding those greps gives OVS enough time to be ready to connect to the testpmd virtio-user port instantiated afterward. Fixes: bf47829116a8 ("tests: Add OVS-DPDK MTU unit tests.") Signed-off-by: David Marchand --- Changes since v8: - added this patch for random failures observed on my laptop, --- tests/system-dpdk.at | 4 1 file changed, 4 insertions(+) diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index c125d8051f..b7ae4af406 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -813,6 +813,10 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=9702]) AT_CHECK([ovs-vsctl show], [], [stdout]) sleep 2 +AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user client: socket created" ovs-vswitchd.log], [], [stdout]) +AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' mode, using client socket" ovs-vswitchd.log], [], [stdout]) +AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." ovs-vswitchd.log], [], [stdout]) + OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"]) OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log]) -- 2.41.0
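A timing-independent way to express this kind of readiness check is to poll the log until the expected message shows up, instead of relying on how long the preceding commands take. The following is a minimal sketch in Python under stated assumptions: `wait_for_log` and `demo` are hypothetical helpers, the log line is shortened for the example, and nothing here is taken from the OVS test macros.

```python
import re
import tempfile
import threading
import time
from pathlib import Path


def wait_for_log(log_path, pattern, timeout=10.0, interval=0.1):
    """Poll log_path until its content matches pattern; False on timeout."""
    regex = re.compile(pattern)
    path = Path(log_path)
    deadline = time.monotonic() + timeout
    while True:
        if path.exists() and regex.search(path.read_text()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def demo():
    """Compare a one-shot check with a polling check on a late log line."""
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / 'ovs-vswitchd.log'
        log.write_text('')

        def writer():
            # Simulate a daemon that logs readiness a bit later.
            time.sleep(0.3)
            with log.open('a') as f:
                f.write('VHOST_CONFIG: (sock) reconnecting...\n')

        thread = threading.Thread(target=writer)
        thread.start()
        # A single check right away misses the message...
        one_shot = bool(re.search(r'reconnecting', log.read_text()))
        # ...while polling picks it up once it is written.
        waited = wait_for_log(log, r'reconnecting', timeout=5.0)
        thread.join()
        return one_shot, waited
```

The one-shot check only succeeds if the message happens to be logged already, whereas the polling variant succeeds as soon as the message appears and fails only after the timeout.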
[ovs-dev] [PATCH v9 10/11] system-dpdk: Rework cleanup for vhost-user client tests.
Those tests are subject to a race when a testpmd hosting the vhost-user server is stopped and OVS has enough time to detect the vhost-user socket drop and tries to reconnect to this socket. In such a situation, the tests can fail as the OVS process with the vhost-user client port complains with a warning log: 2023-09-08T13:15:18.160Z|00163|dpdk|INFO|VHOST_CONFIG: (.../005/dpdkvhostclient0) vhost peer closed 2023-09-08T13:15:18.160Z|00164|netdev_dpdk|INFO|vHost Device '.../005/dpdkvhostclient0' connection has been destroyed 2023-09-08T13:15:18.160Z|00165|dpdk|INFO|VHOST_CONFIG: (.../005/dpdkvhostclient0) vhost-user client: socket created, fd: 24 2023-09-08T13:15:18.160Z|00166|dpdk|WARN|VHOST_CONFIG: (.../005/dpdkvhostclient0) failed to connect: Connection refused 2023-09-08T13:15:18.160Z|00167|dpdk|INFO|VHOST_CONFIG: (.../005/dpdkvhostclient0) reconnecting... Invert the order of the cleanup steps. Signed-off-by: David Marchand Acked-by: Eelco Chaudron --- Changes since v6: - added this fix for spurious failures hit by Eelco, --- tests/system-dpdk.at | 15 +-- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index 4da2afd683..5e486d1f47 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -221,10 +221,9 @@ AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_bytes $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q0_good_bytes` + dnl `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q1_good_bytes`))]) -OVS_DPDK_STOP_TESTPMD() - dnl Clean up AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr]) +OVS_DPDK_STOP_TESTPMD() OVS_DPDK_STOP_VSWITCHD(["dnl /VHOST_CONFIG: (.*dpdkvhostclient0) recvmsg failed/d /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d @@ -619,10 +618,9 @@ AT_CHECK([ovs-vsctl get Interface dpdkvhostuserclient0 mtu], [0], [dnl 9000 ]) -OVS_DPDK_STOP_TESTPMD() - dnl Clean up 
AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr]) +OVS_DPDK_STOP_TESTPMD() OVS_DPDK_STOP_VSWITCHD(["dnl /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d"]) AT_CLEANUP @@ -662,10 +660,9 @@ AT_CHECK([ovs-vsctl get Interface dpdkvhostuserclient0 mtu], [0], [dnl 2000 ]) -OVS_DPDK_STOP_TESTPMD() - dnl Clean up AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr]) +OVS_DPDK_STOP_TESTPMD() OVS_DPDK_STOP_VSWITCHD(["dnl /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d"]) AT_CLEANUP @@ -785,10 +782,9 @@ dnl Set MTU value above upper bound and check for error AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=9711]) AT_CHECK([grep "dpdkvhostuserclient0: unsupported MTU 9711" ovs-vswitchd.log], [], [stdout]) -OVS_DPDK_STOP_TESTPMD() - dnl Clean up AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr]) +OVS_DPDK_STOP_TESTPMD() OVS_DPDK_STOP_VSWITCHD(["dnl /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d /dpdkvhostuserclient0: unsupported MTU 9711/d @@ -828,10 +824,9 @@ dnl Set MTU value below lower bound and check for error AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=67]) AT_CHECK([grep "dpdkvhostuserclient0: unsupported MTU 67" ovs-vswitchd.log], [], [stdout]) -OVS_DPDK_STOP_TESTPMD() - dnl Clean up AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr]) +OVS_DPDK_STOP_TESTPMD() OVS_DPDK_STOP_VSWITCHD(["dnl /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d /dpdkvhostuserclient0: unsupported MTU 67/d -- 2.41.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
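To illustrate why the "Connection refused" warning quoted above fails the tests, here is a minimal sketch that is not part of the patch. It assumes the stop helper applies the sed delete expressions passed to it and then reports any WARN line that is left; the function name unexpected_warnings is hypothetical, and the "allowed" sample line is constructed from the real one in the commit message.

```shell
#!/bin/sh
# Sketch only: delete allowlisted log lines, then print whatever WARN
# lines remain. The two delete expressions are copied from the test.
unexpected_warnings() {
    sed -n '
/VHOST_CONFIG: (.*dpdkvhostclient0) recvmsg failed/d
/VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d
/|WARN|/p'
}

# Constructed sample: same shape as the real log, with an allowlisted reason.
allowed='2023-09-08T13:15:18.160Z|00166|dpdk|WARN|VHOST_CONFIG: (.../005/dpdkvhostclient0) failed to connect: No such file or directory'
# Taken from the commit message: the warning produced by the race.
racy='2023-09-08T13:15:18.160Z|00166|dpdk|WARN|VHOST_CONFIG: (.../005/dpdkvhostclient0) failed to connect: Connection refused'

printf '%s\n' "$allowed" | unexpected_warnings   # allowlisted, nothing printed
printf '%s\n' "$racy" | unexpected_warnings      # not allowlisted, line printed
```

Deleting the port before stopping testpmd avoids the reconnect attempt, so the second kind of line should not be logged in the first place.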
[ovs-dev] [PATCH v9 07/11] system-dpdk: Remove tap interfaces from vport MTU tests.
The unit tests for changing MTU with vhost-user ports are not using those tap interfaces. Signed-off-by: David Marchand --- Changes since v7: - added this patch after getting regressions in Intel CI because of the next patch, --- tests/system-dpdk.at | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index f635d7f925..c125d8051f 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -640,9 +640,7 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user client: AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' mode, using client socket" ovs-vswitchd.log], [], [stdout]) AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." ovs-vswitchd.log], [], [stdout]) -OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \ ---vdev="net_tap0,iface=tap0"]) - +OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"]) OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log]) OVS_WAIT_UNTIL([ovs-vsctl get Interface dpdkvhostuserclient0 link_state | grep -w up]) @@ -690,9 +688,7 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user client: AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' mode, using client socket" ovs-vswitchd.log], [], [stdout]) AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout]) -OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \ ---vdev="net_tap0,iface=tap0"]) - +OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"]) OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log]) OVS_WAIT_UNTIL([ovs-vsctl get Interface dpdkvhostuserclient0 link_state | grep -w up]) @@ -817,8 +813,7 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=9702]) AT_CHECK([ovs-vsctl show], [], [stdout]) sleep 2 -OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \ ---vdev="net_tap0,iface=tap0"]) +OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"]) OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log]) OVS_WAIT_UNTIL([ovs-vsctl get Interface dpdkvhostuserclient0 link_state | grep -w up]) @@ -866,8 +861,7 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user client: AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' mode, using client socket" ovs-vswitchd.log], [], [stdout]) AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." ovs-vswitchd.log], [], [stdout]) -OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \ ---vdev="net_tap0,iface=tap0"]) +OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"]) OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log]) OVS_WAIT_UNTIL([ovs-vsctl get Interface dpdkvhostuserclient0 link_state | grep -w up]) -- 2.41.0
[ovs-dev] [PATCH v9 05/11] system-dpdk: Refactor OVS daemons helpers.
Align system-dpdk existing helpers to other common OVS helpers so they can accept some optional arguments. Introduce an OVS_DPDK_STOP_VSWITCHD wrapper around OVS_VSWITCHD_STOP to catch dpdk related logs in a centralised fashion. Signed-off-by: David Marchand Acked-by: Eelco Chaudron --- Changes since v6: - did a minor cleanup on vhost-user client test log pattern (one entry concerned vhost-user ports, not vhost-user *client* ports), --- tests/system-dpdk-macros.at | 21 - tests/system-dpdk.at| 158 +++- 2 files changed, 82 insertions(+), 97 deletions(-) diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index 35d14bee8f..7fedfd6515 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/system-dpdk-macros.at @@ -36,12 +36,13 @@ m4_define([OVS_DPDK_PRE_PHY_SKIP], # m4_define([OVS_DPDK_START], [dnl start ovs dpdk - OVS_DPDK_START_OVSDB() + OVS_DPDK_START_OVSDB($3) dnl Enable DPDK functionality AT_CHECK([ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true]) - OVS_DPDK_START_VSWITCHD($1) + OVS_DPDK_START_VSWITCHD([$1], [$2]) ]) + # OVS_DPDK_START_OVSDB() # # Create an empty database and start ovsdb-server. @@ -60,9 +61,10 @@ m4_define([OVS_DPDK_START_OVSDB], AT_CAPTURE_FILE([ovsdb-server.log]) dnl Initialize database. - AT_CHECK([ovs-vsctl --no-wait init]) + AT_CHECK([ovs-vsctl --no-wait init $1]) ]) + # OVS_DPDK_START_VSWITCHD() # # Add special configuration for dpdk-init. Start ovs-vswitchd. @@ -72,12 +74,23 @@ m4_define([OVS_DPDK_START_VSWITCHD], AT_CHECK([ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-extra="--log-level=pmd.*:error $1"]) dnl Start ovs-vswitchd.
- AT_CHECK([ovs-vswitchd --detach --no-chdir --pidfile --log-file -vvconn -vofproto_dpif -vunixctl], [0], [stdout], [stderr]) + AT_CHECK([ovs-vswitchd $2 --detach --no-chdir --pidfile --log-file -vvconn -vofproto_dpif -vunixctl], [0], [stdout], [stderr]) AT_CAPTURE_FILE([ovs-vswitchd.log]) on_exit "kill_ovs_vswitchd `cat ovs-vswitchd.pid`" ]) +m4_define([OVS_DPDK_STOP_VSWITCHD], + [OVS_VSWITCHD_STOP([dnl +$1";/does not exist. The Open vSwitch kernel module is probably not loaded./d +/does not support MTU configuration,/d +/EAL: No \(available\|free\) .*hugepages reported/d +/Failed to enable flow control/d +/Rx checksum offload is not supported on/d +/TELEMETRY: No legacy callbacks, legacy socket not created/d"]) +]) + + # OVS_DPDK_CHECK_TESTPMD() # # Check dpdk-testpmd availability. diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index ab232e06da..f635d7f925 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -3,15 +3,6 @@ m4_define([CONFIGURE_VETH_OFFLOADS], AT_BANNER([OVS-DPDK unit tests]) -m4_define([SYSTEM_DPDK_ALLOWED_LOGS],[ -\@does not exist. The Open vSwitch kernel module is probably not loaded.@d -\@does not support MTU configuration,@d -\@EAL: No \(available\|free\) .*hugepages reported@d -\@Failed to enable flow control@d -\@Rx checksum offload is not supported on@d -\@TELEMETRY: No legacy callbacks, legacy socket not created@d -]) - dnl CHECK_MEMPOOL_PARAM([mtu], [numa], [+line]) dnl dnl Waits for logs to indicate that the user has configured a mempool @@ -36,7 +27,7 @@ OVS_DPDK_START([--no-pci]) AT_CHECK([grep "DPDK Enabled - initializing..." 
ovs-vswitchd.log], [], [stdout]) AT_CHECK([grep "EAL" ovs-vswitchd.log], [], [stdout]) AT_CHECK([grep "DPDK Enabled - initialized" ovs-vswitchd.log], [], [stdout]) -OVS_VSWITCHD_STOP("[SYSTEM_DPDK_ALLOWED_LOGS]") +OVS_DPDK_STOP_VSWITCHD AT_CLEANUP dnl -- @@ -58,7 +49,7 @@ sleep 2 dnl Clean up AT_CHECK([ovs-vsctl del-port br10 phy0], [], [stdout], [stderr]) -OVS_VSWITCHD_STOP("[SYSTEM_DPDK_ALLOWED_LOGS]") +OVS_DPDK_STOP_VSWITCHD AT_CLEANUP dnl -- @@ -84,9 +75,8 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." ov dnl Clean up AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr]) -OVS_VSWITCHD_STOP("m4_join([], [SYSTEM_DPDK_ALLOWED_LOGS], [ -\@VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) failed to connect: No such file or directory@d -])") +OVS_DPDK_STOP_VSWITCHD(["dnl +/VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d"]) AT_CLEANUP dnl -- @@ -150,12 +140,11 @@ OVS_WAIT_UNTIL([grep "vHost Device '$OVS_RUNDIR/dpdkvhostuser0' has been removed dnl Clean up AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuser0], [], [stdout], [stderr]) -OVS_VSWITCHD_STOP("m4_join([], [SYSTEM_DPDK_ALLOWED_LOGS], [ -\@VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostuser0) recvmsg
[ovs-dev] [PATCH v9 06/11] netdev-afxdp: Postpone libbpf logging helper registration.
When using the net/af_xdp DPDK driver alongside OVS native AF_XDP support,
confusing logs are reported, like:

netdev_dpdk|INFO|Device 'net_af_xdpp0,iface=ovs-p0' attached to DPDK
dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 11 created.
dpif_netdev|INFO|There are 1 pmd threads on numa node 0
dpdk|INFO|Device with port_id=0 already stopped
dpdk(pmd-c11/id:22)|INFO|PMD thread uses DPDK lcore 1.
netdev_dpdk|WARN|Rx checksum offload is not supported on port 0
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(6) .xdp_run_config
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7) xdp_metadata
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7) xdp_metadata
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7) xdp_metadata

This comes from the fact that netdev-afxdp unconditionally registers a
helper for logging libbpf messages.
Making both net/af_xdp and netdev-afxdp work at the same time seems
difficult, so at least, ensure that netdev-afxdp won't register this
helper unless a netdev is actually allocated.

Signed-off-by: David Marchand
Acked-by: Eelco Chaudron
---
 lib/netdev-afxdp.c | 12 ++--
 lib/netdev-afxdp.h | 1 -
 lib/netdev-linux.c | 1 -
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index b680a14798..54029722e0 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -1193,18 +1193,18 @@ libbpf_print(enum libbpf_print_level level,
     return 0;
 }
 
-int netdev_afxdp_init(void)
-{
-    libbpf_set_print(libbpf_print);
-    return 0;
-}
-
 int
 netdev_afxdp_construct(struct netdev *netdev)
 {
+    static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
     struct netdev_linux *dev = netdev_linux_cast(netdev);
     int ret;
 
+    if (ovsthread_once_start(&once)) {
+        libbpf_set_print(libbpf_print);
+        ovsthread_once_done(&once);
+    }
+
     /* Configure common netdev-linux first. */
     ret = netdev_linux_construct(netdev);
     if (ret) {
diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h
index bd3b9dfbea..236a37cc84 100644
--- a/lib/netdev-afxdp.h
+++ b/lib/netdev-afxdp.h
@@ -47,7 +47,6 @@ struct xsk_socket_info;
 
 int netdev_afxdp_rxq_construct(struct netdev_rxq *rxq_);
 void netdev_afxdp_rxq_destruct(struct netdev_rxq *rxq_);
-int netdev_afxdp_init(void);
 int netdev_afxdp_construct(struct netdev *netdev_);
 void netdev_afxdp_destruct(struct netdev *netdev_);
 int netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int mtu);
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 70521e3c7f..18b62d90c2 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -3754,7 +3754,6 @@ const struct netdev_class netdev_internal_class = {
 
 #ifdef HAVE_AF_XDP
 #define NETDEV_AFXDP_CLASS_COMMON \
-    .init = netdev_afxdp_init, \
     .construct = netdev_afxdp_construct,\
     .destruct = netdev_afxdp_destruct, \
     .get_stats = netdev_afxdp_get_stats,\
-- 
2.41.0
[ovs-dev] [PATCH v9 04/11] tests: Define a macro to skip tc relying tests.
Some unit tests expect that an OVS port has an associated netdevice on
which they can hook tc.
This will not be possible when testing the userspace datapath with DPDK.
Introduce a helper (which will be overridden in system-dpdk tests) and use
it in the existing tests.

Acked-by: Aaron Conole
Signed-off-by: David Marchand
Acked-by: Eelco Chaudron
---
Changes since v4:
- as the traffic tests now use net/af_xdp, it is not required to skip
  tests relying on tcpdump anymore,

---
 tests/system-common-macros.at    | 6 ++
 tests/system-offloads-traffic.at | 6 +++---
 tests/system-traffic.at          | 6 +++---
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 0077a8609c..0113aae8bd 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -297,6 +297,12 @@ m4_define([OVS_START_L7],
 #
 m4_define([OFPROTO_CLEAR_DURATION_IDLE], [[sed -e 's/duration=.*s,/duration=,/g' -e 's/idle_age=[0-9]*,/idle_age=,/g']])
 
+# OVS_CHECK_TC_QDISC()
+#
+# Macro to skip tests when tc qdisc can't be applied on a OVS port.
+m4_define([OVS_CHECK_TC_QDISC],
+    [AT_SKIP_IF([test $HAVE_TC = no])])
+
 # OVS_CHECK_TUNNEL_TSO()
 #
 # Macro to be used in general tunneling tests that could be also
diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index 5ad6b4bfdf..0bedee7530 100644
--- a/tests/system-offloads-traffic.at
+++ b/tests/system-offloads-traffic.at
@@ -20,7 +20,7 @@ m4_define([OVS_CHECK_ACTIONS], [
 
 m4_define([CHECK_TC_INGRESS_PPS],
 [
-    AT_SKIP_IF([test $HAVE_TC = "no"])
+    OVS_CHECK_TC_QDISC()
     AT_CHECK([ip link add ovs_tc_pps0 type veth peer name ovs_tc_pps1 dnl
               || exit 77])
     on_exit 'ip link del ovs_tc_pps0'
@@ -95,7 +95,7 @@ AT_CLEANUP
 
 AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - offloads disabled])
 AT_KEYWORDS([ingress_policing])
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START()
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-offload=false])
 AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
@@ -118,7 +118,7 @@ AT_CLEANUP
 
 AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - offloads enabled])
 AT_KEYWORDS([ingress_policing])
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . other_config:hw-offload=true])
 AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
 ADD_NAMESPACES(at_ns0)
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 7ea4502028..a7d4ed83bd 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2321,7 +2321,7 @@ AT_CLEANUP
 AT_BANNER([QoS])
 
 AT_SETUP([QoS - basic configuration])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START()
 
 ADD_NAMESPACES(at_ns0, at_ns1)
@@ -2355,7 +2355,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
 AT_SETUP([QoS - 64bit])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 AT_SKIP_IF([test $HAVE_TCA_HTB_RATE64 = no])
 OVS_TRAFFIC_VSWITCHD_START()
 
@@ -2383,7 +2383,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
 AT_SETUP([Ingress Policing - 64-bit])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 AT_SKIP_IF([test $HAVE_TCA_POLICE_PKTRATE64 = no])
 OVS_TRAFFIC_VSWITCHD_START()
 ADD_NAMESPACES(ns0)
-- 
2.41.0
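As a rough illustration of the "helper that a later testsuite overrides" idea (not part of the patch): the sketch below uses plain shell functions instead of m4 macros. The names check_tc_qdisc and run_tc_test are hypothetical, and how system-dpdk actually redefines the macro is an assumption here. Status 77 is the exit status Autotest treats as "skipped", as in the "|| exit 77" line quoted in the patch.

```shell
#!/bin/sh
# Default helper: ask for a skip (status 77) when tc is not available.
check_tc_qdisc() {
    [ "${HAVE_TC:-no}" != no ] || return 77
}

# Hypothetical test body that relies on the helper.
run_tc_test() {
    check_tc_qdisc || return $?
    echo "running tc test"
}

HAVE_TC=yes
run_tc_test || echo "skipped (status $?)"

# A later definition replaces the earlier one, much like a second
# m4_define would. Assumption: a DPDK testsuite always skips here.
check_tc_qdisc() {
    return 77
}
run_tc_test || echo "skipped (status $?)"
```

Callers keep using the same helper name, so only the definition has to change per testsuite.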
[ovs-dev] [PATCH v9 02/11] system-dpdk: Don't require hugetlbfs.
dpdk-testpmd does not need hugetlbfs backing as we don't require
multiprocess support in OVS unit tests.
Switch to --in-memory and remove the (then unneeded) check on
hugetlbfs presence.

Acked-by: Aaron Conole
Acked-by: Eelco Chaudron
Signed-off-by: David Marchand
---
Changes since v7:
- reverted use of dynamic allocations and kept initial memory
  reservation: this avoids random failures when testpmd and ovs-dpdk
  try to allocate memory at the same time,

---
 tests/system-dpdk-macros.at | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 2cfd26d840..a176a57a4b 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -7,9 +7,6 @@ m4_define([OVS_DPDK_PRE_CHECK],
   [dnl Check Hugepages
    AT_CHECK([cat /proc/meminfo], [], [stdout])
    AT_SKIP_IF([grep -E 'HugePages_Free: *0' stdout], [], [stdout])
-   AT_CHECK([mount], [], [stdout])
-   AT_CHECK([grep 'hugetlbfs' stdout], [], [stdout], [])
-
 ])
 
 
@@ -97,7 +94,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
 m4_define([OVS_DPDK_START_TESTPMD],
   [AT_CHECK([lscpu], [], [stdout])
    AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE])
-   eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 --single-file-segments --no-pci"
+   eal_options="--in-memory --socket-mem="$(cat NUMA_NODE)" --single-file-segments --no-pci"
    options="$1"
    test "$options" != "${options%% -- *}" || options="$options -- "
    eal_options="$eal_options ${options%% -- *}"
-- 
2.41.0
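The context lines of this hunk are dense, so here is a small standalone sketch of what they compute (not part of the patch). It wraps the same awk expression and the same parameter expansions in two shell functions; the function names socket_mem and split_options, the variables eal_part and testpmd_part, and the sample --vdev value are all made up for the illustration.

```shell
#!/bin/sh
# socket_mem: read lscpu-style output on stdin and print one "512" per
# NUMA node, comma separated, suitable as a --socket-mem value.
socket_mem() {
    grep "NUMA node(s)" |
        awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}'
}

# split_options: split "$1" at the first " -- " into an EAL part
# (eal_part) and a testpmd part (testpmd_part).
split_options() {
    options="$1"
    # No " -- " separator given: append an empty one so that both
    # expansions below behave the same way.
    test "$options" != "${options%% -- *}" || options="$options -- "
    eal_part="${options%% -- *}"      # everything before the first " -- "
    testpmd_part="${options#* -- }"   # everything after the first " -- "
}

printf 'CPU(s):        8\nNUMA node(s):  2\n' | socket_mem

split_options '--vdev=net_null0 -- --nb-cores=2'
echo "eal: $eal_part / testpmd: $testpmd_part"

split_options '--vdev=net_null0'
echo "eal: $eal_part / testpmd: $testpmd_part"
```

With two NUMA nodes the first command prints 512,512. When the caller passes no " -- ", the testpmd part ends up empty and only the EAL part is set.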
[ovs-dev] [PATCH v9 03/11] ci: Run DPDK tests in GitHub Actions.
Let's enhance our coverage in the CI and run DPDK system tests. A few DPDK drivers are enabled in DPDK compilation. Put DPDK build in $PATH for dpdk-testpmd to be available. sudo drops PATH= updates and -E alone does not seem to preserve this variable. Pass PATH=$PATH when running the tests, as a workaround. Since those tests are run as root, the collection of logs is updated accordingly. In GHA, only two cores are available but some tests rely on testpmd using three lcores. Add a DPDK_EAL_OPTIONS environment variable and use it to map all testpmd lcores to core 1 (and leave core 0 alone for OVS main and PMD threads). Signed-off-by: David Marchand Acked-by: Aaron Conole Acked-by: Eelco Chaudron --- Changes since v1: - rebased after DPDK build has been moved out of linux-build.sh, - restored running "normal" checks in the DPDK jobs, --- .ci/dpdk-build.sh| 7 --- .ci/linux-build.sh | 15 ++- .github/workflows/build-and-test.yml | 7 --- tests/system-dpdk-macros.at | 2 +- 4 files changed, 23 insertions(+), 8 deletions(-) diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh index 02dcefef61..35540f0694 100755 --- a/.ci/dpdk-build.sh +++ b/.ci/dpdk-build.sh @@ -35,9 +35,10 @@ function build_dpdk() DPDK_OPTS="$DPDK_OPTS -Ddeveloper_mode=disabled" # OVS compilation and "normal" unit tests (run in the CI) do not depend on -# any DPDK driver being present. -# We can disable all drivers to save compilation time. -DPDK_OPTS="$DPDK_OPTS -Ddisable_drivers=*/*" +# any DPDK driver. +# check-dpdk unit tests requires testpmd and some net/ driver. +DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd" +DPDK_OPTS="$DPDK_OPTS -Denable_drivers=net/null,net/tap,net/virtio" # Install DPDK using prefix. DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build" diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index 8227a57487..aa2ecc5050 100755 --- a/.ci/linux-build.sh +++ b/.ci/linux-build.sh @@ -22,6 +22,9 @@ function install_dpdk() # Export the following path for pkg-config to find the .pc file.
export PKG_CONFIG_PATH=$DPDK_LIB/pkgconfig/:$PKG_CONFIG_PATH +# Expose dpdk binaries. +export PATH=$(pwd)/dpdk-dir/build/bin:$PATH + if [ ! -f "${VERSION_FILE}" ]; then echo "Could not find DPDK in $(pwd)/dpdk-dir" return 1 @@ -113,7 +116,7 @@ fi OPTS="${EXTRA_OPTS} ${OPTS} $*" -if [ "$TESTSUITE" ]; then +if [ "$TESTSUITE" = 'test' ]; then # 'distcheck' will reconfigure with required options. # Now we only need to prepare the Makefile without sparse-wrapped CC. configure_ovs @@ -123,6 +126,16 @@ if [ "$TESTSUITE" ]; then TESTSUITEFLAGS=-j4 RECHECK=yes else build_ovs +for testsuite in $TESTSUITE; do +run_as_root= +if [ "${testsuite##*dpdk}" != "$testsuite" ]; then +sudo sh -c 'echo 1024 > /proc/sys/vm/nr_hugepages' || true +[ "$(cat /proc/sys/vm/nr_hugepages)" = '1024' ] +export DPDK_EAL_OPTIONS="--lcores 0@1,1@1,2@1" +run_as_root="sudo -E PATH=$PATH" +fi +$run_as_root make $testsuite TESTSUITEFLAGS=-j4 RECHECK=yes +done fi exit 0 diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml index bc5494e863..4f62efb7c3 100644 --- a/.github/workflows/build-and-test.yml +++ b/.github/workflows/build-and-test.yml @@ -123,10 +123,10 @@ jobs: opts: --enable-shared - compiler: gcc -testsuite:test +testsuite:check check-dpdk dpdk: dpdk - compiler: clang -testsuite:test +testsuite:check check-dpdk dpdk: dpdk - compiler: gcc @@ -213,7 +213,8 @@ jobs: mkdir logs cp config.log ./logs/ cp -r ./*/_build/sub/tests/testsuite.* ./logs/ || true -tar -czvf logs.tgz logs/ +sudo cp -r ./tests/*testsuite.* ./logs/ || true +sudo tar -czvf logs.tgz logs/ - name: upload logs on failure if: failure() || cancelled() diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index a176a57a4b..35d14bee8f 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/system-dpdk-macros.at @@ -94,7 +94,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD], m4_define([OVS_DPDK_START_TESTPMD], [AT_CHECK([lscpu], [], [stdout]) AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; 
while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE]) - eal_options="--in-memory --socket-mem="$(cat NUMA_NODE)" --single-file-segments --no-pci" + eal_options=&qu
[ovs-dev] [PATCH v9 01/11] system-dpdk: Introduce helpers for testpmd.
Rather than copy/paste everywhere, introduce helpers to control testpmd runs. Rely on --stats-period (which outputs port stats every n seconds) so that testpmd keeps running without expecting any user input. Acked-by: Aaron Conole Acked-by: Eelco Chaudron Signed-off-by: David Marchand --- Changes since v7: - fixed -- detection in arguments passed to OVS_DPDK_START_TESTPMD, - wrote the testpmd command to a debug file, Changes since v1: - fixed OVS_DPDK_START_TESTPMD passed arguments evaluation:: $@ -> $1, --- tests/system-dpdk-macros.at | 38 + tests/system-dpdk.at| 103 +--- 2 files changed, 62 insertions(+), 79 deletions(-) diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index 3920f08a5e..2cfd26d840 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/system-dpdk-macros.at @@ -79,3 +79,41 @@ m4_define([OVS_DPDK_START_VSWITCHD], AT_CAPTURE_FILE([ovs-vswitchd.log]) on_exit "kill_ovs_vswitchd `cat ovs-vswitchd.pid`" ]) + + +# OVS_DPDK_CHECK_TESTPMD() +# +# Check dpdk-testpmd availability. +# +m4_define([OVS_DPDK_CHECK_TESTPMD], + [AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null]) +]) + + +# OVS_DPDK_START_TESTPMD() +# +# Start dpdk-testpmd in background. +# +m4_define([OVS_DPDK_START_TESTPMD], + [AT_CHECK([lscpu], [], [stdout]) + AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE]) + eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 --single-file-segments --no-pci" + options="$1" + test "$options" != "${options%% -- *}" || options="$options -- " + eal_options="$eal_options ${options%% -- *}" + testpmd_options="-a --stats-period 2 ${options#* -- }" + echo "dpdk-testpmd $eal_options -- $testpmd_options" >testpmd.cmd + dpdk-testpmd $eal_options -- $testpmd_options >testpmd.log 2>&1 & \ + echo $! > testpmd.pid + on_exit "kill -9 `cat testpmd.pid`" +]) + + +# OVS_DPDK_STOP_TESTPMD() +# +# Stop background dpdk-testpmd. 
+# +m4_define([OVS_DPDK_STOP_TESTPMD], + [AT_CHECK([kill `cat testpmd.pid`]) + OVS_WAIT([kill -0 `cat testpmd.pid`], [kill -9 `cat testpmd.pid`]) +]) diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index fd42aed0b3..ab232e06da 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -97,13 +97,9 @@ dnl Ping vhost-user port AT_SETUP([OVS-DPDK - ping vhost-user ports]) AT_KEYWORDS([dpdk]) OVS_DPDK_PRE_CHECK() -AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null]) +OVS_DPDK_CHECK_TESTPMD() OVS_DPDK_START([--no-pci]) -dnl Find number of sockets -AT_CHECK([lscpu], [], [stdout]) -AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE]) - dnl Add userspace bridge and attach it to OVS AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuser0 -- set Interface dpdkvhostuser0 \ @@ -125,12 +121,8 @@ ADD_NAMESPACES(ns1, ns2) dnl Add veth device ADD_VETH(tap1, ns2, br10, "172.31.110.12/24") -dnl Execute testpmd in background -on_exit "pkill -f -x -9 'tail -f /dev/null'" -tail -f /dev/null | dpdk-testpmd --socket-mem="$(cat NUMA_NODE)" --no-pci\ - --vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostuser0" \ - --vdev="net_tap0,iface=tap0" --file-prefix page0 \ - --single-file-segments -- -a >$OVS_RUNDIR/testpmd-dpdkvhostuser0.log 2>&1 & +OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostuser0" \ +--vdev="net_tap0,iface=tap0"]) OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log]) OVS_WAIT_UNTIL([ip link show dev tap0 | grep -qw LOWER_UP]) @@ -151,8 +143,7 @@ AT_CHECK([ip netns exec ns2 ip link show], [], [stdout], [stderr]) AT_CHECK([ip netns exec ns1 ping -c 4 -I tap0 172.31.110.12], [], [stdout], [stderr]) -dnl Clean up the testpmd now -pkill -f -x -9 'tail -f /dev/null' +OVS_DPDK_STOP_TESTPMD() dnl Wait for vhost-user handling the socket disconnect. 
OVS_WAIT_UNTIL([grep "vHost Device '$OVS_RUNDIR/dpdkvhostuser0' has been removed" ovs-vswitchd.log]) @@ -173,13 +164,9 @@ dnl Ping vhost-user-client port AT_SETUP([OVS-DPDK - ping vhost-user-client ports]) AT_KEYWORDS([dpdk]) OVS_DPDK_PRE_CHECK() -AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null]) +OVS_DPDK_CHECK_TESTPMD() OVS_DPDK_START([--no-pci]) -dnl Find number of sockets -AT_CHECK([lscpu], [], [stdout]) -AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' >
Re: [ovs-dev] [PATCH v2 1/2] mcast-snooping: Test per port explicit flooding.
On Thu, Nov 16, 2023 at 12:32 PM Ilya Maximets wrote:
> > +AT_CHECK([
> > +ovs-vsctl set bridge br0 \
> > +datapath_type=dummy \
> > +mcast_snooping_enable=true \
> > +other-config:mcast-snooping-disable-flood-unregistered=false
>
> Nit:
> Not a full review, but in case you're sending a new version for Eelco's
> comments, please, add more indentation to the 3 lines above, so they
> are not on the same level with ovs-vsctl.

I did not see this comment.
Well, I'll wait for a full review before sending a new revision...

-- 
David Marchand
[ovs-dev] [PATCH v3 3/3] mcast-snooping: Fix comments format.
Capitalize comments and end them with a . when needed. Signed-off-by: David Marchand --- tests/mcast-snooping.at | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at index faeb7890d9..890e6aca00 100644 --- a/tests/mcast-snooping.at +++ b/tests/mcast-snooping.at @@ -31,13 +31,13 @@ dummy@ovs-dummy: hit:0 missed:0 ovs-appctl time/stop -# Send IGMPv3 query on p2 with vlan 1725 +# Send IGMPv3 query on p2 with vlan 1725. # 5c:8a:38:55:25:52 > 01:00:5e:00:00:01, ethertype 802.1Q (0x8100), length 64: vlan 1725, p 0, ethertype IPv4, # 172.17.25.1 > 224.0.0.1: igmp query v3 AT_CHECK([ovs-appctl netdev-dummy/receive p2 \ '01005e015c8a38552552810006bd080046c0002401027f00ac111901e00194041164ec1e027d']) -# Send IGMPv3 query on p2 with vlan 1728 +# Send IGMPv3 query on p2 with vlan 1728. # 5c:8a:38:55:25:52 > 01:00:5e:00:00:01, ethertype 802.1Q (0x8100), length 64: vlan 1728, p 0, ethertype IPv4, # 172.17.28.1 > 224.0.0.1: igmp query v3 AT_CHECK([ovs-appctl netdev-dummy/receive p2 \ @@ -51,13 +51,13 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl AT_CHECK([ovs-vsctl set Interface p2 options:tx_pcap=p2.pcap]) -# Send a multicast packet on p1 +# Send a multicast packet on p1. AT_CHECK([ ovs-appctl netdev-dummy/receive p1 \ 'in_port(1),eth(src=aa:55:aa:55:00:01,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=239.94.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)' ]) -# Check this packet was forwarded exactly once to p2 and has vlan tag 1725 +# Check this packet was forwarded exactly once to p2 and has vlan tag 1725. # aa:55:aa:55:00:01 > 01:00:5e:5e:01:01, ethertype 802.1Q (0x8100), length 46: vlan 1725, p 0, ethertype IPv4, # 10.0.0.1.0 > 239.94.1.1.8000: UDP, length 0 AT_CHECK([ovs-pcap p2.pcap > p2.pcap.txt 2>&1]) @@ -450,7 +450,7 @@ AT_CHECK([ ovs-appctl time/stop -# send report packets +# Send report packets. 
AT_CHECK([ ovs-appctl netdev-dummy/receive p1 \ '01005E010101000C29A027A181010800451C00014002CBAEAC10221EE001010112140CE9E0010101' @@ -458,7 +458,7 @@ AT_CHECK([ '01005E010101000C29A027A281020800451C00014002CBAEAC10221EE001010112140CE9E0010101' ], [0]) -# send query packets +# Send query packets. AT_CHECK([ ovs-appctl netdev-dummy/receive p3 \ '01005E010101000C29A027D181010800451C00014002CBCBAC102201E0010104EEEB' @@ -505,7 +505,7 @@ AT_CHECK([ ovs-appctl time/stop -# send report packets +# Send report packets. AT_CHECK([ ovs-appctl netdev-dummy/receive p1 \ '01005E010101000C29A027A181010800451C00014002CBAEAC10221EE001010112140CE9E0010101' @@ -513,7 +513,7 @@ AT_CHECK([ '01005E010101000C29A027A281020800451C00014002CBAEAC10221EE001010112140CE9E0010101' ], [0]) -# send query packets +# Send query packets. AT_CHECK([ ovs-appctl netdev-dummy/receive p2 \ '01005E010101000C29A027D181010800451C00014002CBCBAC102201E0010104EEEB' -- 2.41.0
[ovs-dev] [PATCH v3 1/3] mcast-snooping: Test per port explicit flooding.
Various options affect how the mcast snooping module works. When multicast snooping is enabled and a reporter is known, it is still possible to flood associated packets to some other port via the mcast-snooping-flood option. If flooding unregistered traffic is disabled, it is still possible to flood multicast traffic too with the mcast-snooping-flood option. IGMP reports may have to be flooded to some ports explicitly with the mcast-snooping-flood-reports option. Test those parameters. Acked-by: Simon Horman Acked-by: Paolo Valerio Signed-off-by: David Marchand --- Changes since v2: - fixed comment, Changes since v1: - fixed dest mac address, - added tests for mcast-snooping-disable-flood-unregistered=true and mcast-snooping-flood-reports, --- tests/mcast-snooping.at | 280 1 file changed, 280 insertions(+) diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at index d5b7c4774c..9797bca531 100644 --- a/tests/mcast-snooping.at +++ b/tests/mcast-snooping.at @@ -105,6 +105,286 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([mcast - check multicast per port flooding]) +OVS_VSWITCHD_START([]) + +AT_CHECK([ +ovs-vsctl set bridge br0 \ +datapath_type=dummy \ +mcast_snooping_enable=true \ +other-config:mcast-snooping-disable-flood-unregistered=false +], [0]) + +AT_CHECK([ovs-ofctl add-flow br0 action=normal]) + +AT_CHECK([ +ovs-vsctl add-port br0 p1 \ +-- set Interface p1 type=dummy other-config:hwaddr=aa:55:aa:55:00:01 ofport_request=1 \ +-- add-port br0 p2 \ +-- set Interface p2 type=dummy other-config:hwaddr=aa:55:aa:55:00:02 ofport_request=2 \ +-- add-port br0 p3 \ +-- set Interface p3 type=dummy other-config:hwaddr=aa:55:aa:55:00:03 ofport_request=3 \ +], [0]) + +ovs-appctl time/stop + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [stdout]) +AT_CHECK([grep -v
'Datapath actions:' stdout], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +- + 0. priority 32768 +NORMAL + -> unregistered multicast, flooding + +Final flow: unchanged +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no +]) +AT_CHECK([sed -ne 's/^Datapath actions: \(.*\)$/\1/p' stdout | tr "," "\n" | sort -n], [0], [dnl +1 +2 +100 +]) + +# Send report packets. +AT_CHECK([ +ovs-appctl netdev-dummy/receive p1 \ + '01005E010101000C29A027A10800451C00014002CBAEAC10221EE001010112140CE9E0010101' +], [0]) +AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl + port VLAN GROUPAge +1 0 224.1.1.1 0 +]) + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +- + 0. priority 32768 +NORMAL + -> forwarding to mcast group port + +Final flow: unchanged +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no +Datapath actions: 1 +]) + +AT_CHECK([ovs-vsctl set port p2 other_config:mcast-snooping-flood=true]) + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +- + 0. 
priority 32768 +NORMAL + -> forwarding to mcast group port + -> forwarding to mcast flood port + +Final flow: unchanged +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no +Datapath actions: 1,2 +]) + +AT_CHECK([ovs-vsctl set port p3 other_config:mcast-snooping-flood=true]) + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_
[ovs-dev] [PATCH v3 2/3] mcast-snooping: Flush flood and report ports when deleting interfaces.
When a configuration change triggers an interface destruction/creation
(like for example, setting ofport_request), a port object may still be
referenced as a fport or a rport in the mdb.

Before the fix, when flooding multicast traffic:
bridge("br0")
-------------
 0. priority 32768
    NORMAL
     -> forwarding to mcast group port
     >> mcast flood port is unknown, dropping
     -> mcast flood port is input port, dropping
     -> forwarding to mcast flood port

Before the fix, when flooding igmp report traffic:
bridge("br0")
-------------
 0. priority 32768
    NORMAL
     >> mcast port is unknown, dropping the report
     -> forwarding report to mcast flagged port
     -> mcast port is input port, dropping the Report
     -> forwarding report to mcast flagged port

Add relevant cleanup and update unit tests.

Fixes: 4fbbf8624868 ("mcast-snooping: Flush ports mdb when VLAN configuration changed.")
Acked-by: Simon Horman
Acked-by: Paolo Valerio
Signed-off-by: David Marchand
---
Changes since v2:
- christmas tree,
- added some comments in tests,

Changes since v1:
- updated the test on report flooding,

---
 lib/mcast-snooping.c    | 17 ++++++++++++++++-
 tests/mcast-snooping.at | 42 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c
index 029ca28558..43805ae4d5 100644
--- a/lib/mcast-snooping.c
+++ b/lib/mcast-snooping.c
@@ -946,8 +946,9 @@ mcast_snooping_wait(struct mcast_snooping *ms)
 void
 mcast_snooping_flush_bundle(struct mcast_snooping *ms, void *port)
 {
-    struct mcast_group *g;
     struct mcast_mrouter_bundle *m;
+    struct mcast_port_bundle *p;
+    struct mcast_group *g;
 
     if (!mcast_snooping_enabled(ms)) {
         return;
@@ -971,5 +972,19 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, void *port)
         }
     }
 
+    LIST_FOR_EACH_SAFE (p, node, &ms->fport_list) {
+        if (p->port == port) {
+            mcast_snooping_flush_port(p);
+            ms->need_revalidate = true;
+        }
+    }
+
+    LIST_FOR_EACH_SAFE (p, node, &ms->rport_list) {
+        if (p->port == port) {
+            mcast_snooping_flush_port(p);
+            ms->need_revalidate = true;
+        }
+    }
+
     ovs_rwlock_unlock(&ms->rwlock);
 }
diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index 9797bca531..faeb7890d9 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -207,6 +207,26 @@ Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e
 Datapath actions: 1,2
 ])
 
+# Change p2 ofport to force a ofbundle change and check that the mdb contains
+# no stale port.
+AT_CHECK([ovs-vsctl set interface p2 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl
+Flow: udp,in_port=3,vlan_tci=0x0000,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-------------
+ 0. priority 32768
+    NORMAL
+     -> forwarding to mcast group port
+     -> mcast flood port is input port, dropping
+     -> forwarding to mcast flood port
+
+Final flow: unchanged
+Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
@@ -381,6 +401,28 @@ This flow is handled by the userspace slow path because it:
   - Uses action(s) not supported by datapath.
 ])
 
+# Change p2 ofport to force a ofbundle change and check that the mdb contains
+# no stale port.
+AT_CHECK([ovs-vsctl set interface p3 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace "in_port(1)" '01005E010101000C29A027A108004500001C000100004002CBAEAC10221EE001010112140CE9E0010101'], [0], [dnl
+Flow: ip,in_port=1,vlan_tci=0x0000,dl_src=00:0c:29:a0:27:a1,dl_dst=01:00:5e:01:01:01,nw_src=172.16.34.30,nw_dst=224.1.1.1,nw_proto=2,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=18,tp_dst=20
+
+bridge("br0")
+-------------
+ 0. priority 32768
+    NORMAL
+     -> forwarding report to mcast flagged port
+     -> mcast port is input port, dropping the Report
+     -> forwarding report to mcast flagged port
+
+Final flow: unchanged
+Megaflow: recirc_id=0,eth,ip,in_port=1,dl_src=00:0c:29:a0:27:a1,dl_dst=01:00:5e:01:01:01,nw_proto=2,nw_frag=no
+Datapath actions: 2,3
+This flow is handled by the userspace slow path because it:
+  - Uses action(s) not supported by datapath.
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
-- 
2.41.0

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
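The flush loops added to mcast_snooping_flush_bundle() follow a common pattern: walk a list, free every entry that references the port being destroyed, and request a revalidation only if something was removed. The sketch below illustrates that pattern outside of OVS. It is not the OVS implementation: `struct port_entry`, `push_port_entry()`, `flush_port_entries()` and `count_port_entries()` are hypothetical names, and a hand-rolled singly linked list stands in for `struct ovs_list`. What it shows is why the iteration has to be "safe", that is, why the next pointer is saved before the current entry is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a flood/report port entry in the mdb. */
struct port_entry {
    struct port_entry *next;
    const void *port;           /* Opaque port object the entry refers to. */
};

/* Adds an entry for 'port' at the front of the list.  Returns false if the
 * allocation failed. */
static bool
push_port_entry(struct port_entry **head, const void *port)
{
    struct port_entry *e = malloc(sizeof *e);

    assert(head != NULL);
    if (!e) {
        return false;
    }
    e->port = port;
    e->next = *head;
    *head = e;
    return true;
}

/* Removes and frees every entry that references 'port'.  Returns true if at
 * least one entry was removed, i.e. if the caller should revalidate. */
static bool
flush_port_entries(struct port_entry **head, const void *port)
{
    struct port_entry **prev = head;
    struct port_entry *e;
    bool need_revalidate = false;

    assert(head != NULL);
    e = *head;
    while (e) {
        /* Saved before 'e' may be freed: this is the "safe" part. */
        struct port_entry *next = e->next;

        if (e->port == port) {
            *prev = next;
            free(e);
            need_revalidate = true;
        } else {
            prev = &e->next;
        }
        e = next;
    }
    return need_revalidate;
}

/* Returns the number of entries in the list. */
static size_t
count_port_entries(const struct port_entry *head)
{
    size_t n = 0;

    for (; head; head = head->next) {
        n++;
    }
    return n;
}
```

Flushing a port that is not referenced leaves the list untouched and returns false, which mirrors the patch setting need_revalidate only inside the `p->port == port` branch.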
Re: [ovs-dev] [PATCH v2 2/2] mcast-snooping: Flush flood and report ports when deleting interfaces.
On Thu, Nov 16, 2023 at 10:38 AM Eelco Chaudron wrote:
> > diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
> > index b5474cf392..1ce31168e8 100644
> > --- a/tests/mcast-snooping.at
> > +++ b/tests/mcast-snooping.at
> > @@ -207,6 +207,24 @@ Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e
> >  Datapath actions: 1,2
> >  ])
> >
> > +AT_CHECK([ovs-vsctl set interface p2 ofport_request=4])
>
> Can we add a comment here (and below) to indicate why we do this? Just to
> understand what we test here.

Wdyt of:
+# Change p2 ofport to force a ofbundle change and check that the mdb contains
+# no stale port.

-- 
David Marchand
Re: [ovs-dev] [PATCH v2 1/2] mcast-snooping: Test per port explicit flooding.
On Thu, Nov 16, 2023 at 11:57 AM Eelco Chaudron wrote:
> On 10 Nov 2023, at 18:52, David Marchand wrote:
> > +Final flow: unchanged
> > +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
> > +])
> > +AT_CHECK([sed -ne 's/^Datapath actions: \(.*\)$/\1/p' stdout | tr "," "\n" | sort -n], [0], [dnl
> > +1
> > +2
> > +100
> > +])
> > +
> > +# send report packets
>
> Please add capital and dots to all comments.

I don't mind, but the rest of this file is not consistent with this
convention.

$ git grep \\# origin/master -- tests/mcast-snooping.at
...
origin/master:tests/mcast-snooping.at:# send report packets
origin/master:tests/mcast-snooping.at:# send query packets
origin/master:tests/mcast-snooping.at:# send report packets
origin/master:tests/mcast-snooping.at:# send query packets

-- 
David Marchand
Re: [ovs-dev] [PATCH v2 1/2] mcast-snooping: Test per port explicit flooding.
Hello Eelco,

On Thu, Nov 16, 2023 at 11:57 AM Eelco Chaudron wrote:

[snip]

> > +bridge("br0")
> > +-------------
> > + 0. priority 32768
> > +    NORMAL
> > +     -> forwarding to mcast group port
> > +     -> forwarding to mcast flood port
> > +
> > +Final flow: unchanged
> > +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
> > +Datapath actions: 1,2
>
>
> Are we sure the order here is always 1,2 vs the first test you sorted them?
> Same for all the other multi-port tests below?
>
> I did run the test 200+ times, and it seems ok. Trying to understand this, as
> I can see the first one reporting 100,1,2 and 100,2,1.

            struct mcast_output out = MCAST_OUTPUT_INIT;
...
            if (grp) {
                xlate_normal_mcast_send_group(ctx, ms, grp, in_xbundle, &out);
                xlate_normal_mcast_send_fports(ctx, ms, in_xbundle, &out);
                xlate_normal_mcast_send_mrouters(ctx, ms, in_xbundle, &xvlan,
                                                 &out);
...
            mcast_output_finish(ctx, &out, in_xbundle, &xvlan);

With:

static void
mcast_output_finish(struct xlate_ctx *ctx, struct mcast_output *out,
                    struct xbundle *in_xbundle, struct xvlan *xvlan)
{
    if (out->flood) {
        xlate_normal_flood(ctx, in_xbundle, xvlan);
    } else {
        for (size_t i = 0; i < out->n; i++) {
            output_normal(ctx, out->xbundles[i], xvlan);
        }
    }
...

In this case, there is no flooding (contrary to previous tests) over
all the ports from this bridge.
There is only one "group" port and one "flood" port and the order is fixed.

-- 
David Marchand
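The ordering argument above can be modelled with a small standalone sketch. It is not the OVS code: `struct out_set`, `out_add()` and `out_finish()` are hypothetical simplifications of `struct mcast_output` and `mcast_output_finish()`, ports are plain integers, and the flood path simply copies a caller-supplied list of bridge ports whose order the sketch does not control. The point is that, when no flooding is requested, ports come out in the order in which they were collected (group ports first, then flood ports), which is why an expectation such as `1,2` is stable while a flooded result needs sorting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OUT_MAX 8

/* Hypothetical, simplified model of the per-packet output accumulator. */
struct out_set {
    bool flood;                 /* Flood to every bridge port instead. */
    int ports[OUT_MAX];         /* Ports collected so far, in call order. */
    size_t n;
};

/* Appends 'port' to the accumulator, preserving insertion order. */
static void
out_add(struct out_set *out, int port)
{
    assert(out != NULL && out->n < OUT_MAX);
    out->ports[out->n++] = port;
}

/* Writes the final output ports to 'result' (capacity OUT_MAX) and returns
 * how many were written.  When flooding, every bridge port except 'in_port'
 * is emitted in the order given by 'bridge_ports'; otherwise the collected
 * ports are emitted in the order they were added. */
static size_t
out_finish(const struct out_set *out, int in_port,
           const int *bridge_ports, size_t n_bridge_ports, int *result)
{
    size_t n = 0;

    assert(out != NULL && result != NULL);
    if (out->flood) {
        for (size_t i = 0; i < n_bridge_ports; i++) {
            if (bridge_ports[i] != in_port) {
                assert(n < OUT_MAX);
                result[n++] = bridge_ports[i];
            }
        }
    } else {
        for (size_t i = 0; i < out->n; i++) {
            result[n++] = out->ports[i];
        }
    }
    return n;
}
```

With one group port (1) and one flood port (2) added in that order, the non-flood path always yields 1 then 2. The flood path depends entirely on the order of the bridge port list, which matches the observation that the first test can report 100,1,2 or 100,2,1.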
[ovs-dev] [PATCH v2 2/2] mcast-snooping: Flush flood and report ports when deleting interfaces.
When a configuration change triggers an interface destruction/creation
(like for example, setting ofport_request), a port object may still be
referenced as a fport or a rport in the mdb.

Before the fix, when flooding multicast traffic:
bridge("br0")
-------------
 0. priority 32768
    NORMAL
     -> forwarding to mcast group port
     >> mcast flood port is unknown, dropping
     -> mcast flood port is input port, dropping
     -> forwarding to mcast flood port

Before the fix, when flooding igmp report traffic:
bridge("br0")
-------------
 0. priority 32768
    NORMAL
     >> mcast port is unknown, dropping the report
     -> forwarding report to mcast flagged port
     -> mcast port is input port, dropping the Report
     -> forwarding report to mcast flagged port

Add relevant cleanup and update unit tests.

Fixes: 4fbbf8624868 ("mcast-snooping: Flush ports mdb when VLAN configuration changed.")
Signed-off-by: David Marchand
---
Changes since v1:
- updated the test on report flooding,

---
 lib/mcast-snooping.c    | 15 +++++++++++++++
 tests/mcast-snooping.at | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+)

diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c
index 029ca28558..34755447f8 100644
--- a/lib/mcast-snooping.c
+++ b/lib/mcast-snooping.c
@@ -948,6 +948,7 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, void *port)
 {
     struct mcast_group *g;
     struct mcast_mrouter_bundle *m;
+    struct mcast_port_bundle *p;
 
     if (!mcast_snooping_enabled(ms)) {
         return;
@@ -971,5 +972,19 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, void *port)
         }
     }
 
+    LIST_FOR_EACH_SAFE (p, node, &ms->fport_list) {
+        if (p->port == port) {
+            mcast_snooping_flush_port(p);
+            ms->need_revalidate = true;
+        }
+    }
+
+    LIST_FOR_EACH_SAFE (p, node, &ms->rport_list) {
+        if (p->port == port) {
+            mcast_snooping_flush_port(p);
+            ms->need_revalidate = true;
+        }
+    }
+
     ovs_rwlock_unlock(&ms->rwlock);
 }
diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index b5474cf392..1ce31168e8 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -207,6 +207,24 @@ Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e
 Datapath actions: 1,2
 ])
 
+AT_CHECK([ovs-vsctl set interface p2 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl
+Flow: udp,in_port=3,vlan_tci=0x0000,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-------------
+ 0. priority 32768
+    NORMAL
+     -> forwarding to mcast group port
+     -> mcast flood port is input port, dropping
+     -> forwarding to mcast flood port
+
+Final flow: unchanged
+Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
@@ -381,6 +399,26 @@ This flow is handled by the userspace slow path because it:
   - Uses action(s) not supported by datapath.
 ])
 
+AT_CHECK([ovs-vsctl set interface p3 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace "in_port(1)" '01005E010101000C29A027A108004500001C000100004002CBAEAC10221EE001010112140CE9E0010101'], [0], [dnl
+Flow: ip,in_port=1,vlan_tci=0x0000,dl_src=00:0c:29:a0:27:a1,dl_dst=01:00:5e:01:01:01,nw_src=172.16.34.30,nw_dst=224.1.1.1,nw_proto=2,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=18,tp_dst=20
+
+bridge("br0")
+-------------
+ 0. priority 32768
+    NORMAL
+     -> forwarding report to mcast flagged port
+     -> mcast port is input port, dropping the Report
+     -> forwarding report to mcast flagged port
+
+Final flow: unchanged
+Megaflow: recirc_id=0,eth,ip,in_port=1,dl_src=00:0c:29:a0:27:a1,dl_dst=01:00:5e:01:01:01,nw_proto=2,nw_frag=no
+Datapath actions: 2,3
+This flow is handled by the userspace slow path because it:
+  - Uses action(s) not supported by datapath.
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
-- 
2.41.0
[ovs-dev] [PATCH v2 1/2] mcast-snooping: Test per port explicit flooding.
Various options affect how the mcast snooping module work. When multicast snooping is enabled and a reporter is known, it is still possible to flood associated packets to some other port via the mcast-snooping-flood option. If flooding unregistered traffic is disabled, it is still possible to flood multicast traffic too with the mcast-snooping-flood option. IGMP reports may have to be flooded to some ports explicitly with the mcast-snooping-flood-reports option. Test those parameters. Signed-off-by: David Marchand --- Changes since v1: - fixed dest mac address, - added tests for mcast-snooping-disable-flood-unregistered=true and mcast-snooping-flood-reports, --- tests/mcast-snooping.at | 280 1 file changed, 280 insertions(+) diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at index d5b7c4774c..b5474cf392 100644 --- a/tests/mcast-snooping.at +++ b/tests/mcast-snooping.at @@ -105,6 +105,286 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([mcast - check multicast per port flooding]) +OVS_VSWITCHD_START([]) + +AT_CHECK([ +ovs-vsctl set bridge br0 \ +datapath_type=dummy \ +mcast_snooping_enable=true \ +other-config:mcast-snooping-disable-flood-unregistered=false +], [0]) + +AT_CHECK([ovs-ofctl add-flow br0 action=normal]) + +AT_CHECK([ +ovs-vsctl add-port br0 p1 \ +-- set Interface p1 type=dummy other-config:hwaddr=aa:55:aa:55:00:01 ofport_request=1 \ +-- add-port br0 p2 \ +-- set Interface p2 type=dummy other-config:hwaddr=aa:55:aa:55:00:02 ofport_request=2 \ +-- add-port br0 p3 \ +-- set Interface p3 type=dummy other-config:hwaddr=aa:55:aa:55:00:03 ofport_request=3 \ +], [0]) + +ovs-appctl time/stop + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [stdout]) +AT_CHECK([grep -v 'Datapath actions:' stdout], [0], [dnl +Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +- + 0. priority 32768 +NORMAL + -> unregistered multicast, flooding + +Final flow: unchanged +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no +]) +AT_CHECK([sed -ne 's/^Datapath actions: \(.*\)$/\1/p' stdout | tr "," "\n" | sort -n], [0], [dnl +1 +2 +100 +]) + +# send report packets +AT_CHECK([ +ovs-appctl netdev-dummy/receive p1 \ + '01005E010101000C29A027A10800451C00014002CBAEAC10221EE001010112140CE9E0010101' +], [0]) +AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl + port VLAN GROUPAge +1 0 224.1.1.1 0 +]) + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +- + 0. priority 32768 +NORMAL + -> forwarding to mcast group port + +Final flow: unchanged +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no +Datapath actions: 1 +]) + +AT_CHECK([ovs-vsctl set port p2 other_config:mcast-snooping-flood=true]) + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +- + 0. 
priority 32768 +NORMAL + -> forwarding to mcast group port + -> forwarding to mcast flood port + +Final flow: unchanged +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no +Datapath actions: 1,2 +]) + +AT_CHECK([ovs-vsctl set port p3 other_config:mcast-snooping-flood=true]) + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +-
Re: [ovs-dev] [PATCH 1/2] mcast-snooping: Test per port explicit flooding.
On Thu, Nov 9, 2023 at 4:33 PM Paolo Valerio wrote:
>
> David Marchand writes:
>
> > When multicast snooping is enabled and a reporter is known, it is still
> > possible to flood associated packets to some other port via the
> > mcast-snooping-flood option.
> >
> > Test this combination.
> >
> > Signed-off-by: David Marchand
> > ---
> >  tests/mcast-snooping.at | 88 +++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 88 insertions(+)
> >
> > diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
> > index d5b7c4774c..21c806ef63 100644
> > --- a/tests/mcast-snooping.at
> > +++ b/tests/mcast-snooping.at
> > @@ -105,6 +105,94 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
> >  OVS_VSWITCHD_STOP
> >  AT_CLEANUP
> >
> > +
> > +AT_SETUP([mcast - check flooding on ports])
> > +OVS_VSWITCHD_START([])
> > +
> > +AT_CHECK([
> > +    ovs-vsctl set bridge br0 \
> > +    datapath_type=dummy \
> > +    mcast_snooping_enable=true \
> > +    other-config:mcast-snooping-disable-flood-unregistered=false
> > +], [0])
> > +
>
> in the case flood unregistered is disabled packets are supposed to
> be sent to flood ports. While at it, it might also be worth testing that
> like in the quick example at the end I used to test it.
> WDYT?

It sounds reasonable yes.
I was also considering testing reports flooding.
WDYT?

>
> > +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> > +
> > +AT_CHECK([
> > +    ovs-vsctl add-port br0 p1 \
> > +    -- set Interface p1 type=dummy other-config:hwaddr=aa:55:aa:55:00:01 ofport_request=1 \
> > +    -- add-port br0 p2 \
> > +    -- set Interface p2 type=dummy other-config:hwaddr=aa:55:aa:55:00:02 ofport_request=2 \
> > +    -- add-port br0 p3 \
> > +    -- set Interface p3 type=dummy other-config:hwaddr=aa:55:aa:55:00:03 ofport_request=3 \
> > +], [0])
> > +
> > +ovs-appctl time/stop
> > +
> > +# send report packets
> > +AT_CHECK([
> > +    ovs-appctl netdev-dummy/receive p1 \
> > +        '01005E010101000C29A027A108004500001C000100004002CBAEAC10221EE001010112140CE9E0010101'
> > +], [0])
> > +
> > +AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
> > + port  VLAN  GROUP                Age
> > +    1     0  224.1.1.1           0
> > +])
> > +
> > +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl
> > +Flow: udp,in_port=3,vlan_tci=0x0000,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
> > +
>
> I think the mac for 224.1.1.1 maps to 01:00:5e:01:01:01.

Argh.. indeed, wrong copy/paste.
Thanks for the review!

>
> > +bridge("br0")
> > +-------------
> > + 0. priority 32768
> > +    NORMAL
> > +     -> forwarding to mcast group port

-- 
David Marchand
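The correction discussed above follows from the standard IPv4 multicast to Ethernet mapping (RFC 1112): the low 23 bits of the group address are placed into the fixed 01:00:5e:00:00:00 prefix. A minimal sketch of that mapping is given here for reference. `ipv4_mcast_to_mac()` is a hypothetical helper written for this illustration, not an OVS function, and it assumes the address is passed in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Maps an IPv4 multicast address (host byte order) to the Ethernet multicast
 * address used to carry it: 01:00:5e followed by the low 23 bits of the IP
 * address.  Hypothetical helper for illustration only. */
static void
ipv4_mcast_to_mac(uint32_t ip, uint8_t mac[6])
{
    /* Only 224.0.0.0/4 addresses are multicast. */
    assert((ip >> 28) == 0xe);

    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (ip >> 16) & 0x7f;     /* Top bit of this byte is always 0. */
    mac[4] = (ip >> 8) & 0xff;
    mac[5] = ip & 0xff;
}
```

For 224.1.1.1 (0xE0010101) this gives 01:00:5e:01:01:01, the value used in later revisions of the test, and not 01:00:5e:5e:01:01. Because only 23 bits are kept, 224.129.1.1 maps to the same Ethernet address.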
[ovs-dev] [PATCH 1/2] mcast-snooping: Test per port explicit flooding.
When multicast snooping is enabled and a reporter is known, it is still possible to flood associated packets to some other port via the mcast-snooping-flood option. Test this combination. Signed-off-by: David Marchand --- tests/mcast-snooping.at | 88 + 1 file changed, 88 insertions(+) diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at index d5b7c4774c..21c806ef63 100644 --- a/tests/mcast-snooping.at +++ b/tests/mcast-snooping.at @@ -105,6 +105,94 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl OVS_VSWITCHD_STOP AT_CLEANUP + +AT_SETUP([mcast - check flooding on ports]) +OVS_VSWITCHD_START([]) + +AT_CHECK([ +ovs-vsctl set bridge br0 \ +datapath_type=dummy \ +mcast_snooping_enable=true \ +other-config:mcast-snooping-disable-flood-unregistered=false +], [0]) + +AT_CHECK([ovs-ofctl add-flow br0 action=normal]) + +AT_CHECK([ +ovs-vsctl add-port br0 p1 \ +-- set Interface p1 type=dummy other-config:hwaddr=aa:55:aa:55:00:01 ofport_request=1 \ +-- add-port br0 p2 \ +-- set Interface p2 type=dummy other-config:hwaddr=aa:55:aa:55:00:02 ofport_request=2 \ +-- add-port br0 p3 \ +-- set Interface p3 type=dummy other-config:hwaddr=aa:55:aa:55:00:03 ofport_request=3 \ +], [0]) + +ovs-appctl time/stop + +# send report packets +AT_CHECK([ +ovs-appctl netdev-dummy/receive p1 \ + '01005E010101000C29A027A10800451C00014002CBAEAC10221EE001010112140CE9E0010101' +], [0]) + +AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl + port VLAN GROUPAge +1 0 224.1.1.1 0 +]) + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +- + 0. 
priority 32768 +NORMAL + -> forwarding to mcast group port + +Final flow: unchanged +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_dst=224.1.1.1,nw_frag=no +Datapath actions: 1 +]) + +AT_CHECK([ovs-vsctl set port p2 other_config:mcast-snooping-flood=true]) + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +- + 0. priority 32768 +NORMAL + -> forwarding to mcast group port + -> forwarding to mcast flood port + +Final flow: unchanged +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_dst=224.1.1.1,nw_frag=no +Datapath actions: 1,2 +]) + +AT_CHECK([ovs-vsctl set port p3 other_config:mcast-snooping-flood=true]) + +AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl +Flow: udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000 + +bridge("br0") +- + 0. 
priority 32768 +NORMAL + -> forwarding to mcast group port + -> forwarding to mcast flood port + -> mcast flood port is input port, dropping + +Final flow: unchanged +Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_dst=224.1.1.1,nw_frag=no +Datapath actions: 1,2 +]) + +OVS_VSWITCHD_STOP +AT_CLEANUP + + AT_SETUP([mcast - delete the port mdb when vlan configuration changed]) OVS_VSWITCHD_START([]) -- 2.41.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 2/2] mcast-snooping: Flush flood and report ports when deleting interfaces.
When a configuration change triggers an interface destruction/creation
(like for example, setting ofport_request), a port object may still be
referenced as a fport or a rport in the mdb:

bridge("br0")
-------------
 0. priority 32768
    NORMAL
     -> forwarding to mcast group port
     >> mcast flood port is unknown, dropping
     -> mcast flood port is input port, dropping
     -> forwarding to mcast flood port

Add relevant cleanup and update a unit test.

Fixes: 4fbbf8624868 ("mcast-snooping: Flush ports mdb when VLAN configuration changed.")
Signed-off-by: David Marchand
---
 lib/mcast-snooping.c    | 15 +++++++++++++++
 tests/mcast-snooping.at | 18 ++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c
index 029ca28558..34755447f8 100644
--- a/lib/mcast-snooping.c
+++ b/lib/mcast-snooping.c
@@ -948,6 +948,7 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, void *port)
 {
     struct mcast_group *g;
     struct mcast_mrouter_bundle *m;
+    struct mcast_port_bundle *p;
 
     if (!mcast_snooping_enabled(ms)) {
         return;
@@ -971,5 +972,19 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, void *port)
         }
     }
 
+    LIST_FOR_EACH_SAFE (p, node, &ms->fport_list) {
+        if (p->port == port) {
+            mcast_snooping_flush_port(p);
+            ms->need_revalidate = true;
+        }
+    }
+
+    LIST_FOR_EACH_SAFE (p, node, &ms->rport_list) {
+        if (p->port == port) {
+            mcast_snooping_flush_port(p);
+            ms->need_revalidate = true;
+        }
+    }
+
     ovs_rwlock_unlock(&ms->rwlock);
 }
diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index 21c806ef63..66bb8d0c65 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -189,6 +189,24 @@ Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e
 Datapath actions: 1,2
 ])
 
+AT_CHECK([ovs-vsctl set interface p2 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"], [0], [dnl
+Flow: udp,in_port=3,vlan_tci=0x0000,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-------------
+ 0. priority 32768
+    NORMAL
+     -> forwarding to mcast group port
+     -> mcast flood port is input port, dropping
+     -> forwarding to mcast flood port
+
+Final flow: unchanged
+Megaflow: recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
-- 
2.41.0
Re: [ovs-dev] [PATCH v8 2/9] system-dpdk: Don't require hugetlbfs.
On Fri, Oct 27, 2023 at 5:31 PM David Marchand wrote:
>
> dpdk-testpmd does not need hugetlbfs backing as we don't require
> multiprocess support in OVS unit tests.
>
> Switch to --in-memory and remove the (then unneeded) check on
> hugetlbfs presence.
>
> Acked-by: Aaron Conole
> Acked-by: Eelco Chaudron
> Signed-off-by: David Marchand
> ---
> Changes since v7:
> - reverted use of dynamic allocations and kept initial memory
>   reservation: this avoids random failures when testpmd and ovs-dpdk
>   try to allocate memory at the same time,

Just a heads up on this change.

While testing the v8 series, I hit random failures on the mtu unit tests.
testpmd would fail to allocate memory in hugepages.
I suspect this has to do with OVS allocating memory in parallel with
testpmd.

Reverting to an initial reserved memory makes the issue disappear, so I
removed this change and kept the existing behavior.
There may still be a race underneath, but at least the situation is no
worse than before.

-- 
David Marchand
[ovs-dev] [PATCH v8 8/9] system-dpdk: Rework cleanup for vhost-user client tests.
Those tests are subject to a race when a testpmd hosting the vhost-user
server is stopped and OVS has enough time to detect the vhost-user socket
drop and tries to reconnect to this socket.

In such a situation, the tests can fail as the OVS process with the
vhost-user client port complains with a warning log:

2023-09-08T13:15:18.160Z|00163|dpdk|INFO|VHOST_CONFIG: (.../005/dpdkvhostclient0) vhost peer closed
2023-09-08T13:15:18.160Z|00164|netdev_dpdk|INFO|vHost Device '.../005/dpdkvhostclient0' connection has been destroyed
2023-09-08T13:15:18.160Z|00165|dpdk|INFO|VHOST_CONFIG: (.../005/dpdkvhostclient0) vhost-user client: socket created, fd: 24
2023-09-08T13:15:18.160Z|00166|dpdk|WARN|VHOST_CONFIG: (.../005/dpdkvhostclient0) failed to connect: Connection refused
2023-09-08T13:15:18.160Z|00167|dpdk|INFO|VHOST_CONFIG: (.../005/dpdkvhostclient0) reconnecting...

Invert the order of the cleanup steps.

Signed-off-by: David Marchand
Acked-by: Eelco Chaudron
---
Changes since v6:
- added this fix for spurious failures hit by Eelco,

---
 tests/system-dpdk.at | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index fd4a4b7d73..80277b24b6 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -222,10 +222,9 @@ AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_bytes
              $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q0_good_bytes` + dnl
                 `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q1_good_bytes`))])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) recvmsg failed/d
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d
@@ -649,10 +648,9 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=9000])
 AT_CHECK([ovs-appctl dpctl/show], [], [stdout])
 AT_CHECK([grep -E 'mtu=9000' stdout], [], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d"])
 AT_CLEANUP
@@ -694,10 +692,9 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=2000])
 AT_CHECK([ovs-appctl dpctl/show], [], [stdout])
 AT_CHECK([grep -E 'mtu=2000' stdout], [], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d"])
 AT_CLEANUP
@@ -813,10 +810,9 @@ dnl Set MTU value above upper bound and check for error
 AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=9711])
 AT_CHECK([grep "dpdkvhostuserclient0: unsupported MTU 9711" ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d
 /dpdkvhostuserclient0: unsupported MTU 9711/d
@@ -859,10 +855,9 @@ dnl Set MTU value below lower bound and check for error
 AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=67])
 AT_CHECK([grep "dpdkvhostuserclient0: unsupported MTU 67" ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d
 /dpdkvhostuserclient0: unsupported MTU 67/d
--
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
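A note on why the reconnect attempt is enough to fail a test: at stop time the vswitchd log is checked for WARN-or-worse lines after the allowlisted patterns have been deleted, and "Connection refused" is not among those patterns. The sketch below imitates that filtering in plain shell. It assumes the check boils down to sed delete commands followed by printing the remaining `|WARN|` lines; the function name `remaining_warnings` and the second sample log line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for the log check done when ovs-vswitchd is stopped:
# delete allowlisted lines, then print whatever WARN lines are left.
remaining_warnings() {
    sed -n \
        -e '/VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d' \
        -e '/|WARN|/p'
}

# The first line is taken from the commit message above. The second one is an
# invented example of a warning that the allowlist does cover.
sample_log='2023-09-08T13:15:18.160Z|00166|dpdk|WARN|VHOST_CONFIG: (.../005/dpdkvhostclient0) failed to connect: Connection refused
2023-09-08T13:15:18.160Z|00168|dpdk|WARN|VHOST_CONFIG: (.../005/dpdkvhostclient0) failed to connect: No such file or directory'

# Only the "Connection refused" warning survives the filter.
printf '%s\n' "$sample_log" | remaining_warnings
```

With the cleanup order inverted, the port is deleted before testpmd goes away, so OVS has no client socket left to reconnect and the uncovered warning should not be logged in the first place.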
[ovs-dev] [PATCH v8 9/9] system-dpdk: Run traffic tests.
Integrate system-traffic.at tests as part of check-dpdk. Some tests that can't work with the userspace datapath are skipped by overriding some OVS_CHECK_* macros. ADD_VETH is implemented using the net/af_xdp DPDK driver. Signed-off-by: David Marchand Acked-by: Eelco Chaudron --- Changes since v6: - fixed some checkpatch warning, Changes since v4: - switched to net/af_xdp, this removes the tweaking needed for net/tap, and it lets existing tool relying on kernel netdevs. veth offloading still needs some tweaking, Changes since v3: - reverted --dummy-numa and opted for configuring a number of rxqs relevant to the number of NUMA sockets, Changes since v2: - added ADD_VETH_IGNORE_LOGS and moved ignored error logs to OVS_TRAFFIC_VSWITCHD_STOP, - added --no-pci to DPDK options to avoid failing the tests when running in a vm with a virtio-net device, - faked a mono numa/mono core so that OVS requests at max 2 txqs on the net/tap port, --- .ci/dpdk-build.sh| 3 +- .github/workflows/build-and-test.yml | 2 +- tests/system-dpdk-macros.at | 77 tests/system-dpdk-testsuite.at | 2 + tests/system-dpdk.at | 3 -- 5 files changed, 82 insertions(+), 5 deletions(-) diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh index 35540f0694..aa83e44643 100755 --- a/.ci/dpdk-build.sh +++ b/.ci/dpdk-build.sh @@ -38,7 +38,8 @@ function build_dpdk() # any DPDK driver. # check-dpdk unit tests requires testpmd and some net/ driver. DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd" -DPDK_OPTS="$DPDK_OPTS -Denable_drivers=net/null,net/tap,net/virtio" +enable_drivers="net/null,net/af_xdp,net/tap,net/virtio" +DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers" # Install DPDK using prefix. 
DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build" diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml index 4f62efb7c3..09654205e7 100644 --- a/.github/workflows/build-and-test.yml +++ b/.github/workflows/build-and-test.yml @@ -5,7 +5,7 @@ on: [push, pull_request] jobs: build-dpdk: env: - dependencies: gcc libnuma-dev ninja-build + dependencies: gcc libbpf-dev libnuma-dev ninja-build pkgconf CC: gcc DPDK_GIT: https://dpdk.org/git/dpdk-stable DPDK_VER: 22.11.1 diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index 7fedfd6515..dcdfa55741 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/system-dpdk-macros.at @@ -127,3 +127,80 @@ m4_define([OVS_DPDK_STOP_TESTPMD], [AT_CHECK([kill `cat testpmd.pid`]) OVS_WAIT([kill -0 `cat testpmd.pid`], [kill -9 `cat testpmd.pid`]) ]) + + +# OVS_TRAFFIC_VSWITCHD_START([vsctl-args], [vsctl-output], [dbinit-aux-args]) +# +# Creates a database and starts ovsdb-server, starts ovs-vswitchd +# connected to that database, calls ovs-vsctl to create a bridge named +# br0 with predictable settings, passing 'vsctl-args' as additional +# commands to ovs-vsctl. If 'vsctl-args' causes ovs-vsctl to provide +# output (e.g. because it includes "create" commands) then 'vsctl-output' +# specifies the expected output after filtering through uuidfilt. +# 'dbinit-aux-args' are passed as additional commands to 'ovs-vsctl init' +# before starting ovs-vswitchd. +m4_define([OVS_TRAFFIC_VSWITCHD_START], + [ + OVS_DPDK_PRE_CHECK() + OVS_WAIT_WHILE([ip link show ovs-netdev]) + dnl For functional tests, no need for DPDK PCI probing. + OVS_DPDK_START([--no-pci], [--disable-system], [$3]) + dnl Add bridges, ports, etc. 
+ OVS_WAIT_WHILE([ip link show br0]) + AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| uuidfilt])], [0], [$2]) +]) + + +# OVS_TRAFFIC_VSWITCHD_STOP([ALLOWLIST], [extra_cmds]) +# +# Gracefully stops ovs-vswitchd and ovsdb-server, checking their log files +# for messages with severity WARN or higher and signaling an error if any +# is present. The optional ALLOWLIST may contain shell-quoted "sed" +# commands to delete any warnings that are actually expected, e.g.: +# +# OVS_TRAFFIC_VSWITCHD_STOP(["/expected error/d"]) +# +# 'extra_cmds' are shell commands to be executed after OVS_VSWITCHD_STOP() is +# invoked. They can be used to perform additional cleanups such as name space +# removal. +m4_define([OVS_TRAFFIC_VSWITCHD_STOP], + [OVS_DPDK_STOP_VSWITCHD([$1]) + AT_CHECK([:; $2]) +]) + + +# Plug a veth into OVS via DPDK net/af_xdp. +m4_define([ADD_VETH], +[ AT_CHECK([ip link add $1 type veth peer name ovs-$1 || return 77]) + CONFIGURE_VETH_OFFLOADS([$1]) + AT_CHECK([ip link set $1 netns $2]) + AT_CHECK([ip link set dev ovs-$1 up]) + AT_CHECK([ovs-vsctl add-port $3 ovs-$1 -- \ +set interface ovs-$1 external-ids:iface-id="$1" -- \ +set interface ovs-$1 type=dpdk -- \ +set interface ovs-$1 options:dpd
[ovs-dev] [PATCH v8 6/9] netdev-afxdp: Postpone libbpf logging helper registration.
When using the net/af_xdp DPDK driver alongside OVS native AF_XDP support,
confusing logs are reported, like:

netdev_dpdk|INFO|Device 'net_af_xdpp0,iface=ovs-p0' attached to DPDK
dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 11 created.
dpif_netdev|INFO|There are 1 pmd threads on numa node 0
dpdk|INFO|Device with port_id=0 already stopped
dpdk(pmd-c11/id:22)|INFO|PMD thread uses DPDK lcore 1.
netdev_dpdk|WARN|Rx checksum offload is not supported on port 0
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(6) .xdp_run_config
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7) xdp_metadata
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7) xdp_metadata
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7) xdp_metadata

This comes from the fact that netdev-afxdp unconditionally registers a
helper for logging libbpf messages.
Making both net/af_xdp and netdev-afxdp work at the same time seems
difficult, so at least, ensure that netdev-afxdp won't register this
helper unless a netdev is actually allocated.

Signed-off-by: David Marchand
Acked-by: Eelco Chaudron
---
 lib/netdev-afxdp.c | 12 ++++++------
 lib/netdev-afxdp.h |  1 -
 lib/netdev-linux.c |  1 -
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index 16f26bc306..9884ccec4f 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -1195,18 +1195,18 @@ libbpf_print(enum libbpf_print_level level,
     return 0;
 }
 
-int netdev_afxdp_init(void)
-{
-    libbpf_set_print(libbpf_print);
-    return 0;
-}
-
 int
 netdev_afxdp_construct(struct netdev *netdev)
 {
+    static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
     struct netdev_linux *dev = netdev_linux_cast(netdev);
     int ret;
 
+    if (ovsthread_once_start(&once)) {
+        libbpf_set_print(libbpf_print);
+        ovsthread_once_done(&once);
+    }
+
     /* Configure common netdev-linux first. */
     ret = netdev_linux_construct(netdev);
     if (ret) {
diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h
index e91cd102d2..6c5459f6e6 100644
--- a/lib/netdev-afxdp.h
+++ b/lib/netdev-afxdp.h
@@ -47,7 +47,6 @@ struct xsk_socket_info;
 
 int netdev_afxdp_rxq_construct(struct netdev_rxq *rxq_);
 void netdev_afxdp_rxq_destruct(struct netdev_rxq *rxq_);
-int netdev_afxdp_init(void);
 int netdev_afxdp_construct(struct netdev *netdev_);
 void netdev_afxdp_destruct(struct netdev *netdev_);
 int netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int mtu);
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index cca3408797..4538cdfe63 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -3754,7 +3754,6 @@ const struct netdev_class netdev_internal_class = {
 
 #ifdef HAVE_AF_XDP
 #define NETDEV_AFXDP_CLASS_COMMON \
-    .init = netdev_afxdp_init, \
     .construct = netdev_afxdp_construct, \
     .destruct = netdev_afxdp_destruct, \
     .get_stats = netdev_afxdp_get_stats, \
--
2.41.0
[ovs-dev] [PATCH v8 7/9] system-dpdk: Remove tap interfaces from vport MTU tests.
The unit tests for changing MTU with vhost-user ports are not using those
tap interfaces.

Signed-off-by: David Marchand
---
Changes since v7:
- added this patch after getting regressions in Intel CI because of
  the next patch,

---
 tests/system-dpdk.at | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 655e25ff13..fd4a4b7d73 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -636,9 +636,7 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user client:
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' mode, using client socket" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \
-    --vdev="net_tap0,iface=tap0"])
-
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 
 dnl Check default MTU value in the datapath
@@ -683,9 +681,7 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user client:
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' mode, using client socket" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \
-    --vdev="net_tap0,iface=tap0"])
-
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 
 dnl Check MTU value in the datapath
@@ -805,8 +801,7 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=9702])
 AT_CHECK([ovs-vsctl show], [], [stdout])
 sleep 2
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \
-    --vdev="net_tap0,iface=tap0"])
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 
 
@@ -852,8 +847,7 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user client:
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' mode, using client socket" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \
-    --vdev="net_tap0,iface=tap0"])
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 
 
--
2.41.0
[ovs-dev] [PATCH v8 5/9] system-dpdk: Refactor OVS daemons helpers.
Align system-dpdk existing helpers to other common OVS helpers so they can accept some optional arguments. Introduce a OVS_DPDK_STOP_VSWITCHD wrapper around OVS_VSWITCHD_STOP to catch dpdk related logs in a centralised fashion. Signed-off-by: David Marchand Acked-by: Eelco Chaudron --- Changes since v6: - did a minor cleanup on vhost-user client test log pattern (one entry concerned vhost-user ports, not vhost-user *client* ports), --- tests/system-dpdk-macros.at | 21 - tests/system-dpdk.at| 158 +++- 2 files changed, 82 insertions(+), 97 deletions(-) diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index 35d14bee8f..7fedfd6515 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/system-dpdk-macros.at @@ -36,12 +36,13 @@ m4_define([OVS_DPDK_PRE_PHY_SKIP], # m4_define([OVS_DPDK_START], [dnl start ovs dpdk - OVS_DPDK_START_OVSDB() + OVS_DPDK_START_OVSDB($3) dnl Enable DPDK functionality AT_CHECK([ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true]) - OVS_DPDK_START_VSWITCHD($1) + OVS_DPDK_START_VSWITCHD([$1], [$2]) ]) + # OVS_DPDK_START_OVSDB() # # Create an empty database and start ovsdb-server. @@ -60,9 +61,10 @@ m4_define([OVS_DPDK_START_OVSDB], AT_CAPTURE_FILE([ovsdb-server.log]) dnl Initialize database. - AT_CHECK([ovs-vsctl --no-wait init]) + AT_CHECK([ovs-vsctl --no-wait init $1]) ]) + # OVS_DPDK_START_VSWITCHD() # # Add special configuration for dpdk-init. Start ovs-vswitchd. @@ -72,12 +74,23 @@ m4_define([OVS_DPDK_START_VSWITCHD], AT_CHECK([ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-extra="--log-level=pmd.*:error $1"]) dnl Start ovs-vswitchd. 
- AT_CHECK([ovs-vswitchd --detach --no-chdir --pidfile --log-file -vvconn -vofproto_dpif -vunixctl], [0], [stdout], [stderr]) + AT_CHECK([ovs-vswitchd $2 --detach --no-chdir --pidfile --log-file -vvconn -vofproto_dpif -vunixctl], [0], [stdout], [stderr]) AT_CAPTURE_FILE([ovs-vswitchd.log]) on_exit "kill_ovs_vswitchd `cat ovs-vswitchd.pid`" ]) +m4_define([OVS_DPDK_STOP_VSWITCHD], + [OVS_VSWITCHD_STOP([dnl +$1";/does not exist. The Open vSwitch kernel module is probably not loaded./d +/does not support MTU configuration,/d +/EAL: No \(available\|free\) .*hugepages reported/d +/Failed to enable flow control/d +/Rx checksum offload is not supported on/d +/TELEMETRY: No legacy callbacks, legacy socket not created/d"]) +]) + + # OVS_DPDK_CHECK_TESTPMD() # # Check dpdk-testpmd availability. diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index 270587e2c0..655e25ff13 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -3,15 +3,6 @@ m4_define([CONFIGURE_VETH_OFFLOADS], AT_BANNER([OVS-DPDK unit tests]) -m4_define([SYSTEM_DPDK_ALLOWED_LOGS],[ -\@does not exist. The Open vSwitch kernel module is probably not loaded.@d -\@does not support MTU configuration,@d -\@EAL: No \(available\|free\) .*hugepages reported@d -\@Failed to enable flow control@d -\@Rx checksum offload is not supported on@d -\@TELEMETRY: No legacy callbacks, legacy socket not created@d -]) - dnl CHECK_MEMPOOL_PARAM([mtu], [numa], [+line]) dnl dnl Waits for logs to indicate that the user has configured a mempool @@ -36,7 +27,7 @@ OVS_DPDK_START([--no-pci]) AT_CHECK([grep "DPDK Enabled - initializing..." 
ovs-vswitchd.log], [], [stdout]) AT_CHECK([grep "EAL" ovs-vswitchd.log], [], [stdout]) AT_CHECK([grep "DPDK Enabled - initialized" ovs-vswitchd.log], [], [stdout]) -OVS_VSWITCHD_STOP("[SYSTEM_DPDK_ALLOWED_LOGS]") +OVS_DPDK_STOP_VSWITCHD AT_CLEANUP dnl -- @@ -58,7 +49,7 @@ sleep 2 dnl Clean up AT_CHECK([ovs-vsctl del-port br10 phy0], [], [stdout], [stderr]) -OVS_VSWITCHD_STOP("[SYSTEM_DPDK_ALLOWED_LOGS]") +OVS_DPDK_STOP_VSWITCHD AT_CLEANUP dnl -- @@ -84,9 +75,8 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." ov dnl Clean up AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr]) -OVS_VSWITCHD_STOP("m4_join([], [SYSTEM_DPDK_ALLOWED_LOGS], [ -\@VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) failed to connect: No such file or directory@d -])") +OVS_DPDK_STOP_VSWITCHD(["dnl +/VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or directory/d"]) AT_CLEANUP dnl -- @@ -150,12 +140,11 @@ OVS_WAIT_UNTIL([grep "vHost Device '$OVS_RUNDIR/dpdkvhostuser0' has been removed dnl Clean up AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuser0], [], [stdout], [stderr]) -OVS_VSWITCHD_STOP("m4_join([], [SYSTEM_DPDK_ALLOWED_LOGS], [ -\@VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostuser0) recvmsg
[ovs-dev] [PATCH v8 4/9] tests: Define a macro to skip tc relying tests.
Some unit tests expect that an OVS port has an associated netdevice on
which they can hook tc.
This will not be possible when testing the userspace datapath with DPDK.
Introduce a helper (which will be overridden in system-dpdk tests) and
use it in the existing tests.

Acked-by: Aaron Conole
Signed-off-by: David Marchand
Acked-by: Eelco Chaudron
---
Changes since v4:
- as the traffic tests now use net/af_xdp, it is not required to skip
  tests relying on tcpdump anymore,

---
 tests/system-common-macros.at    | 6 ++++++
 tests/system-offloads-traffic.at | 6 +++---
 tests/system-traffic.at          | 6 +++---
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 0077a8609c..0113aae8bd 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -297,6 +297,12 @@ m4_define([OVS_START_L7],
 #
 m4_define([OFPROTO_CLEAR_DURATION_IDLE], [[sed -e 's/duration=.*s,/duration=,/g' -e 's/idle_age=[0-9]*,/idle_age=,/g']])
 
+# OVS_CHECK_TC_QDISC()
+#
+# Macro to skip tests when tc qdisc can't be applied on a OVS port.
+m4_define([OVS_CHECK_TC_QDISC],
+    [AT_SKIP_IF([test $HAVE_TC = no])])
+
 # OVS_CHECK_TUNNEL_TSO()
 #
 # Macro to be used in general tunneling tests that could be also
diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index 5ad6b4bfdf..0bedee7530 100644
--- a/tests/system-offloads-traffic.at
+++ b/tests/system-offloads-traffic.at
@@ -20,7 +20,7 @@ m4_define([OVS_CHECK_ACTIONS], [
 
 m4_define([CHECK_TC_INGRESS_PPS],
 [
-    AT_SKIP_IF([test $HAVE_TC = "no"])
+    OVS_CHECK_TC_QDISC()
     AT_CHECK([ip link add ovs_tc_pps0 type veth peer name ovs_tc_pps1 dnl
               || exit 77])
     on_exit 'ip link del ovs_tc_pps0'
@@ -95,7 +95,7 @@ AT_CLEANUP
 
 AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - offloads disabled])
 AT_KEYWORDS([ingress_policing])
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START()
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-offload=false])
 AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
@@ -118,7 +118,7 @@ AT_CLEANUP
 
 AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - offloads enabled])
 AT_KEYWORDS([ingress_policing])
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . other_config:hw-offload=true])
 AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
 ADD_NAMESPACES(at_ns0)
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 7ea4502028..a7d4ed83bd 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2321,7 +2321,7 @@ AT_CLEANUP
 
 AT_BANNER([QoS])
 AT_SETUP([QoS - basic configuration])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START()
 
 ADD_NAMESPACES(at_ns0, at_ns1)
@@ -2355,7 +2355,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
 AT_SETUP([QoS - 64bit])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 AT_SKIP_IF([test $HAVE_TCA_HTB_RATE64 = no])
 OVS_TRAFFIC_VSWITCHD_START()
 
@@ -2383,7 +2383,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
 AT_SETUP([Ingress Policing - 64-bit])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 AT_SKIP_IF([test $HAVE_TCA_POLICE_PKTRATE64 = no])
 OVS_TRAFFIC_VSWITCHD_START()
 ADD_NAMESPACES(ns0)
--
2.41.0
[ovs-dev] [PATCH v8 3/9] ci: Run DPDK tests in GitHub Actions.
Let's enhance our coverage in the CI and run DPDK system tests. A few DPDK drivers are enabled in DPDK compilation. Put DPDK build in $PATH for dpdk-testpmd to be available. sudo drops PATH= updates and -E alone does not seem to preserve this variable. Pass PATH=$PATH when running the tests, as a workaround. Since those tests are run as root, the collection of logs is updated accordingly. In GHA, only two cores are available but some test rely on testpmd using three lcores. Add a DPDK_EAL_OPTIONS environment variable and use it to map all testpmd lcores to core 1 (and leave core 0 alone for OVS main and PMD threads). Signed-off-by: David Marchand Acked-by: Aaron Conole Acked-by: Eelco Chaudron --- Changes since v1: - rebased after DPDK build has been moved out of linux-build.sh, - restored running "normal" checks in the DPDK jobs, --- .ci/dpdk-build.sh| 7 --- .ci/linux-build.sh | 15 ++- .github/workflows/build-and-test.yml | 7 --- tests/system-dpdk-macros.at | 2 +- 4 files changed, 23 insertions(+), 8 deletions(-) diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh index 02dcefef61..35540f0694 100755 --- a/.ci/dpdk-build.sh +++ b/.ci/dpdk-build.sh @@ -35,9 +35,10 @@ function build_dpdk() DPDK_OPTS="$DPDK_OPTS -Ddeveloper_mode=disabled" # OVS compilation and "normal" unit tests (run in the CI) do not depend on -# any DPDK driver being present. -# We can disable all drivers to save compilation time. -DPDK_OPTS="$DPDK_OPTS -Ddisable_drivers=*/*" +# any DPDK driver. +# check-dpdk unit tests requires testpmd and some net/ driver. +DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd" +DPDK_OPTS="$DPDK_OPTS -Denable_drivers=net/null,net/tap,net/virtio" # Install DPDK using prefix. DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build" diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index 8227a57487..aa2ecc5050 100755 --- a/.ci/linux-build.sh +++ b/.ci/linux-build.sh @@ -22,6 +22,9 @@ function install_dpdk() # Export the following path for pkg-config to find the .pc file. 
export PKG_CONFIG_PATH=$DPDK_LIB/pkgconfig/:$PKG_CONFIG_PATH +# Expose dpdk binaries. +export PATH=$(pwd)/dpdk-dir/build/bin:$PATH + if [ ! -f "${VERSION_FILE}" ]; then echo "Could not find DPDK in $(pwd)/dpdk-dir" return 1 @@ -113,7 +116,7 @@ fi OPTS="${EXTRA_OPTS} ${OPTS} $*" -if [ "$TESTSUITE" ]; then +if [ "$TESTSUITE" = 'test' ]; then # 'distcheck' will reconfigure with required options. # Now we only need to prepare the Makefile without sparse-wrapped CC. configure_ovs @@ -123,6 +126,16 @@ if [ "$TESTSUITE" ]; then TESTSUITEFLAGS=-j4 RECHECK=yes else build_ovs +for testsuite in $TESTSUITE; do +run_as_root= +if [ "${testsuite##*dpdk}" != "$testsuite" ]; then +sudo sh -c 'echo 1024 > /proc/sys/vm/nr_hugepages' || true +[ "$(cat /proc/sys/vm/nr_hugepages)" = '1024' ] +export DPDK_EAL_OPTIONS="--lcores 0@1,1@1,2@1" +run_as_root="sudo -E PATH=$PATH" +fi +$run_as_root make $testsuite TESTSUITEFLAGS=-j4 RECHECK=yes +done fi exit 0 diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml index bc5494e863..4f62efb7c3 100644 --- a/.github/workflows/build-and-test.yml +++ b/.github/workflows/build-and-test.yml @@ -123,10 +123,10 @@ jobs: opts: --enable-shared - compiler: gcc -testsuite:test +testsuite:check check-dpdk dpdk: dpdk - compiler: clang -testsuite:test +testsuite:check check-dpdk dpdk: dpdk - compiler: gcc @@ -213,7 +213,8 @@ jobs: mkdir logs cp config.log ./logs/ cp -r ./*/_build/sub/tests/testsuite.* ./logs/ || true -tar -czvf logs.tgz logs/ +sudo cp -r ./tests/*testsuite.* ./logs/ || true +sudo tar -czvf logs.tgz logs/ - name: upload logs on failure if: failure() || cancelled() diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index a176a57a4b..35d14bee8f 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/system-dpdk-macros.at @@ -94,7 +94,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD], m4_define([OVS_DPDK_START_TESTPMD], [AT_CHECK([lscpu], [], [stdout]) AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; 
while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE]) - eal_options="--in-memory --socket-mem="$(cat NUMA_NODE)" --single-file-segments --no-pci" + eal_options=&qu
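For readers puzzled by the `${testsuite##*dpdk}` test in the linux-build.sh hunk of this patch: the expansion strips the longest prefix ending in "dpdk", so the result differs from the original string only when the target name contains "dpdk". The sketch below is a minimal illustration under that reading. The function name `needs_root` is an assumption and not part of the patch, and the loop only prints the command it would run instead of calling sudo or make.

```shell
#!/bin/sh
# Hypothetical helper mirroring the check in the patch: succeed when the
# make target name contains "dpdk".
needs_root() {
    [ "${1##*dpdk}" != "$1" ]
}

# Print the command that would be run for each target instead of running it.
for testsuite in check check-dpdk; do
    run_as_root=
    if needs_root "$testsuite"; then
        # sudo does not keep the caller's PATH, so it is passed explicitly.
        run_as_root="sudo -E PATH=\$PATH"
    fi
    echo "${run_as_root:+$run_as_root }make $testsuite"
done
```

Only the check-dpdk target picks up the sudo prefix; plain check keeps running as the regular user.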
[ovs-dev] [PATCH v8 2/9] system-dpdk: Don't require hugetlbfs.
dpdk-testpmd does not need hugetlbfs backing as we don't require
multiprocess support in OVS unit tests.
Switch to --in-memory and remove the (then unneeded) check on hugetlbfs
presence.

Acked-by: Aaron Conole
Acked-by: Eelco Chaudron
Signed-off-by: David Marchand
---
Changes since v7:
- reverted use of dynamic allocations and kept initial memory
  reservation: this avoids random failures when testpmd and ovs-dpdk
  try to allocate memory at the same time,

---
 tests/system-dpdk-macros.at | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 2cfd26d840..a176a57a4b 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -7,9 +7,6 @@ m4_define([OVS_DPDK_PRE_CHECK],
   [dnl Check Hugepages
    AT_CHECK([cat /proc/meminfo], [], [stdout])
    AT_SKIP_IF([grep -E 'HugePages_Free: *0' stdout], [], [stdout])
-   AT_CHECK([mount], [], [stdout])
-   AT_CHECK([grep 'hugetlbfs' stdout], [], [stdout], [])
-
 ])
 
 
@@ -97,7 +94,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
 m4_define([OVS_DPDK_START_TESTPMD],
   [AT_CHECK([lscpu], [], [stdout])
    AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE])
-   eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 --single-file-segments --no-pci"
+   eal_options="--in-memory --socket-mem="$(cat NUMA_NODE)" --single-file-segments --no-pci"
    options="$1"
    test "$options" != "${options%% -- *}" || options="$options -- "
    eal_options="$eal_options ${options%% -- *}"
--
2.41.0
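For readers unfamiliar with the awk one-liner kept as context in this patch: it turns the NUMA node count reported by lscpu into the comma-separated value handed to --socket-mem, one "512" entry per node. The sketch below feeds canned lscpu-style lines rather than calling lscpu, and the function name `socket_mem` is hypothetical. Plain `$3` is used here; the `$(3)` spelling in the .at file is presumably there to keep m4 from treating `$3` as a macro argument.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk expression from OVS_DPDK_START_TESTPMD.
# Reads lscpu-style output on stdin and prints one "512" per NUMA node.
socket_mem() {
    grep "NUMA node(s)" |
        awk '{c=1; while (c++<$3) {printf "512,"}; print "512"}'
}

# Sample inputs shaped like the relevant lscpu line.
printf 'NUMA node(s):          1\n' | socket_mem
printf 'NUMA node(s):          2\n' | socket_mem
```

The first call prints `512` and the second prints `512,512`, which is the initial memory reservation this revision keeps in place of dynamic allocation.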
[ovs-dev] [PATCH v8 1/9] system-dpdk: Introduce helpers for testpmd.
Rather than copy/paste everywhere, introduce helpers to control testpmd runs. Rely on --stats-period (which outputs port stats every n seconds) so that testpmd keeps running without expecting any user input. Acked-by: Aaron Conole Acked-by: Eelco Chaudron Signed-off-by: David Marchand --- Changes since v7: - fixed -- detection in arguments passed to OVS_DPDK_START_TESTPMD, - wrote the testpmd command to a debug file, Changes since v1: - fixed OVS_DPDK_START_TESTPMD passed arguments evaluation:: $@ -> $1, --- tests/system-dpdk-macros.at | 38 + tests/system-dpdk.at| 103 +--- 2 files changed, 62 insertions(+), 79 deletions(-) diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index 3920f08a5e..2cfd26d840 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/system-dpdk-macros.at @@ -79,3 +79,41 @@ m4_define([OVS_DPDK_START_VSWITCHD], AT_CAPTURE_FILE([ovs-vswitchd.log]) on_exit "kill_ovs_vswitchd `cat ovs-vswitchd.pid`" ]) + + +# OVS_DPDK_CHECK_TESTPMD() +# +# Check dpdk-testpmd availability. +# +m4_define([OVS_DPDK_CHECK_TESTPMD], + [AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null]) +]) + + +# OVS_DPDK_START_TESTPMD() +# +# Start dpdk-testpmd in background. +# +m4_define([OVS_DPDK_START_TESTPMD], + [AT_CHECK([lscpu], [], [stdout]) + AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE]) + eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 --single-file-segments --no-pci" + options="$1" + test "$options" != "${options%% -- *}" || options="$options -- " + eal_options="$eal_options ${options%% -- *}" + testpmd_options="-a --stats-period 2 ${options#* -- }" + echo "dpdk-testpmd $eal_options -- $testpmd_options" >testpmd.cmd + dpdk-testpmd $eal_options -- $testpmd_options >testpmd.log 2>&1 & \ + echo $! > testpmd.pid + on_exit "kill -9 `cat testpmd.pid`" +]) + + +# OVS_DPDK_STOP_TESTPMD() +# +# Stop background dpdk-testpmd. 
+# +m4_define([OVS_DPDK_STOP_TESTPMD], + [AT_CHECK([kill `cat testpmd.pid`]) + OVS_WAIT([kill -0 `cat testpmd.pid`], [kill -9 `cat testpmd.pid`]) +]) diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index 0f58e85742..270587e2c0 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -97,13 +97,9 @@ dnl Ping vhost-user port AT_SETUP([OVS-DPDK - ping vhost-user ports]) AT_KEYWORDS([dpdk]) OVS_DPDK_PRE_CHECK() -AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null]) +OVS_DPDK_CHECK_TESTPMD() OVS_DPDK_START([--no-pci]) -dnl Find number of sockets -AT_CHECK([lscpu], [], [stdout]) -AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE]) - dnl Add userspace bridge and attach it to OVS AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev]) AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuser0 -- set Interface dpdkvhostuser0 \ @@ -125,12 +121,8 @@ ADD_NAMESPACES(ns1, ns2) dnl Add veth device ADD_VETH(tap1, ns2, br10, "172.31.110.12/24") -dnl Execute testpmd in background -on_exit "pkill -f -x -9 'tail -f /dev/null'" -tail -f /dev/null | dpdk-testpmd --socket-mem="$(cat NUMA_NODE)" --no-pci\ - --vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostuser0" \ - --vdev="net_tap0,iface=tap0" --file-prefix page0 \ - --single-file-segments -- -a >$OVS_RUNDIR/testpmd-dpdkvhostuser0.log 2>&1 & +OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostuser0" \ +--vdev="net_tap0,iface=tap0"]) OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log]) OVS_WAIT_UNTIL([ip link show dev tap0 | grep -qw LOWER_UP]) @@ -151,8 +143,7 @@ AT_CHECK([ip netns exec ns2 ip link show], [], [stdout], [stderr]) AT_CHECK([ip netns exec ns1 ping -c 4 -I tap0 172.31.110.12], [], [stdout], [stderr]) -dnl Clean up the testpmd now -pkill -f -x -9 'tail -f /dev/null' +OVS_DPDK_STOP_TESTPMD() dnl Wait for vhost-user handling the socket disconnect. 
OVS_WAIT_UNTIL([grep "vHost Device '$OVS_RUNDIR/dpdkvhostuser0' has been removed" ovs-vswitchd.log]) @@ -173,13 +164,9 @@ dnl Ping vhost-user-client port AT_SETUP([OVS-DPDK - ping vhost-user-client ports]) AT_KEYWORDS([dpdk]) OVS_DPDK_PRE_CHECK() -AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null]) +OVS_DPDK_CHECK_TESTPMD() OVS_DPDK_START([--no-pci]) -dnl Find number of sockets -AT_CHECK([lscpu], [], [stdout]) -AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' >
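The "-- detection" mentioned in the v7 changelog of this patch is the pair of parameter expansions in OVS_DPDK_START_TESTPMD: everything before the first " -- " in $1 goes to the EAL options, everything after it to the testpmd options, and a separator is appended when the caller passed none. The sketch below wraps the same expansions in a function so they can be tried in isolation. The function name `split_testpmd_args`, the variables `eal_part` and `testpmd_part`, and the sample arguments are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical function wrapping the splitting logic from
# OVS_DPDK_START_TESTPMD. Sets eal_part and testpmd_part.
split_testpmd_args() {
    options="$1"
    # No " -- " separator present: append one so both expansions below work.
    test "$options" != "${options%% -- *}" || options="$options -- "
    eal_part="${options%% -- *}"      # everything before the first " -- "
    testpmd_part="${options#* -- }"   # everything after the first " -- "
}

# Sample arguments, with and without a separator.
split_testpmd_args '--vdev=net_null0 -- --nb-cores=2'
echo "EAL: [$eal_part] testpmd: [$testpmd_part]"

split_testpmd_args '--vdev=net_null0'
echo "EAL: [$eal_part] testpmd: [$testpmd_part]"
```

In the second call the testpmd part comes out empty, so the macro would only pass its own defaults (-a --stats-period 2) after the separator.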
Re: [ovs-dev] [PATCH v7 8/8] system-dpdk: Run traffic tests.
On Fri, Oct 27, 2023 at 1:45 PM Eelco Chaudron wrote:
> On 23 Oct 2023, at 10:18, David Marchand wrote:
>
> > Integrate system-traffic.at tests as part of check-dpdk.
> >
> > Some tests that can't work with the userspace datapath are skipped by
> > overriding some OVS_CHECK_* macros.
> >
> > ADD_VETH is implemented using the net/af_xdp DPDK driver.
> >
> > Signed-off-by: David Marchand
>
> This patch looks as good as it did last time, so
>
> Acked-by: Eelco Chaudron

Thanks.

> I assume you will do one more revision based on some feedback on the first
> patch in the series.

Yes, I will send one more revision.
Apart from the comments on patch1, I have another issue to handle.
We hit some random yet frequent issues because of the previous patch
that reorders del-port and testpmd shutdown: this reordering opens a
little window during which some packets may be received by a pmd while
the port is deleted (see thread
https://mail.openvswitch.org/pipermail/ovs-dev/2023-October/408956.html).

--
David Marchand
Re: [ovs-dev] [PATCH] readthedocs: Use dirhtml builder.
On Thu, Oct 26, 2023 at 7:54 PM Ilya Maximets wrote:
>
> We used this builder before, but from the project configuration
> on the website. ReadTheDocs doesn't allow to change it there
> anymore and it doesn't allow to see the full name of the previously
> used builder (!!), so I failed to migrate it to the config file.
>
> The result is that older link like:
>   https://docs.openvswitch.org/en/latest/howto/dpdk/
> Now require .html:
>   https://docs.openvswitch.org/en/latest/howto/dpdk.html
>
> Fixing now by switching the builder back.
>
> Fixes: e388bd73b70d ("readthedocs: Add the configuration file.")
> Reported-by: Antonin Bas
> Reported-by: David Marchand
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/310
> Signed-off-by: Ilya Maximets
> ---
>
> The version of the docs with the change applied can be
> temporarily seen here:
>   https://igsilya-ovs.readthedocs.io/en/latest/
>
>  .readthedocs.yaml | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/.readthedocs.yaml b/.readthedocs.yaml
> index e481e64f1..7d505150e 100644
> --- a/.readthedocs.yaml
> +++ b/.readthedocs.yaml
> @@ -14,6 +14,7 @@ build:
>  # Build documentation in the "Documentation/" directory with Sphinx.
>  sphinx:
>    configuration: Documentation/conf.py
> +  builder: "dirhtml"
>
>  # Build all formats: HTML, PDF, ePub.
>  formats: all
> --
> 2.41.0
>

Reviewed-by: David Marchand

-- 
David Marchand
Re: [ovs-dev] [PATCH v7 1/8] system-dpdk: Introduce helpers for testpmd.
On Thu, Oct 26, 2023 at 10:10 AM Frode Nordahl wrote:
>
> On Mon, Oct 23, 2023 at 10:19 AM David Marchand wrote:
> >
> > Rather than copy/paste everywhere, introduce helpers to control
> > testpmd runs.
> > Rely on --stats-period (which outputs port stats every n seconds) so that
> > testpmd keeps running without expecting any user input.
> >
> > Signed-off-by: David Marchand
> > Acked-by: Aaron Conole
> > Acked-by: Eelco Chaudron
> > ---
> > Changes since v1:
> > - fixed OVS_DPDK_START_TESTPMD passed arguments evaluation:: $@ -> $1,
>
> Thanks for working on this, is there a plan to backport this?

I did not test extensively with earlier versions of DPDK, but I would
expect it to work.
I am all for getting those unit tests run in previous branches.

>
> It appears that there is currently an issue with dpdk-testpmd that
> makes it exit immediately when run in non-interactive mode [0], while
> this is an upstream DPDK bug, it does affect the system-dpdk testsuite
> for released versions of OVS. I suspect this DPDK commit broke it [1].
>
> 0: https://bugs.launchpad.net/bugs/2040097
> 1: https://github.com/DPDK/dpdk/commit/0fd1386c30c3ad9365d7fdd2829bf7cb2e1b9dff

Yes, [1] changed testpmd behavior.
The issue seems related to the tap driver, as I don't reproduce the
early quit when using some other virtual driver.
I opened an upstream bug: https://bugs.dpdk.org/show_bug.cgi?id=1305.

-- 
David Marchand
Re: [ovs-dev] [PATCH v7 1/8] system-dpdk: Introduce helpers for testpmd.
On Wed, Oct 25, 2023 at 2:50 PM Aaron Conole wrote:
>
> David Marchand writes:
>
> > On Mon, Oct 23, 2023 at 10:20 AM David Marchand wrote:
> >> +# OVS_DPDK_CHECK_TESTPMD()
> >> +#
> >> +# Check dpdk-testpmd availability.
> >> +#
> >> +m4_define([OVS_DPDK_CHECK_TESTPMD],
> >> +  [AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
> >> +])
> >> +
> >> +
> >> +# OVS_DPDK_START_TESTPMD()
> >> +#
> >> +# Start dpdk-testpmd in background.
> >> +#
> >> +m4_define([OVS_DPDK_START_TESTPMD],
> >> +  [AT_CHECK([lscpu], [], [stdout])
> >> +   AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE])
> >> +   eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 --single-file-segments --no-pci"
> >> +   options="$1"
> >> +   [ "$options" != "${options%% -- *}" ] || options="$options -- "
> >
> > I realised, looking at a generated dpdk testsuite file (while trying
> > to understand a Intel CI failure), that this syntax above is wrong.
> >
> > It is not a big problem, since testpmd (/getopt) does not complain
> > about such a trailing --.
> > Yet, better to avoid [] and instead use a "if test ...; then ...; fi"
> > construct.
>
> I think we need to use [[...]] to get the behavior you intend under m4,
> but also we can use test (since '[' is usually either an alias or a
> reimplementation of 'test').

I don't mind.
I see both uses of "test " and some [[]] in tests/.
But this file only used test so far... so I would tend to go with it.

-- 
David Marchand
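For reference, a minimal sketch of the "if test ...; then ...; fi" form discussed in this message, written as plain shell outside of m4. The helper name ensure_app_separator is hypothetical and not part of the patch; it only illustrates the intent of the quoted line, which is to append a trailing " -- " when the options do not already contain one. Since m4 uses [ and ] as quote characters in the testsuite, spelling the condition with test keeps it intact in the generated file.

```shell
# Hypothetical helper (not from the patch): make sure the testpmd options
# contain the " -- " separator between EAL and application arguments.
ensure_app_separator() {
    options="$1"
    # ${options%% -- *} drops everything from the first " -- " onward, so
    # the two strings are equal only when no separator is present.
    if test "$options" = "${options%% -- *}"; then
        options="$options -- "
    fi
    printf '%s\n' "$options"
}

ensure_app_separator '--vdev=net_tap0,iface=tap0'
ensure_app_separator '--vdev=net_tap0,iface=tap0 -- -a'
```

The first call gains the separator; the second already has one and is printed unchanged.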
Re: [ovs-dev] [ovs-build] |fail| pw1853561 [ovs-dev, v7, 8/8] system-dpdk: Run traffic tests.
On Wed, Oct 25, 2023 at 1:18 PM Ilya Maximets wrote:
> On 10/25/23 12:09, David Marchand wrote:
> >>> 2023-10-23T15:02:13.756Z|00082|bridge|INFO|bridge br10: deleted interface dpdkvhostuserclient0 on port 1
> >>> 2023-10-23T15:02:13.756Z|00083|dpif_netdev|INFO|PMD thread on numa_id: 1, core id: 88 destroyed.
> >>> 2023-10-23T15:02:13.772Z|2|dpdk(pmd-c88/id:103)|INFO|PMD thread released DPDK lcore 2.
> >>> 2023-10-23T15:02:13.778Z|00084|dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 21 destroyed.
> >>> 2023-10-23T15:02:13.778Z|2|ofproto_dpif_xlate(pmd-c21/id:102)|WARN|received packet on unknown port 1 on bridge br10 while processing icmp6,in_port=1,vlan_tci=0x,dl_src=ca:76:e9:ff:a2:09,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::c876:e9ff:feff:a209,ipv6_dst=ff02::2,ipv6_label=0x0,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=133,icmp_code=0
> >>> 2023-10-23T15:02:13.791Z|3|dpdk(pmd-c21/id:102)|INFO|PMD thread released DPDK lcore 1.
> >>> 2023-10-23T15:02:13.801Z|00085|dpdk|INFO|VHOST_CONFIG: (/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0) free connfd 95
> >>> 2023-10-23T15:02:13.801Z|00086|netdev_dpdk|INFO|vHost Device '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' not found
> >
> > I am a bit puzzled at this report.
> > It is similar to
> > https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/396325.html.
> >
> > I understand this shows a race in OVS cleaning up sequence, with some
> > packet (triggering an upcall) received by a pmd on a port that is not
> > referenced in the ofproto bridge anymore.
> > Why did it show up again? This is probably due to my patch 7 in the v7
> > series which lets testpmd sends packets while deleting the vhu port.
> >
> > The easiest (laziest?) for me is probably to drop this patch 7 and
> > instead waive warnings about a vhu socket reconnection...
>
> The packets are coming from the kernel interface on the other side
> of testpmd, right? In that case, can we just bring that interface
> down before removing OVS port to prevent random ipv6 traffic from
> flowing around? Another similar option might be to set admin state
> DOWN on the OVS side for the vhost-user port.

Bringing the tap iface down should do the job, yes.

But now I wonder why we need such a setup with testpmd + a tap in the
MTU unit tests: no packet is being actively injected by the unit tests
themselves.
I get that testpmd will make sure that the vhost-user client port is
running in a "nominal" situation when changing the MTU, so ok to keep it.
But can we remove the tap iface from testpmd (for those MTU tests)?

>
> > But I find it strange that there is a window in which OVS pmd threads
> > still poll packets (and complain) while the ports are being removed.
>
> OpenFlow ports are getting removed before their backing datapath ports,
> so there is always a small window where packets can arrive on datapath
> ports that do not have associated OpenFlow port numbers anymore.
> Reversing this might not be an option due to reference counting, but I
> don't remember exactly.
>
> Same applies to upcalls in kernel datapath, because packets can be queued
> for upcall while the port is getting removed. And it's even trickier to
> fix that for a kernel, because it's done fully asynchronously.

Ok, thanks for the context / explanations.

-- 
David Marchand
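The ordering discussed in this message (bring the tap interface down, then delete the OVS port) could be sketched in shell as follows. The helper names run and teardown_vhost_port and the DRY_RUN switch are hypothetical and exist only for illustration; the interface, bridge and port names are taken from the tests and logs in this thread. Whether this fully closes the window in the testsuite is not verified here.

```shell
# Hypothetical helpers, not part of the testsuite.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if test "${DRY_RUN:-0}" = 1; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

teardown_vhost_port() {
    tap="$1"; bridge="$2"; port="$3"
    # Stop kernel-originated traffic (e.g. IPv6 router solicitations)
    # before the OpenFlow port disappears.
    run ip link set dev "$tap" down
    run ovs-vsctl del-port "$bridge" "$port"
}

DRY_RUN=1
teardown_vhost_port tap0 br10 dpdkvhostuserclient0
```

In dry-run mode this only prints the two commands in the order they would run, which makes the sequencing easy to check without root privileges.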
Re: [ovs-dev] [ovs-build] |fail| pw1853561 [ovs-dev, v7, 8/8] system-dpdk: Run traffic tests.
Forwarding to dev@

On Mon, Oct 23, 2023 at 6:05 PM wrote:
> > 2023-10-23T15:02:12.622Z|00063|dpdk|INFO|VHOST_CONFIG: (/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0) virtio is now ready for processing.
> > 2023-10-23T15:02:12.622Z|00064|netdev_dpdk|INFO|vHost Device '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' has been added on numa node 0
> > 2023-10-23T15:02:13.592Z|00074|dpif_netdev|INFO|Performing pmd to rx queue assignment using cycles algorithm.
> > 2023-10-23T15:02:13.592Z|00075|dpif_netdev|INFO|Core 21 on numa node 0 assigned port 'dpdkvhostuserclient0' rx queue 0 (measured processing cycles 0).
> > 2023-10-23T15:02:13.592Z|1|netdev_dpdk(ovs_vhost2)|INFO|State of queue 0 ( tx_qid 0 ) of vhost device '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' changed to 'enabled'
> > 2023-10-23T15:02:13.592Z|2|netdev_dpdk(ovs_vhost2)|INFO|State of queue 1 ( rx_qid 0 ) of vhost device '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' changed to 'enabled'
> > 2023-10-23T15:02:13.595Z|00076|unixctl|DBG|received request dpctl/show[], id=0
> > 2023-10-23T15:02:13.596Z|00077|unixctl|DBG|replying with success, id=0: "netdev@ovs-netdev:
> >   lookups: hit:0 missed:2 lost:0
> >   flows: 2
> >   port 0: ovs-netdev (tap)
> >   port 1: br10 (tap)
> >   port 2: dpdkvhostuserclient0 (dpdkvhostuserclient: configured_rx_queues=1, configured_tx_queues=1, mtu=9000, requested_rx_queues=1, requested_tx_queues=1)
> > "
> > 2023-10-23T15:02:13.715Z|00078|dpif_netdev|INFO|Performing pmd to rx queue assignment using cycles algorithm.
> > 2023-10-23T15:02:13.715Z|00079|dpif_netdev|INFO|Core 21 on numa node 0 assigned port 'dpdkvhostuserclient0' rx queue 0 (measured processing cycles 0).
> > 2023-10-23T15:02:13.728Z|00080|unixctl|DBG|received request dpctl/show[], id=0
> > 2023-10-23T15:02:13.728Z|00081|unixctl|DBG|replying with success, id=0: "netdev@ovs-netdev:
> >   lookups: hit:0 missed:2 lost:0
> >   flows: 2
> >   port 0: ovs-netdev (tap)
> >   port 1: br10 (tap)
> >   port 2: dpdkvhostuserclient0 (dpdkvhostuserclient: configured_rx_queues=1, configured_tx_queues=1, mtu=2000, requested_rx_queues=1, requested_tx_queues=1)
> > "
> > 2023-10-23T15:02:13.756Z|00082|bridge|INFO|bridge br10: deleted interface dpdkvhostuserclient0 on port 1
> > 2023-10-23T15:02:13.756Z|00083|dpif_netdev|INFO|PMD thread on numa_id: 1, core id: 88 destroyed.
> > 2023-10-23T15:02:13.772Z|2|dpdk(pmd-c88/id:103)|INFO|PMD thread released DPDK lcore 2.
> > 2023-10-23T15:02:13.778Z|00084|dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 21 destroyed.
> > 2023-10-23T15:02:13.778Z|2|ofproto_dpif_xlate(pmd-c21/id:102)|WARN|received packet on unknown port 1 on bridge br10 while processing icmp6,in_port=1,vlan_tci=0x,dl_src=ca:76:e9:ff:a2:09,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::c876:e9ff:feff:a209,ipv6_dst=ff02::2,ipv6_label=0x0,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=133,icmp_code=0
> > 2023-10-23T15:02:13.791Z|3|dpdk(pmd-c21/id:102)|INFO|PMD thread released DPDK lcore 1.
> > 2023-10-23T15:02:13.801Z|00085|dpdk|INFO|VHOST_CONFIG: (/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0) free connfd 95
> > 2023-10-23T15:02:13.801Z|00086|netdev_dpdk|INFO|vHost Device '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' not found

I am a bit puzzled at this report.
It is similar to
https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/396325.html.

I understand this shows a race in the OVS cleanup sequence, with some
packet (triggering an upcall) received by a pmd on a port that is not
referenced in the ofproto bridge anymore.
Why did it show up again? This is probably due to my patch 7 in the v7
series, which lets testpmd send packets while deleting the vhu port.

The easiest (laziest?) for me is probably to drop this patch 7 and
instead waive warnings about a vhu socket reconnection...

But I find it strange that there is a window in which OVS pmd threads
still poll packets (and complain) while the ports are being removed.

Opinions?

-- 
David Marchand