Re: [ovs-dev] [PATCH v3 6/6] netdev-dpdk: Refactor tunnel checksum offloading.

2024-05-16 Thread David Marchand
On Wed, May 15, 2024 at 2:11 PM Kevin Traynor  wrote:
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 1dad2ef833..31dd6f1d5a 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2584,6 +2584,9 @@ static bool
> >  netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >  {
> >  struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
> > +void *l2;
> > +void *l3;
> > +void *l4;
> >
> >  const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
> >   RTE_MBUF_F_TX_L4_MASK |
> > @@ -2613,11 +2616,6 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >  return true;
> >  }
> >
> > -ovs_assert(dp_packet_l4(pkt));
> > -
> > -/* If packet is vxlan or geneve tunnel packet, calculate outer
> > - * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
> > - * before. */
> >  const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
> >  if (OVS_UNLIKELY(tunnel_type &&
> >   tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE &&
> > @@ -2635,6 +2633,11 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >   (char *) dp_packet_eth(pkt);
> >  mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
> >   (char *) dp_packet_l3(pkt);
> > +
>
> > +/* Inner L2 length must account for the tunnel header length. */
> > +l2 = dp_packet_l4(pkt);
>
> Code looks ok to me, but it's tricky, and the L2 settings with inner
> requests are a bit unintuitive without a notepad and thinking backwards
> from the driver perspective. Not sure there is much that can be done to
> mitigate that here, other than the comment you added.

Unfortunately, I don't have a better idea.
It was already unintuitive before this patch, but to make it worse,
the logic was split across lib/netdev-dpdk.c and
lib/netdev-native-tnl.c.

For example, this comment in dp_packet_tnl_ol_process() is DPDK
specific:
/* Attention please, tunnel inner l2 len is consist of udp header
 * len and tunnel header len and inner l2 len. */
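
To make that comment concrete, here is a minimal sketch of the arithmetic.
It assumes a UDP-based tunnel (VXLAN, or Geneve with an optional options
block) and an untagged inner Ethernet frame. The constants and the helper
name tunnel_inner_l2_len() are illustrative only and are not OVS code.

```c
#include <stddef.h>

/* Header sizes assumed for this illustration. */
enum {
    UDP_HDR_LEN = 8,     /* Outer UDP header. */
    VXLAN_HDR_LEN = 8,   /* VXLAN header. */
    GENEVE_HDR_LEN = 8,  /* Geneve base header, without options. */
    ETH_HDR_LEN = 14     /* Inner Ethernet header, no VLAN tag. */
};

/* Hypothetical helper: the value a driver expects in mbuf->l2_len when
 * both outer and inner offloads are requested on a tunneled packet.
 * It spans everything between the end of the outer L3 header and the
 * start of the inner L3 header. */
static size_t
tunnel_inner_l2_len(size_t tnl_hdr_len, size_t tnl_opts_len)
{
    return UDP_HDR_LEN + tnl_hdr_len + tnl_opts_len + ETH_HDR_LEN;
}
```

Under these assumptions, tunnel_inner_l2_len(VXLAN_HDR_LEN, 0) gives 30
bytes, and a Geneve packet carrying 8 bytes of options gives 38.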


>
> Did you manage to test to confirm they're working as expected ?

In general, I tested the series with CX6, E810 and ixgbe, with ipv4
traffic, ipv4 traffic tunneled in ipv4/ipv6 vxlan and ipv4 traffic
tunneled in ipv4/ipv6 geneve.
But I am not sure I covered every possible combination.

Specifically for this case you point at (outer + inner offloads), I
tested CX6 with IPv4/IPv6 VxLAN and Geneve (for which I have traces in
my bash history).
With E810, I remember testing the same with the DPDK fixes, but I
don't have a trace of it.

I'll double check before sending the next revision.

>
> > +l3 = dp_packet_inner_l3(pkt);
> > +l4 = dp_packet_inner_l4(pkt);
>
> see below
>
> >  } else {
> >  /* If no outer offloading is requested, clear outer marks. */
> >  mbuf->ol_flags &= ~all_outer_marks;
> > @@ -2642,8 +2645,9 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >  mbuf->outer_l3_len = 0;
> >
> >  /* Skip outer headers. */
> > -mbuf->l2_len += (char *) dp_packet_l4(pkt) -
> > -(char *) dp_packet_eth(pkt);
> > +l2 = dp_packet_eth(pkt);
>
> > +l3 = dp_packet_inner_l3(pkt);
> > +l4 = dp_packet_inner_l4(pkt);
>
> You could move these outside the inner (pardon the pun) if else, but I
> could understand if you prefer to set l2/l3/l4 together for better
> readability ?

Well, as you noted, this code is not trivial.
I preferred to have all 3 pointers grouped, with a comment relating to
the group.


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 5/6] netdev-dpdk: Use guest TSO segmentation size hint.

2024-05-16 Thread David Marchand
On Wed, May 15, 2024 at 2:09 PM Kevin Traynor  wrote:
>
> On 19/04/2024 15:06, David Marchand wrote:
> > In a typical setup like:
> > guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B
> >
> > TSO packets from guest A are segmented against the OVS A physical port
> > mtu adjusted by the vxlan tunnel header size, regardless of guest A
> > interface mtu.
> >
> > As an example, let's say guest A and guest B mtu are set to 1500 bytes.
> > OVS A and OVS B physical ports mtu are set to 1600 bytes.
> > Guest A will request TCP segmentation for 1448 bytes segments.
> > On the other hand, OVS A will request 1498 bytes segments to the HW.
> > This results in OVS B dropping packets because decapsulated packets
> > are larger than the vhost-user port (serving guest B) mtu.
> >
> > 2024-04-17T14:13:01.239Z|2|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
> >   Too big size 1564 max_packet_len 1518
> >
> > vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
> > Use it as a hint.
> >
> > This may result in segments (on the wire) slightly shorter than the
> > optimal size.
> >
> > Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
> > Signed-off-by: David Marchand 
> > ---
> > Note:
> > As we trust the guest with this change, should we put a lower limit on
> > mbuf->tso_segsz?
> >
>
> There are some checks I looked at (e.g [0]), but it could be checked
> here for an earlier drop i suppose...additional comment below
>
> [0]
> https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3754

I guess you meant
https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3818
And same in v23.11, there are checks at the prepare stage:
https://git.dpdk.org/dpdk-stable/tree/drivers/net/ice/ice_rxtx.c?h=23.11#n3681

I had forgotten about those checks.
DPDK does not expose a per-driver limit, so the simplest option for OVS
is to trust the driver checks.


>
> > ---
> >  lib/netdev-dpdk.c | 11 ---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 661269e4b6..1dad2ef833 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >
> >  if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
> >  struct tcp_header *th = dp_packet_l4(pkt);
> > +uint16_t link_tso_segsz;
> >  int hdr_len;
> >
> >  if (tunnel_type) {
> > -mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > -  mbuf->l4_len - mbuf->outer_l3_len;
> > +link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > + mbuf->l4_len - mbuf->outer_l3_len;
> >  } else {
> >  mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
> > -mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +}
> > +
> > +if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {
>
> It seems like something is not right if the flag is set but tso_segsz is
> 0. It is set by vhost lib from gso_size, but I don't see a validation
> there either.

At the time I added a check on the 0 value, I thought there was a case
where RTE_MBUF_F_TX_TCP_SEG could be set with no segsz value.
But as you mention, all setters of this flag (either in vhost or in
OVS) set a segsz too.

So with segsz always set, combined with the drivers check, OVS
probably does not need any check on tso_segsz.
I intend to remove this check in the next revision.
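
For readers following along, the two variants under discussion can be
sketched as standalone helpers. The function names are hypothetical; the
first mirrors the condition in the v3 hunk quoted above, the second is only
one possible shape once the zero check is dropped.

```c
#include <stdint.h>

/* Mirrors the v3 hunk: use the guest hint unless it is missing (0) or
 * larger than what the link allows. */
static uint16_t
pick_tso_segsz_v3(uint16_t guest_hint, uint16_t link_tso_segsz)
{
    if (!guest_hint || guest_hint > link_tso_segsz) {
        return link_tso_segsz;
    }
    return guest_hint;
}

/* Possible shape without the zero check (an assumption, not the posted
 * code): only clamp the hint to the link limit. */
static uint16_t
pick_tso_segsz_no_zero_check(uint16_t guest_hint, uint16_t link_tso_segsz)
{
    return guest_hint > link_tso_segsz ? link_tso_segsz : guest_hint;
}
```

The only behavioural difference is the zero hint: the second variant passes
it through unchanged and relies on every setter of RTE_MBUF_F_TX_TCP_SEG
filling tso_segsz, plus the driver checks at the prepare stage.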


-- 
David Marchand



Re: [ovs-dev] [PATCH v3 4/6] netdev-dpdk: Refactor TSO request code.

2024-05-13 Thread David Marchand
Hello Kevin,

Thanks for reviewing.

On Fri, May 10, 2024 at 11:50 PM Kevin Traynor  wrote:
>
> On 19/04/2024 15:06, David Marchand wrote:
> > Replace check on th == NULL with an assert() because dp_packet_l4(pkt)
> > is already used earlier to compute (outer) L3 length.
> >
> > Besides, filling l4_len and tso_segsz only matters to TSO, so there is
> > no need to check for other L4 checksum offloading requests.
> >
> > Signed-off-by: David Marchand 
> > ---
> >  lib/netdev-dpdk.c | 36 +++-
> >  1 file changed, 11 insertions(+), 25 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 8b6a3ed189..661269e4b6 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2584,7 +2584,6 @@ static bool
> >  netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >  {
> >  struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
> > -struct tcp_header *th;
> >
> >  const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
> >   RTE_MBUF_F_TX_L4_MASK |
> > @@ -2614,6 +2613,8 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >  return true;
> >  }
> >
> > +ovs_assert(dp_packet_l4(pkt));
>
> I'm not clear why you want to change this from a warning/return
> fail/drop to an assert ?

From this point in the function, there is at least one request for
checksum offloading pending.
Any L3 (or higher) checksum requested by OVS means that the packet has
been parsed/composed as either IP or IPv6 and packet->l4_ofs was set
to point after the l3 header (with miniflow_extract / *_compose()
helpers).

So getting a NULL pointer for l4 here indicates a bug in OVS.
An assert seems better than a warn/return that probably nobody notice(d).

Did I miss a case where l4_ofs can be unset?
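
The reasoning can be illustrated with a toy model. The struct and helpers
below are simplified stand-ins, not the real struct dp_packet API; the
sketch assumes that an unset offset is stored as UINT16_MAX and that the L4
accessor returns NULL in that case.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct dp_packet: only what this sketch needs. */
struct toy_packet {
    uint8_t *data;
    uint16_t l3_ofs;   /* UINT16_MAX means "not set". */
    uint16_t l4_ofs;   /* UINT16_MAX means "not set". */
};

/* Returns a pointer to the L4 header, or NULL if the packet was never
 * parsed or composed as IP. */
static void *
toy_packet_l4(const struct toy_packet *p)
{
    return p->l4_ofs != UINT16_MAX ? p->data + p->l4_ofs : NULL;
}

/* What a parser or compose helper does once it recognizes an IP packet:
 * record where L3 starts and where it ends. */
static void
toy_packet_set_ip_offsets(struct toy_packet *p, uint16_t l2_len,
                          uint16_t l3_len)
{
    p->l3_ofs = l2_len;
    p->l4_ofs = (uint16_t) (l2_len + l3_len);
}
```

In this model, a checksum request can only be raised after
toy_packet_set_ip_offsets() has run, so a NULL L4 pointer at transmit time
can only come from a programming error, which is the case an assert is
meant to catch.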

>
> Nit: should this be in the previous patch instead ? and I see it is
> removed in a later patch.

It is not supposed to be removed in the series.
The last patch moves it later in the function.


-- 
David Marchand



[ovs-dev] [PATCH] ci: Set platform parameter when building DPDK.

2024-05-07 Thread David Marchand
This change has no impact, since -Dmachine=default gets converted by
DPDK into -Dplatform=generic (since v21.08, see the link to DPDK commit
below). Yet, switch to explicitly setting -Dplatform and avoid the
following warning:

2024-04-18T14:50:16.8001092Z config/meson.build:113: WARNING: The
"machine" option is deprecated. Please use "cpu_instruction_set"
instead.

While at it, fix another warning by calling meson setup explicitly.

2024-04-18T14:50:17.0770596Z WARNING: Running the setup command as
`meson [options]` instead of `meson setup [options]` is ambiguous
and deprecated.

Link: https://git.dpdk.org/dpdk/commit/?id=bf66003b51ec
Signed-off-by: David Marchand 
---
 .ci/dpdk-build.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index 23f3166a54..e1b8e3ccbb 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -25,9 +25,9 @@ function build_dpdk()
 pushd dpdk-src
 fi
 
-# Switching to 'default' machine to make the dpdk cache usable on
+# Switching to 'generic' platform to make the dpdk cache usable on
 # different CPUs. We can't be sure that all CI machines are exactly same.
-DPDK_OPTS="$DPDK_OPTS -Dmachine=default"
+DPDK_OPTS="$DPDK_OPTS -Dplatform=generic"
 
 # Disable building DPDK unit tests. Not needed for OVS build or tests.
 DPDK_OPTS="$DPDK_OPTS -Dtests=false"
@@ -49,7 +49,7 @@ function build_dpdk()
 # Install DPDK using prefix.
 DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR"
 
-meson $DPDK_OPTS build
+meson setup $DPDK_OPTS build
 ninja -C build
 ninja -C build install
 popd
-- 
2.44.0



[ovs-dev] [PATCH v3 6/6] netdev-dpdk: Refactor tunnel checksum offloading.

2024-04-19 Thread David Marchand
All the information required for checksum offloading can be deduced from
the already tracked dp_packet l3_ofs, l4_ofs, inner_l3_ofs and inner_l4_ofs
fields.
Remove DPDK specific l[2-4]_len from generic OVS code.

netdev-dpdk code then fills mbuf specifics step by step:
- outer_l2_len and outer_l3_len are needed for tunneling (and below
  features),
- l2_len and l3_len are needed for IP and L4 checksum (and below features),
- l4_len and tso_segsz are needed when doing TSO.
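
As a reading aid, the steps above can be sketched with plain offsets instead
of pointers. The struct and function names are hypothetical, offsets are
measured from the start of the Ethernet header, and the three branches
follow the l2/l3/l4 assignments in the hunks of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offsets tracked per packet, relative to the Ethernet header. */
struct toy_offsets {
    uint16_t l3_ofs, l4_ofs, inner_l3_ofs, inner_l4_ofs;
};

/* The mbuf length fields derived from those offsets. */
struct toy_lens {
    uint16_t outer_l2_len, outer_l3_len, l2_len, l3_len;
};

static struct toy_lens
toy_fill_lens(const struct toy_offsets *o, bool tunnel_inner_req,
              bool outer_req)
{
    struct toy_lens m = { 0, 0, 0, 0 };
    uint16_t l2, l3, l4;

    if (tunnel_inner_req && outer_req) {
        /* Outer and inner offloads: describe both header stacks. */
        m.outer_l2_len = o->l3_ofs;
        m.outer_l3_len = (uint16_t) (o->l4_ofs - o->l3_ofs);
        /* Inner L2 length must account for the tunnel header length. */
        l2 = o->l4_ofs;
        l3 = o->inner_l3_ofs;
        l4 = o->inner_l4_ofs;
    } else if (tunnel_inner_req) {
        /* Inner offloads only: skip outer headers. */
        l2 = 0;
        l3 = o->inner_l3_ofs;
        l4 = o->inner_l4_ofs;
    } else {
        /* Non tunnel offloading. */
        l2 = 0;
        l3 = o->l3_ofs;
        l4 = o->l4_ofs;
    }

    m.l2_len = (uint16_t) (l3 - l2);
    m.l3_len = (uint16_t) (l4 - l3);
    return m;
}
```

For an IPv4 packet in an IPv4 VXLAN tunnel without VLAN tags (offsets 14,
34, 64 and 84), the first branch yields outer_l2_len=14, outer_l3_len=20,
l2_len=30 and l3_len=20, while the second yields l2_len=64 and l3_len=20
with both outer lengths left at 0.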

Signed-off-by: David Marchand 
---
 lib/dp-packet.h | 37 --
 lib/netdev-dpdk.c   | 35 ++---
 lib/netdev-native-tnl.c | 50 +
 3 files changed, 27 insertions(+), 95 deletions(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 3622764c47..a75b1c5cdb 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -604,25 +604,6 @@ dp_packet_get_nd_payload(const struct dp_packet *b)
 }
 
 #ifdef DPDK_NETDEV
-static inline void
-dp_packet_set_l2_len(struct dp_packet *b, size_t l2_len)
-{
-b->mbuf.l2_len = l2_len;
-}
-
-static inline void
-dp_packet_set_l3_len(struct dp_packet *b, size_t l3_len)
-{
-b->mbuf.l3_len = l3_len;
-}
-
-static inline void
-dp_packet_set_l4_len(struct dp_packet *b, size_t l4_len)
-{
-b->mbuf.l4_len = l4_len;
-}
-
-
 static inline uint64_t *
 dp_packet_ol_flags_ptr(const struct dp_packet *b)
 {
@@ -642,24 +623,6 @@ dp_packet_flow_mark_ptr(const struct dp_packet *b)
 }
 
 #else
-static inline void
-dp_packet_set_l2_len(struct dp_packet *b OVS_UNUSED, size_t l2_len OVS_UNUSED)
-{
-/* There is no implementation. */
-}
-
-static inline void
-dp_packet_set_l3_len(struct dp_packet *b OVS_UNUSED, size_t l3_len OVS_UNUSED)
-{
-/* There is no implementation. */
-}
-
-static inline void
-dp_packet_set_l4_len(struct dp_packet *b OVS_UNUSED, size_t l4_len OVS_UNUSED)
-{
-/* There is no implementation. */
-}
-
 static inline uint32_t *
 dp_packet_ol_flags_ptr(const struct dp_packet *b)
 {
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 1dad2ef833..31dd6f1d5a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2584,6 +2584,9 @@ static bool
 netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 {
 struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
+void *l2;
+void *l3;
+void *l4;
 
 const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
  RTE_MBUF_F_TX_L4_MASK |
@@ -2613,11 +2616,6 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 return true;
 }
 
-ovs_assert(dp_packet_l4(pkt));
-
-/* If packet is vxlan or geneve tunnel packet, calculate outer
- * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
- * before. */
 const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
 if (OVS_UNLIKELY(tunnel_type &&
  tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE &&
@@ -2635,6 +2633,11 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
  (char *) dp_packet_eth(pkt);
 mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
  (char *) dp_packet_l3(pkt);
+
+/* Inner L2 length must account for the tunnel header length. */
+l2 = dp_packet_l4(pkt);
+l3 = dp_packet_inner_l3(pkt);
+l4 = dp_packet_inner_l4(pkt);
 } else {
 /* If no outer offloading is requested, clear outer marks. */
 mbuf->ol_flags &= ~all_outer_marks;
@@ -2642,8 +2645,9 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 mbuf->outer_l3_len = 0;
 
 /* Skip outer headers. */
-mbuf->l2_len += (char *) dp_packet_l4(pkt) -
-(char *) dp_packet_eth(pkt);
+l2 = dp_packet_eth(pkt);
+l3 = dp_packet_inner_l3(pkt);
+l4 = dp_packet_inner_l4(pkt);
 }
 } else {
 if (tunnel_type) {
@@ -2663,22 +2667,27 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 }
 mbuf->outer_l2_len = 0;
 mbuf->outer_l3_len = 0;
-mbuf->l2_len = (char *) dp_packet_l3(pkt) -
-   (char *) dp_packet_eth(pkt);
-mbuf->l3_len = (char *) dp_packet_l4(pkt) -
-   (char *) dp_packet_l3(pkt);
+
+l2 = dp_packet_eth(pkt);
+l3 = dp_packet_l3(pkt);
+l4 = dp_packet_l4(pkt);
 }
 
+ovs_assert(l4);
+
+mbuf->l2_len = (char *) l3 - (char *) l2;
+mbuf->l3_len = (char *) l4 - (char *) l3;
+
 if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
-struct tcp_header *th = dp_packet_l4(pkt);
+struct tcp_header *th = l4;
 uint16_t link_tso_segsz;
  

[ovs-dev] [PATCH v3 5/6] netdev-dpdk: Use guest TSO segmentation size hint.

2024-04-19 Thread David Marchand
In a typical setup like:
guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B

TSO packets from guest A are segmented against the OVS A physical port
mtu adjusted by the vxlan tunnel header size, regardless of guest A
interface mtu.

As an example, let's say guest A and guest B mtu are set to 1500 bytes.
OVS A and OVS B physical ports mtu are set to 1600 bytes.
Guest A will request TCP segmentation for 1448 bytes segments.
On the other hand, OVS A will request 1498 bytes segments to the HW.
This results in OVS B dropping packets because decapsulated packets
are larger than the vhost-user port (serving guest B) mtu.

2024-04-17T14:13:01.239Z|2|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
Too big size 1564 max_packet_len 1518
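
The figures in this example can be reproduced with a short sketch. The
header sizes are assumptions chosen for the illustration (IPv4 inner and
outer headers, no VLAN tags, a 32-byte TCP header carrying the timestamp
option); they are not stated in the commit log, and the helper names are
made up.

```c
#include <stdint.h>

/* Header sizes assumed for this illustration. */
enum {
    ETH_HDR_LEN = 14,
    ETH_CRC_LEN = 4,
    IPV4_HDR_LEN = 20,
    UDP_HDR_LEN = 8,
    VXLAN_HDR_LEN = 8,
    TCP_HDR_LEN = 32     /* 20 bytes plus 12 bytes of options. */
};

/* Segment size requested by the guest: MTU minus IP and TCP headers. */
static uint16_t
guest_tso_segsz(uint16_t guest_mtu)
{
    return (uint16_t) (guest_mtu - IPV4_HDR_LEN - TCP_HDR_LEN);
}

/* Segment size derived from the physical port MTU for a tunneled packet:
 * dev->mtu - l2_len - l3_len - l4_len - outer_l3_len, where l2_len spans
 * the UDP, VXLAN and inner Ethernet headers. */
static uint16_t
link_tso_segsz(uint16_t port_mtu)
{
    uint16_t l2_len = UDP_HDR_LEN + VXLAN_HDR_LEN + ETH_HDR_LEN;

    return (uint16_t) (port_mtu - l2_len - IPV4_HDR_LEN - TCP_HDR_LEN
                       - IPV4_HDR_LEN);
}

/* Length of one segment as seen by OVS B after decapsulation. */
static uint32_t
decap_frame_len(uint16_t segsz)
{
    return (uint32_t) (ETH_HDR_LEN + IPV4_HDR_LEN + TCP_HDR_LEN + segsz);
}

/* Largest frame accepted by a port with the given MTU. */
static uint32_t
max_packet_len(uint16_t mtu)
{
    return (uint32_t) (mtu + ETH_HDR_LEN + ETH_CRC_LEN);
}
```

With these assumptions, a 1500-byte guest MTU gives 1448-byte segments, a
1600-byte port MTU gives 1498-byte segments, and a decapsulated 1498-byte
segment is 1564 bytes long, which exceeds the 1518-byte limit reported in
the warning. A 1448-byte segment would be 1514 bytes and would fit.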

vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
Use it as a hint.

This may result in segments (on the wire) slightly shorter than the
optimal size.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
Signed-off-by: David Marchand 
---
Note:
As we trust the guest with this change, should we put a lower limit on
mbuf->tso_segsz?

---
 lib/netdev-dpdk.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 661269e4b6..1dad2ef833 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 
 if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
 struct tcp_header *th = dp_packet_l4(pkt);
+uint16_t link_tso_segsz;
 int hdr_len;
 
 if (tunnel_type) {
-mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
-  mbuf->l4_len - mbuf->outer_l3_len;
+link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
+ mbuf->l4_len - mbuf->outer_l3_len;
 } else {
 mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
-mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
+link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
+}
+
+if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {
+mbuf->tso_segsz = link_tso_segsz;
 }
 
 hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
-- 
2.44.0



[ovs-dev] [PATCH v3 4/6] netdev-dpdk: Refactor TSO request code.

2024-04-19 Thread David Marchand
Replace check on th == NULL with an assert() because dp_packet_l4(pkt)
is already used earlier to compute (outer) L3 length.

Besides, filling l4_len and tso_segsz only matters to TSO, so there is
no need to check for other L4 checksum offloading requests.

Signed-off-by: David Marchand 
---
 lib/netdev-dpdk.c | 36 +++-
 1 file changed, 11 insertions(+), 25 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8b6a3ed189..661269e4b6 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2584,7 +2584,6 @@ static bool
 netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 {
 struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
-struct tcp_header *th;
 
 const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
  RTE_MBUF_F_TX_L4_MASK |
@@ -2614,6 +2613,8 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 return true;
 }
 
+ovs_assert(dp_packet_l4(pkt));
+
 /* If packet is vxlan or geneve tunnel packet, calculate outer
  * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
  * before. */
@@ -2667,22 +2668,10 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 mbuf->l3_len = (char *) dp_packet_l4(pkt) -
(char *) dp_packet_l3(pkt);
 }
-th = dp_packet_l4(pkt);
 
 if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
-if (!th) {
-VLOG_WARN_RL(&rl, "%s: TCP Segmentation without L4 header"
- " pkt len: %"PRIu32"", dev->up.name, mbuf->pkt_len);
-return false;
-}
-}
-
-if ((mbuf->ol_flags & RTE_MBUF_F_TX_L4_MASK) == RTE_MBUF_F_TX_TCP_CKSUM) {
-if (!th) {
-VLOG_WARN_RL(&rl, "%s: TCP offloading without L4 header"
- " pkt len: %"PRIu32"", dev->up.name, mbuf->pkt_len);
-return false;
-}
+struct tcp_header *th = dp_packet_l4(pkt);
+int hdr_len;
 
 if (tunnel_type) {
 mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
@@ -2692,16 +2681,13 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
 }
 
-if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
-int hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
-if (OVS_UNLIKELY((hdr_len +
-  mbuf->tso_segsz) > dev->max_packet_len)) {
-VLOG_WARN_RL(&rl, "%s: Oversized TSO packet. hdr: %"PRIu32", "
- "gso: %"PRIu32", max len: %"PRIu32"",
- dev->up.name, hdr_len, mbuf->tso_segsz,
- dev->max_packet_len);
-return false;
-}
+hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
+if (OVS_UNLIKELY((hdr_len + mbuf->tso_segsz) > dev->max_packet_len)) {
+VLOG_WARN_RL(&rl, "%s: Oversized TSO packet. hdr: %"PRIu32", "
+ "gso: %"PRIu32", max len: %"PRIu32"",
+ dev->up.name, hdr_len, mbuf->tso_segsz,
+ dev->max_packet_len);
+return false;
 }
 }
 
-- 
2.44.0



[ovs-dev] [PATCH v3 3/6] netdev-dpdk: Fix inner checksum when outer is not supported.

2024-04-19 Thread David Marchand
If outer checksum is not supported and OVS already set L3/L4 outer
checksums in the packet, no outer mark should be left in ol_flags
(as it confuses some drivers, like net/ixgbe).

l2_len must be adjusted to account for the tunnel header.

Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: David Marchand 
---
 lib/netdev-dpdk.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f732716141..8b6a3ed189 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2629,10 +2629,21 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 }
 
 if (tunnel_type && (mbuf->ol_flags & all_inner_requests)) {
-mbuf->outer_l2_len = (char *) dp_packet_l3(pkt) -
- (char *) dp_packet_eth(pkt);
-mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
- (char *) dp_packet_l3(pkt);
+if (mbuf->ol_flags & all_outer_requests) {
+mbuf->outer_l2_len = (char *) dp_packet_l3(pkt) -
+ (char *) dp_packet_eth(pkt);
+mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
+ (char *) dp_packet_l3(pkt);
+} else {
+/* If no outer offloading is requested, clear outer marks. */
+mbuf->ol_flags &= ~all_outer_marks;
+mbuf->outer_l2_len = 0;
+mbuf->outer_l3_len = 0;
+
+/* Skip outer headers. */
+mbuf->l2_len += (char *) dp_packet_l4(pkt) -
+(char *) dp_packet_eth(pkt);
+}
 } else {
 if (tunnel_type) {
 /* No inner offload is requested, fallback to non tunnel
-- 
2.44.0



[ovs-dev] [PATCH v3 1/6] netdev-dpdk: Fallback to non tunnel checksum offloading.

2024-04-19 Thread David Marchand
The outer checksum offloading API in DPDK is ambiguous and was
implemented by Intel folks in their drivers with the assumption that
any outer offloading always goes with an inner offloading request.

With net/i40e and net/ice drivers, in the case of encapsulating an ARP
packet in a vxlan tunnel (which results in requesting outer ip checksum
with a tunnel context but no inner offloading request), a Tx failure is
triggered, associated with a port MDD event.
2024-03-27T16:02:07.084Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR:
MDD event

To avoid this situation, if no checksum or segmentation offloading is
requested on the inner part of a packet, fall back to a "normal" (non outer)
offloading request.
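
A minimal sketch of what such a fallback could look like follows. The hunk
as archived here is cut off before the conversion code, so this is an
assumption based on the description above and on the changelog note about
resetting inner marks before converting outer requests. The F_* bits are
stand-ins defined for the illustration; real code uses the
RTE_MBUF_F_TX_* flags.

```c
#include <stdint.h>

/* Stand-in flag bits for illustration only. */
#define F_IP_CKSUM         (UINT64_C(1) << 0)
#define F_UDP_CKSUM        (UINT64_C(1) << 1)
#define F_TCP_SEG          (UINT64_C(1) << 2)
#define F_IPV4             (UINT64_C(1) << 3)
#define F_IPV6             (UINT64_C(1) << 4)
#define F_OUTER_IP_CKSUM   (UINT64_C(1) << 5)
#define F_OUTER_UDP_CKSUM  (UINT64_C(1) << 6)
#define F_OUTER_IPV4       (UINT64_C(1) << 7)
#define F_OUTER_IPV6       (UINT64_C(1) << 8)
#define F_TUNNEL           (UINT64_C(1) << 9)

#define ALL_INNER_REQUESTS (F_IP_CKSUM | F_UDP_CKSUM | F_TCP_SEG)
#define ALL_OUTER_REQUESTS (F_OUTER_IP_CKSUM | F_OUTER_UDP_CKSUM)
#define ALL_INNER_MARKS    (F_IPV4 | F_IPV6)
#define ALL_OUTER_MARKS    (F_OUTER_IPV4 | F_OUTER_IPV6 | F_TUNNEL)

/* Hypothetical helper: when a tunneled packet carries no inner request,
 * express the outer requests with the non tunnel flags instead. */
static uint64_t
toy_fallback_to_non_tunnel(uint64_t ol_flags)
{
    if (!(ol_flags & F_TUNNEL) || (ol_flags & ALL_INNER_REQUESTS)) {
        return ol_flags;    /* Not tunneled, or inner offloads wanted. */
    }

    /* Reset inner marks before converting outer requests. */
    ol_flags &= ~ALL_INNER_MARKS;

    if (ol_flags & F_OUTER_IP_CKSUM) {
        ol_flags |= F_IP_CKSUM | F_IPV4;
    }
    if (ol_flags & F_OUTER_UDP_CKSUM) {
        ol_flags |= F_UDP_CKSUM;
        ol_flags |= (ol_flags & F_OUTER_IPV4) ? F_IPV4 : F_IPV6;
    }

    /* The outer headers are now described as a plain packet. */
    ol_flags &= ~(ALL_OUTER_REQUESTS | ALL_OUTER_MARKS);
    return ol_flags;
}
```

For the ARP-in-VXLAN case described above (outer IP checksum, outer IPv4
mark and tunnel type, no inner request), the result is a plain IPv4 header
checksum request with no tunnel context.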

Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Fixes: f81d782c1906 ("netdev-native-tnl: Mark all vxlan/geneve packets as tunneled.")
Signed-off-by: David Marchand 
---
Changes since v2:
- kept offloads disabled for net/i40e and net/ice as this patch does not
  fix outer udp checksum (a DPDK fix is required),
- updated commitlog with details to reproduce the issue,
- adjusted indent,

Changes since v1:
- reset inner marks before converting outer requests,
- fixed some coding style,

---
 lib/netdev-dpdk.c | 71 +++
 1 file changed, 41 insertions(+), 30 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2111f77681..7e109903c0 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2584,16 +2584,18 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
 struct tcp_header *th;
 
-const uint64_t all_requests = (RTE_MBUF_F_TX_IP_CKSUM |
-   RTE_MBUF_F_TX_L4_MASK  |
-   RTE_MBUF_F_TX_OUTER_IP_CKSUM  |
-   RTE_MBUF_F_TX_OUTER_UDP_CKSUM |
-   RTE_MBUF_F_TX_TCP_SEG);
-const uint64_t all_marks = (RTE_MBUF_F_TX_IPV4 |
-RTE_MBUF_F_TX_IPV6 |
-RTE_MBUF_F_TX_OUTER_IPV4 |
-RTE_MBUF_F_TX_OUTER_IPV6 |
-RTE_MBUF_F_TX_TUNNEL_MASK);
+const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
+ RTE_MBUF_F_TX_L4_MASK |
+ RTE_MBUF_F_TX_TCP_SEG);
+const uint64_t all_outer_requests = (RTE_MBUF_F_TX_OUTER_IP_CKSUM |
+ RTE_MBUF_F_TX_OUTER_UDP_CKSUM);
+const uint64_t all_requests = all_inner_requests | all_outer_requests;
+const uint64_t all_inner_marks = (RTE_MBUF_F_TX_IPV4 |
+  RTE_MBUF_F_TX_IPV6);
+const uint64_t all_outer_marks = (RTE_MBUF_F_TX_OUTER_IPV4 |
+  RTE_MBUF_F_TX_OUTER_IPV6 |
+  RTE_MBUF_F_TX_TUNNEL_MASK);
+const uint64_t all_marks = all_inner_marks | all_outer_marks;
 
 if (!(mbuf->ol_flags & all_requests)) {
 /* No offloads requested, no marks should be set. */
@@ -2614,34 +2616,43 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
  * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
  * before. */
 const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
-if (tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
-tunnel_type == RTE_MBUF_F_TX_TUNNEL_VXLAN) {
-mbuf->outer_l2_len = (char *) dp_packet_l3(pkt) -
- (char *) dp_packet_eth(pkt);
-mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
- (char *) dp_packet_l3(pkt);
-
-/* If neither inner checksums nor TSO is requested, inner marks
- * should not be set. */
-if (!(mbuf->ol_flags & (RTE_MBUF_F_TX_IP_CKSUM |
-RTE_MBUF_F_TX_L4_MASK  |
-RTE_MBUF_F_TX_TCP_SEG))) {
-mbuf->ol_flags &= ~(RTE_MBUF_F_TX_IPV4 |
-RTE_MBUF_F_TX_IPV6);
-}
-} else if (OVS_UNLIKELY(tunnel_type)) {
+if (OVS_UNLIKELY(tunnel_type &&
+ tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE &&
+ tunnel_type != RTE_MBUF_F_TX_TUNNEL_VXLAN)) {
 VLOG_WARN_RL(&rl, "%s: Unexpected tunnel type: %#"PRIx64,
  netdev_get_name(&dev->up), tunnel_type);
 netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
   "Packet with unexpected tunnel type", mbuf);
 return false;
+}
+
+if (tunnel_type && (mbuf->ol_flags & all_inner_requests)) {
+mbuf->outer_l2_len =

[ovs-dev] [PATCH v3 2/6] netdev-dpdk: Disable outer UDP checksum for net/iavf.

2024-04-19 Thread David Marchand
Same as the commit 6f93d8e62f13 ("netdev-dpdk: Disable outer UDP checksum
offload for ice/i40e driver."), disable outer UDP checksum and related
offloads for net/iavf.
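
As the list of affected drivers grows, the chained strcmp() calls could
also be written as a table lookup. This is only a sketch of an alternative
layout, with a hypothetical helper name; the patch itself keeps the chained
comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the driver advertises outer UDP checksum
 * offload but is known not to handle it correctly. */
static bool
toy_needs_outer_udp_cksum_workaround(const char *driver_name)
{
    static const char *const drivers[] = {
        "net_ice",
        "net_i40e",
        "net_iavf",
    };

    for (size_t i = 0; i < sizeof drivers / sizeof drivers[0]; i++) {
        if (!strcmp(driver_name, drivers[i])) {
            return true;
        }
    }
    return false;
}
```

A caller would clear the outer UDP checksum and tunnel TSO capability bits
whenever the helper returns true, exactly as the existing branch does.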

Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: David Marchand 
---
Note:
- DPDK (in progress) fixes can be found at:
  https://patchwork.dpdk.org/project/dpdk/list/?series=31780&state=*

---
 lib/netdev-dpdk.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 7e109903c0..f732716141 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1355,12 +1355,14 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 }
 
 if (!strcmp(info.driver_name, "net_ice")
-|| !strcmp(info.driver_name, "net_i40e")) {
+|| !strcmp(info.driver_name, "net_i40e")
+|| !strcmp(info.driver_name, "net_iavf")) {
 /* FIXME: Driver advertises the capability but doesn't seem
  * to actually support it correctly.  Can remove this once
  * the driver is fixed on DPDK side. */
 VLOG_INFO("%s: disabled Tx outer udp checksum offloads for a "
-  "net/ice or net/i40e port.", netdev_get_name(&dev->up));
+  "net/ice, net/i40e or net/iavf port.",
+  netdev_get_name(&dev->up));
 info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM;
 info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO;
 info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
-- 
2.44.0



Re: [ovs-dev] [PATCH v1] nedev-dpdk: Fix config with dpdk net_bonding offloads.

2024-04-12 Thread David Marchand
Hello,

On Fri, Apr 12, 2024 at 8:30 AM Jun Wang  wrote:
>
> If it's a DPDK net_bonding, it may cause
> offload-related configurations to take effect,
> leading to offload failure.

I have not looked at the patch yet.

What is the benefit of using a net/bonding DPDK port when OVS has
native bonding support?
I am not familiar with OVN setups so maybe I am missing something on this side.


-- 
David Marchand



Re: [ovs-dev] [PATCH v2] netdev-dpdk: Fallback to non tunnel offloading API.

2024-04-08 Thread David Marchand
On Fri, Apr 5, 2024 at 3:00 PM Ilya Maximets  wrote:
> >>
> >>> Basically, resolving a neighbor with ARP inside tunnels is broken on
> >>> Intel nics (even without re-enabling outer udp checksum).
> >>> This can be observed with the following debug patch at the end of
> >>> netdev_dpdk_prep_hwol_packet():
> >>>
> >>> +char buf[256];
> >>> +if (rte_get_tx_ol_flag_list(mbuf->ol_flags, buf, sizeof(buf)) != 0)
> >>> +buf[0] = '\0';
> >>> +VLOG_WARN("len=%u, ol_flags=%s, outer_l2_len=%u, outer_l3_len=%u,
> >>> l2_len=%u, l3_len=%u, l4_len=%u, tso_segsz=%u", mbuf->pkt_len, buf,
> >>> mbuf->outer_l2_len, mbuf->outer_l3_len, mbuf->l2_len, mbuf->l3_len,
> >>> mbuf->l4_len, mbuf->tso_segsz);
> >>>
> >>> Then doing a "arping" inside the tunnel triggers:
> >>> 2024-04-03T16:05:40.920Z|00014|netdev_dpdk(pmd-c03/id:8)|WARN|len=96,
> >>> ol_flags=RTE_MBUF_F_TX_L4_NO_CKSUM RTE_MBUF_F_TX_OUTER_IP_CKSUM
> >>> RTE_MBUF_F_TX_OUTER_IPV4 RTE_MBUF_F_TX_TUNNEL_VXLAN , outer_l2_len=18,
> >>> outer_l3_len=20, l2_len=0, l3_len=0, l4_len=0, tso_segsz=0
>
> The fact that l2_len and l3_len are not set here looks like an OVS
> bug though, as AFAIU, these should always be set if any Tx offload
> is requested.

The commit that introduces such Tx offloads requests is:
f81d782c19 - netdev-native-tnl: Mark all vxlan/geneve packets as
tunneled. (7 weeks ago) 


-- 
David Marchand



Re: [ovs-dev] [PATCH v2] netdev-dpdk: Fallback to non tunnel offloading API.

2024-04-04 Thread David Marchand
On Wed, Apr 3, 2024 at 8:13 PM Ilya Maximets  wrote:
> > - This patch fixes some misusage of the DPDK API.
>
> Hmm, I understand that the driver does something funny when it gets
> outer flags set without any inner flags, but how is that a misuse
> of the DPDK API?  Could you point me to the API docs that say that
> inner flags must always be set in this case or that setting only
> outer offloads is not allowed?

Setting the tunnel type (which OVS sets along with the outer checksum) is
described as:

/**
 * Bits 45:48 used for the tunnel type.
 * The tunnel type must be specified for TSO or checksum on the inner part
 * of tunnel packets.
 * These flags can be used with RTE_MBUF_F_TX_TCP_SEG for TSO, or
 * RTE_MBUF_F_TX_xxx_CKSUM.
 * The mbuf fields for inner and outer header lengths are required:
 * outer_l2_len, outer_l3_len, l2_len, l3_len, l4_len and tso_segsz for TSO.
 */
#define RTE_MBUF_F_TX_TUNNEL_VXLAN   (0x1ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_GRE (0x2ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_IPIP(0x3ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_GENEVE  (0x4ULL << 45)
/** TX packet with MPLS-in-UDP RFC 7510 header. */
#define RTE_MBUF_F_TX_TUNNEL_MPLSINUDP (0x5ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_VXLAN_GPE (0x6ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_GTP   (0x7ULL << 45)
#define RTE_MBUF_F_TX_TUNNEL_ESP   (0x8ULL << 45)

It is not specified what to expect if neither TSO nor inner checksum
is requested.

In the same way, it is not described what to expect if the outer API is
called with no inner offload.
Adding Ferruh and Thomas, who may have an opinion.
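
To make the undocumented case concrete, here is a minimal sketch of a
predicate that detects it. It uses local stand-ins instead of the DPDK
headers: the two tunnel values are copied from the header quoted above,
while TX_TUNNEL_MASK, the inner request bit positions and the helper name
tunnel_without_inner_request() are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch builds without DPDK.  The tunnel values
 * come from the header quoted above; the mask and the inner request bits
 * are assumptions made for illustration. */
#define TX_TUNNEL_VXLAN   (0x1ULL << 45)
#define TX_TUNNEL_GENEVE  (0x4ULL << 45)
#define TX_TUNNEL_MASK    (0xFULL << 45)   /* Assumed: bits 45:48. */
#define TX_TCP_SEG        (1ULL << 50)     /* Assumed bit position. */
#define TX_L4_MASK        (3ULL << 52)     /* Assumed bit position. */
#define TX_IP_CKSUM       (1ULL << 54)     /* Assumed bit position. */

#define ALL_INNER_REQUESTS (TX_IP_CKSUM | TX_L4_MASK | TX_TCP_SEG)

/* Returns true for the combination the documentation does not describe:
 * a tunnel type is set, but neither TSO nor an inner checksum is
 * requested. */
static bool
tunnel_without_inner_request(uint64_t ol_flags)
{
    return (ol_flags & TX_TUNNEL_MASK) && !(ol_flags & ALL_INNER_REQUESTS);
}
```

A VXLAN packet carrying ARP, as in the arping trace discussed in this
thread, would typically fall in this category since there is nothing to
checksum or segment in the inner part.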


>
> I agree that it seems safer to just downgrade all outer flags to
> inner ones on OVS side, when no inner offloads are requested, I'm
> just not sure I agree that it's an API misuse.  Especially since
> non-Intel cards seem to work fine.

I suppose you mean mlx5.
Has it been tested on other nics?


>
> > Basically, resolving a neighbor with ARP inside tunnels is broken on
> > Intel nics (even without re-enabling outer udp checksum).
> > This can be observed with the following debug patch at the end of
> > netdev_dpdk_prep_hwol_packet():
> >
> > +char buf[256];
> > +if (rte_get_tx_ol_flag_list(mbuf->ol_flags, buf, sizeof(buf)) != 0)
> > +buf[0] = '\0';
> > +VLOG_WARN("len=%u, ol_flags=%s, outer_l2_len=%u, outer_l3_len=%u,
> > l2_len=%u, l3_len=%u, l4_len=%u, tso_segsz=%u", mbuf->pkt_len, buf,
> > mbuf->outer_l2_len, mbuf->outer_l3_len, mbuf->l2_len, mbuf->l3_len,
> > mbuf->l4_len, mbuf->tso_segsz);
> >
> > Then doing a "arping" inside the tunnel triggers:
> > 2024-04-03T16:05:40.920Z|00014|netdev_dpdk(pmd-c03/id:8)|WARN|len=96,
> > ol_flags=RTE_MBUF_F_TX_L4_NO_CKSUM RTE_MBUF_F_TX_OUTER_IP_CKSUM
> > RTE_MBUF_F_TX_OUTER_IPV4 RTE_MBUF_F_TX_TUNNEL_VXLAN , outer_l2_len=18,
> > outer_l3_len=20, l2_len=0, l3_len=0, l4_len=0, tso_segsz=0
> > 2024-04-03T16:05:40.920Z|00012|dpdk|WARN|ice_interrupt_handler():
> > OICR: MDD event
> >
> > We need this fix in OVS regardless of the outer udp checksum issue.
> > I'll respin this fix in a new series, without touching UDP checksum capa.
> >
> >
> > - It does seem that X710 nics have no support for outer udp checksum
> > (according to its datasheet). Some X722 version may have support for
> > it, but net/i40e does not configure the Tx descriptor accordingly.
> > On the other hand, E810 ones seem fine (according to its datasheet).
> >
> > After more debugging, I managed to get outer udp checksum working.
> > I understand the DPDK rte_net_intel_cksum_flags_prepare() helper does
> > not set the pseudo header checksum in the outer udp header.
> > I proposed a fix in the dpdk bz.
> >
> > Waiting for the fix on DPDK side... it is still possible to add the
> > missing bits in OVS (see the branch I pointed at in the OVS issue).
>
> Since this feature never worked with ice in OVS and it is experimental,
> I tend to think that we should just disable it for ice as well until
> DPDK is fixed.
>
> A little too many fixes for that thing we have already and this one will
> involve some extra driver-specific logic that we don't have any automated
> tests for.

I don't mind waiting for the DPDK fix before re-enabling outer udp and
other offloads.


-- 
David Marchand



Re: [ovs-dev] [PATCH v2] netdev-dpdk: Fallback to non tunnel offloading API.

2024-04-03 Thread David Marchand
On Thu, Mar 28, 2024 at 10:16 AM David Marchand
 wrote:
>
> The outer checksum offloading API in DPDK is ambiguous and was
> added by Intel folks with the assumption that any outer offloading
> always goes with an inner offloading request.
>
> With net/i40e and net/ice drivers, requesting outer ip checksum with a
> tunnel context but no inner offloading request triggers a Tx failure
> associated with a port MDD event.
> 2024-03-27T16:02:07.084Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR:
> MDD event
>
> To avoid this situation, if no checksum or segmentation offloading is
> requested on the inner part of a packet, fallback to "normal" (non outer)
> offloading request.
> And outer offloading can be re-enabled for net/i40e and net/ice.
>
> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
> Signed-off-by: David Marchand 
> ---
> Changes since v1:
> - reset inner marks before converting outer requests,
> - fixed some coding style,
>
> ---
>  lib/netdev-dpdk.c | 83 ---
>  1 file changed, 43 insertions(+), 40 deletions(-)
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 2111f77681..ae43594a3d 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1354,18 +1354,6 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>  info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
>  }
>
> -if (!strcmp(info.driver_name, "net_ice")
> -|| !strcmp(info.driver_name, "net_i40e")) {
> -/* FIXME: Driver advertises the capability but doesn't seem
> - * to actually support it correctly.  Can remove this once
> - * the driver is fixed on DPDK side. */
> -VLOG_INFO("%s: disabled Tx outer udp checksum offloads for a "
> -  "net/ice or net/i40e port.", netdev_get_name(&dev->up));
> -info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM;
> -info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO;
> -info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
> -}
> -

A few comments after spending some time on the topic.


- This patch fixes some misuse of the DPDK API.
Basically, resolving a neighbor with ARP inside tunnels is broken on
Intel nics (even without re-enabling outer udp checksum).
This can be observed with the following debug patch at the end of
netdev_dpdk_prep_hwol_packet():

+char buf[256];
+if (rte_get_tx_ol_flag_list(mbuf->ol_flags, buf, sizeof(buf)) != 0)
+buf[0] = '\0';
+VLOG_WARN("len=%u, ol_flags=%s, outer_l2_len=%u, outer_l3_len=%u,
l2_len=%u, l3_len=%u, l4_len=%u, tso_segsz=%u", mbuf->pkt_len, buf,
mbuf->outer_l2_len, mbuf->outer_l3_len, mbuf->l2_len, mbuf->l3_len,
mbuf->l4_len, mbuf->tso_segsz);

Then running "arping" inside the tunnel triggers:
2024-04-03T16:05:40.920Z|00014|netdev_dpdk(pmd-c03/id:8)|WARN|len=96,
ol_flags=RTE_MBUF_F_TX_L4_NO_CKSUM RTE_MBUF_F_TX_OUTER_IP_CKSUM
RTE_MBUF_F_TX_OUTER_IPV4 RTE_MBUF_F_TX_TUNNEL_VXLAN , outer_l2_len=18,
outer_l3_len=20, l2_len=0, l3_len=0, l4_len=0, tso_segsz=0
2024-04-03T16:05:40.920Z|00012|dpdk|WARN|ice_interrupt_handler():
OICR: MDD event

We need this fix in OVS regardless of the outer udp checksum issue.
I'll respin this fix in a new series, without touching UDP checksum capa.


- It does seem that X710 nics have no support for outer udp checksum
(according to their datasheet). Some X722 versions may have support for
it, but net/i40e does not configure the Tx descriptor accordingly.
On the other hand, E810 ones seem fine (according to their datasheet).

After more debugging, I managed to get outer udp checksum working.
I understand the DPDK rte_net_intel_cksum_flags_prepare() helper does
not set the pseudo header checksum in the outer udp header.
I proposed a fix in the dpdk bz.

Waiting for the fix on DPDK side... it is still possible to add the
missing bits in OVS (see the branch I pointed at in the OVS issue).


- About the workaround (disabling outer udp checksum for net/ice and
net/i40e), net/iavf is subject to the same bugs.
So we should disable outer udp checksum for this driver too.

However, I am not sure the iavf driver differentiates (or even can
differentiate) which PF / hw is used underneath.
So we may have no solution but to always disable this type of
offloading in OVS for net/iavf.



-- 
David Marchand



Re: [ovs-dev] Request for Source File Location: 'ovs-vswitchd.conf.db' MAN Page

2024-03-28 Thread David Marchand
Hello,

On Thu, Mar 28, 2024 at 11:13 AM Farhan Tariq  wrote:
>
> I'm adding a new feature to OvS and need to update the 'ovs-vswitchd.conf.db' 
> MAN page. However, I'm unable to locate the source file for this MAN page to 
> add the necessary information about the new feature.
>
> Could you please provide the exact file path or location for the 
> 'ovs-vswitchd.conf.db' MAN page?

You are probably looking for vswitchd/vswitch.xml.
The man page is generated from the content of vswitchd/vswitch.xml (and
vswitchd/vswitch.ovsschema).


-- 
David Marchand



[ovs-dev] [PATCH v2] netdev-dpdk: Fallback to non tunnel offloading API.

2024-03-28 Thread David Marchand
The outer checksum offloading API in DPDK is ambiguous and was
added by Intel folks with the assumption that any outer offloading
always goes with an inner offloading request.

With net/i40e and net/ice drivers, requesting outer ip checksum with a
tunnel context but no inner offloading request triggers a Tx failure
associated with a port MDD event.
2024-03-27T16:02:07.084Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR:
MDD event

To avoid this situation, if no checksum or segmentation offloading is
requested on the inner part of a packet, fallback to "normal" (non outer)
offloading request.
And outer offloading can be re-enabled for net/i40e and net/ice.

Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: David Marchand 
---
Changes since v1:
- reset inner marks before converting outer requests,
- fixed some coding style,

---
 lib/netdev-dpdk.c | 83 ---
 1 file changed, 43 insertions(+), 40 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2111f77681..ae43594a3d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1354,18 +1354,6 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
 }
 
-if (!strcmp(info.driver_name, "net_ice")
-|| !strcmp(info.driver_name, "net_i40e")) {
-/* FIXME: Driver advertises the capability but doesn't seem
- * to actually support it correctly.  Can remove this once
- * the driver is fixed on DPDK side. */
-VLOG_INFO("%s: disabled Tx outer udp checksum offloads for a "
-  "net/ice or net/i40e port.", netdev_get_name(&dev->up));
-info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM;
-info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO;
-info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
-}
-
 if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_IPV4_CKSUM) {
 dev->hw_ol_features |= NETDEV_TX_IPV4_CKSUM_OFFLOAD;
 } else {
@@ -2584,16 +2572,18 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
struct rte_mbuf *mbuf)
 struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
 struct tcp_header *th;
 
-const uint64_t all_requests = (RTE_MBUF_F_TX_IP_CKSUM |
-   RTE_MBUF_F_TX_L4_MASK  |
-   RTE_MBUF_F_TX_OUTER_IP_CKSUM  |
-   RTE_MBUF_F_TX_OUTER_UDP_CKSUM |
-   RTE_MBUF_F_TX_TCP_SEG);
-const uint64_t all_marks = (RTE_MBUF_F_TX_IPV4 |
-RTE_MBUF_F_TX_IPV6 |
-RTE_MBUF_F_TX_OUTER_IPV4 |
-RTE_MBUF_F_TX_OUTER_IPV6 |
-RTE_MBUF_F_TX_TUNNEL_MASK);
+const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
+ RTE_MBUF_F_TX_L4_MASK |
+ RTE_MBUF_F_TX_TCP_SEG);
+const uint64_t all_outer_requests = (RTE_MBUF_F_TX_OUTER_IP_CKSUM  |
+  RTE_MBUF_F_TX_OUTER_UDP_CKSUM);
+const uint64_t all_requests = all_inner_requests | all_outer_requests;
+const uint64_t all_inner_marks = (RTE_MBUF_F_TX_IPV4 |
+  RTE_MBUF_F_TX_IPV6);
+const uint64_t all_outer_marks = (RTE_MBUF_F_TX_OUTER_IPV4 |
+  RTE_MBUF_F_TX_OUTER_IPV6 |
+  RTE_MBUF_F_TX_TUNNEL_MASK);
+const uint64_t all_marks = all_inner_marks | all_outer_marks;
 
 if (!(mbuf->ol_flags & all_requests)) {
 /* No offloads requested, no marks should be set. */
@@ -2610,32 +2600,45 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
struct rte_mbuf *mbuf)
 return true;
 }
 
+const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
+if (OVS_UNLIKELY(tunnel_type
+ && tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE
+ && tunnel_type != RTE_MBUF_F_TX_TUNNEL_VXLAN)) {
+VLOG_WARN_RL(&rl, "%s: Unexpected tunnel type: %#"PRIx64,
+ netdev_get_name(&dev->up), tunnel_type);
+netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
+  "Packet with unexpected tunnel type", mbuf);
+return false;
+}
+
 /* If packet is vxlan or geneve tunnel packet, calculate outer
  * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
  * before. */
-const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
-if (tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
-tunnel_type == RTE_MBUF_F_TX_TUNNEL_VXLAN) {
+if ((tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENE

Re: [ovs-dev] [PATCH] netdev-dpdk: Fallback to non tunnel offloading API.

2024-03-28 Thread David Marchand
On Thu, Mar 28, 2024 at 5:40 AM junwan...@cestc.cn  wrote:
>
> I validated this modification on my x710 network card, but I found that
> the outer UDP checksum of the transmitted packets is incorrect, leading
> to communication abnormalities. I think it's necessary to disable the outer
> UDP checksum because although the capability reported by DPDK
> indicates support, in reality, the hardware doesn't actually support 
> offloading,
> resulting in outer UDP checksum errors.
>
> tx_geneve_tso_offload="false", tx_ip_csum_offload="true", 
> tx_out_ip_csum_offload="true",
> tx_out_udp_csum_offload="true", tx_sctp_csum_offload="true", 
> tx_tcp_csum_offload="true",
> tx_tcp_seg_offload="false", tx_udp_csum_offload="true", 
> tx_vxlan_tso_offload="false"

Well, good timing, thanks for the report.

I was testing ipv6 in ipv4 (which seemed to work) and I realised
something is wrong in the outer -> inner conversion by looking at the
ol_flags in my debug prints.
Now, trying the opposite (ipv4 in ipv6), I think I reproduced your
issue with an E810 nic:

04:50:46.211854 50:7c:6f:3c:0c:26 > 50:7c:6f:3c:10:5a, ethertype IPv6
(0x86dd), length 168: (hlim 64, next-header UDP (17) payload length:
114) 2001:4e48::2.39854 > 2001::1.geneve: [bad udp cksum 0x89ad ->
0xfadd!] Geneve, Flags [none], vni 0x0, proto TEB (0x6558)
52:54:00:00:11:01 > 4e:a9:1d:ce:85:4a, ethertype IPv4 (0x0800),
length 98: (tos 0x0, ttl 64, id 9408, offset 0, flags [DF], proto ICMP
(1), length 84)
172.31.22.2 > 172.31.22.1: ICMP echo request, id 1442, seq 9, length 64

Please give v2 a try (I'll post it soon).

If it still fails, can you provide a reproducer (ideally without OVN
to reduce the scope)?


-- 
David Marchand



Re: [ovs-dev] [PATCH] netdev-dpdk: Fallback to non tunnel offloading API.

2024-03-28 Thread David Marchand
On Wed, Mar 27, 2024 at 5:51 PM David Marchand
 wrote:
>  /* If packet is vxlan or geneve tunnel packet, calculate outer
>   * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
>   * before. */
> -const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
> -if (tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
> -tunnel_type == RTE_MBUF_F_TX_TUNNEL_VXLAN) {
> +if ((tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
> + tunnel_type == RTE_MBUF_F_TX_TUNNEL_VXLAN) &&
> +mbuf->ol_flags & all_inner_requests) {
> +
>  mbuf->outer_l2_len = (char *) dp_packet_l3(pkt) -
>   (char *) dp_packet_eth(pkt);
>  mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
>   (char *) dp_packet_l3(pkt);
> +} else {
> +if (OVS_UNLIKELY(!(mbuf->ol_flags & all_inner_requests))) {
> +/* If no inner offloading is requesting, fallback to non 
> tunneling
> + * checksum offloads. */

Inner marks must be reset before converting outer marks.

Otherwise, this results (with IPv4 traffic encapsulated in an IPv6 geneve
tunnel) in an ol_flags combination such as:
RTE_MBUF_F_RX_RSS_HASH
RTE_MBUF_F_TX_UDP_CKSUM
RTE_MBUF_F_TX_IP_CKSUM
RTE_MBUF_F_TX_IPV4
RTE_MBUF_F_TX_IPV6

v2 in preparation.
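
For illustration, the intended ordering can be sketched as below. This is
not the v2 patch itself: the TX_* macros are local stand-ins whose bit
positions are assumptions, and fallback_to_non_tunnel() is a hypothetical
helper that works on a plain flags value rather than on an mbuf.

```c
#include <stdint.h>

/* Stand-ins for the RTE_MBUF_F_TX_* flags.  Bit positions are assumptions
 * chosen so that the sketch builds without DPDK headers. */
#define TX_OUTER_UDP_CKSUM (1ULL << 41)
#define TX_TUNNEL_MASK     (0xFULL << 45)
#define TX_TUNNEL_GENEVE   (0x4ULL << 45)
#define TX_TCP_SEG         (1ULL << 50)
#define TX_L4_MASK         (3ULL << 52)
#define TX_UDP_CKSUM       (3ULL << 52)
#define TX_IP_CKSUM        (1ULL << 54)
#define TX_IPV4            (1ULL << 55)
#define TX_IPV6            (1ULL << 56)
#define TX_OUTER_IP_CKSUM  (1ULL << 58)
#define TX_OUTER_IPV4      (1ULL << 59)
#define TX_OUTER_IPV6      (1ULL << 60)

#define ALL_INNER_REQUESTS (TX_IP_CKSUM | TX_L4_MASK | TX_TCP_SEG)
#define ALL_OUTER_REQUESTS (TX_OUTER_IP_CKSUM | TX_OUTER_UDP_CKSUM)
#define ALL_INNER_MARKS    (TX_IPV4 | TX_IPV6)
#define ALL_OUTER_MARKS    (TX_OUTER_IPV4 | TX_OUTER_IPV6 | TX_TUNNEL_MASK)

/* Hypothetical helper: when no inner offload is requested, turn outer
 * requests into plain (non tunnel) requests. */
static uint64_t
fallback_to_non_tunnel(uint64_t flags)
{
    if (flags & ALL_INNER_REQUESTS) {
        return flags;               /* Inner offloads requested: keep as is. */
    }

    /* Reset stale inner marks first, so that a single L3 mark remains. */
    flags &= ~ALL_INNER_MARKS;

    if (flags & TX_OUTER_IP_CKSUM) {
        flags |= TX_IP_CKSUM | TX_IPV4;
    }
    if (flags & TX_OUTER_UDP_CKSUM) {
        flags |= TX_UDP_CKSUM;
        flags |= (flags & TX_OUTER_IPV4) ? TX_IPV4 : TX_IPV6;
    }

    /* The outer requests and marks are now expressed as inner ones. */
    return flags & ~(ALL_OUTER_REQUESTS | ALL_OUTER_MARKS);
}
```

With IPv4 in an IPv6 geneve tunnel, the stale TX_IPV4 mark is dropped
before the conversion, so the result carries TX_IPV6 only.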


>
> -/* If neither inner checksums nor TSO is requested, inner marks
> - * should not be set. */
> -if (!(mbuf->ol_flags & (RTE_MBUF_F_TX_IP_CKSUM |
> -RTE_MBUF_F_TX_L4_MASK  |
> -RTE_MBUF_F_TX_TCP_SEG))) {
> -mbuf->ol_flags &= ~(RTE_MBUF_F_TX_IPV4 |
> -RTE_MBUF_F_TX_IPV6);
> +if (mbuf->ol_flags & RTE_MBUF_F_TX_OUTER_IP_CKSUM) {
> +mbuf->ol_flags |= RTE_MBUF_F_TX_IP_CKSUM;
> +mbuf->ol_flags |= RTE_MBUF_F_TX_IPV4;
> +}
> +if (mbuf->ol_flags & RTE_MBUF_F_TX_OUTER_UDP_CKSUM) {
> +mbuf->ol_flags |= RTE_MBUF_F_TX_UDP_CKSUM;
> +mbuf->ol_flags |= (mbuf->ol_flags & 
> RTE_MBUF_F_TX_OUTER_IPV4) ?
> +  RTE_MBUF_F_TX_IPV4 : RTE_MBUF_F_TX_IPV6;
> +}
> +mbuf->ol_flags &= ~(all_outer_requests | all_outer_marks);


-- 
David Marchand



[ovs-dev] [PATCH] netdev-dpdk: Fallback to non tunnel offloading API.

2024-03-27 Thread David Marchand
The outer checksum offloading API in DPDK is ambiguous and was
added by Intel folks with the assumption that any outer offloading
always goes with an inner offloading request.

With net/i40e and net/ice drivers, requesting outer ip checksum with a
tunnel context but no inner offloading request triggers a Tx failure
associated with a port MDD event.
2024-03-27T16:02:07.084Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR:
MDD event

To avoid this situation, if no checksum or segmentation offloading is
requested on the inner part of a packet, fallback to "normal" (non outer)
offloading request.
And outer offloading can be re-enabled for net/i40e and net/ice.

Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
Signed-off-by: David Marchand 
---
 lib/netdev-dpdk.c | 84 +++
 1 file changed, 42 insertions(+), 42 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2111f77681..939817474c 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1354,18 +1354,6 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
 }
 
-if (!strcmp(info.driver_name, "net_ice")
-|| !strcmp(info.driver_name, "net_i40e")) {
-/* FIXME: Driver advertises the capability but doesn't seem
- * to actually support it correctly.  Can remove this once
- * the driver is fixed on DPDK side. */
-VLOG_INFO("%s: disabled Tx outer udp checksum offloads for a "
-  "net/ice or net/i40e port.", netdev_get_name(&dev->up));
-info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM;
-info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO;
-info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
-}
-
 if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_IPV4_CKSUM) {
 dev->hw_ol_features |= NETDEV_TX_IPV4_CKSUM_OFFLOAD;
 } else {
@@ -2584,20 +2572,20 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
struct rte_mbuf *mbuf)
 struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
 struct tcp_header *th;
 
-const uint64_t all_requests = (RTE_MBUF_F_TX_IP_CKSUM |
-   RTE_MBUF_F_TX_L4_MASK  |
-   RTE_MBUF_F_TX_OUTER_IP_CKSUM  |
-   RTE_MBUF_F_TX_OUTER_UDP_CKSUM |
-   RTE_MBUF_F_TX_TCP_SEG);
-const uint64_t all_marks = (RTE_MBUF_F_TX_IPV4 |
-RTE_MBUF_F_TX_IPV6 |
-RTE_MBUF_F_TX_OUTER_IPV4 |
-RTE_MBUF_F_TX_OUTER_IPV6 |
-RTE_MBUF_F_TX_TUNNEL_MASK);
-
-if (!(mbuf->ol_flags & all_requests)) {
+const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
+ RTE_MBUF_F_TX_L4_MASK |
+ RTE_MBUF_F_TX_TCP_SEG);
+const uint64_t all_outer_requests = (RTE_MBUF_F_TX_OUTER_IP_CKSUM  |
+  RTE_MBUF_F_TX_OUTER_UDP_CKSUM);
+const uint64_t all_inner_marks = (RTE_MBUF_F_TX_IPV4 |
+  RTE_MBUF_F_TX_IPV6);
+const uint64_t all_outer_marks = (RTE_MBUF_F_TX_OUTER_IPV4 |
+  RTE_MBUF_F_TX_OUTER_IPV6 |
+  RTE_MBUF_F_TX_TUNNEL_MASK);
+
+if (!(mbuf->ol_flags & (all_inner_requests | all_outer_requests))) {
 /* No offloads requested, no marks should be set. */
-mbuf->ol_flags &= ~all_marks;
+mbuf->ol_flags &= ~(all_inner_marks | all_outer_marks);
 
 uint64_t unexpected = mbuf->ol_flags & RTE_MBUF_F_TX_OFFLOAD_MASK;
 if (OVS_UNLIKELY(unexpected)) {
@@ -2610,32 +2598,44 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
struct rte_mbuf *mbuf)
 return true;
 }
 
+const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
+if (OVS_UNLIKELY(tunnel_type
+ && tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE
+ && tunnel_type != RTE_MBUF_F_TX_TUNNEL_VXLAN)) {
+VLOG_WARN_RL(&rl, "%s: Unexpected tunnel type: %#"PRIx64,
+ netdev_get_name(&dev->up), tunnel_type);
+netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
+  "Packet with unexpected tunnel type", mbuf);
+return false;
+}
+
 /* If packet is vxlan or geneve tunnel packet, calculate outer
  * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
  * before. */
-const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
-if (tunnel_type == RTE_MBUF_F_TX_TUNNEL_GENEVE ||
-tun

Re: [ovs-dev] [PATCH v2 2/4] netdev-linux: Favour inner packet for multi-encapsulated TSO.

2024-02-15 Thread David Marchand
On Thu, Feb 15, 2024 at 7:03 AM Mike Pattrick  wrote:
> I've made a branch where we properly account for outer and inner
> checksums, and it passes the tests mostly, except for afxdp.
>
> For afxdp we crash in dp_packet_prealloc_headroom(). netdev-afxdp has
> a hardcoded OVS_XDP_HEADROOM=128 bytes and the multiple layers of
> tunneling exceeds that. I ran a test where I set this to 256 and the
> test passes, but that seems like a non-ideal solution. We probably
> shouldn't abort() in dp_packet_resize(), as it could be possible to
> accidentally run into this.

This is exactly the point I wanted to stress with DPDK dp-packets.

The reason behind it was to check this old patch of mine:
https://patchwork.ozlabs.org/project/openvswitch/patch/20220318153339.31083-1-david.march...@redhat.com/

DPDK dp-packets data are supposed to be located at
RTE_PKTMBUF_HEADROOM == 128 bytes, on rx.

But I uncovered recently that we won't hit this headroom limit with
net/af_xdp backing netdev-dpdk ports...
The net/af_xdp driver tries to be smart and avoid copies by using the
unaligned chunk af_xdp feature.
https://git.dpdk.org/dpdk/commit/?id=d8a210774e1d4c295fd93b983538da0d15312edd
A consequence is that this driver places received data with a 384-byte
headroom (RTE_PKTMBUF_HEADROOM + XDP_PACKET_HEADROOM), which then
defeats my unit test...
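
As a rough back-of-the-envelope sketch of why the larger headroom hides the
problem: assuming each VXLAN layer pushes an untagged Ethernet header, an
IPv4 header without options, a UDP header and a VXLAN header, three layers
need more than 128 bytes but far less than 384. The XDP_HEADROOM value is
derived from 384 - 128 and encap_fits() is a hypothetical helper, not OVS
or DPDK code.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    ETH_HDR_LEN = 14,       /* Untagged Ethernet (assumption). */
    IPV4_HDR_LEN = 20,      /* IPv4 without options (assumption). */
    UDP_HDR_LEN = 8,
    VXLAN_HDR_LEN = 8,
    VXLAN_ENCAP_LEN = ETH_HDR_LEN + IPV4_HDR_LEN + UDP_HDR_LEN
                      + VXLAN_HDR_LEN,
    MBUF_HEADROOM = 128,    /* RTE_PKTMBUF_HEADROOM, as stated above. */
    XDP_HEADROOM = 256,     /* XDP_PACKET_HEADROOM, i.e. 384 - 128. */
};

/* Returns true if 'n_tunnels' stacked VXLAN headers fit in 'headroom'. */
static bool
encap_fits(size_t headroom, unsigned int n_tunnels)
{
    return (size_t) n_tunnels * VXLAN_ENCAP_LEN <= headroom;
}
```

Under these assumptions, encap_fits(MBUF_HEADROOM, 3) is false while
encap_fits(MBUF_HEADROOM + XDP_HEADROOM, 3) is true.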

This placement of data looks incorrect to me, from the DPDK mbuf API "spirit".
Applications expect a RTE_PKTMBUF_HEADROOM headroom, and they size
their buffers accordingly.
This extra headroom would mean applications need to account for this
peculiarity when using this driver...

I will need to spend more time on this, but not now.


>
> Dropping the packet is probably preferable IMO, but that is also a
> very large change, as none of the calling functions have return codes
> themselves and some of the 2nd degree callbacks don't either, so many
> functions will need to change.

Or extend dp_packet_resize() for af_xdp dp-packets.
The tricky part is that the dp-packet is part of a umem buffer.
If we make an af_xdp dp-packet point at a different malloc'd data
buffer, we need to distinguish this case when freeing this
dp-packet.
I can put this on my todolist.


>
> You can see the branch here: https://github.com/mkp-rh/ovs/tree/multitun
> And the test run here: https://github.com/mkp-rh/ovs/actions/runs/7911539363
>
> I'll clean up this a bit and address some of the other things
> mentioned, like the incorrect Fixes tag.

We don't need to fix all issues, the main point is the inner checksum
issue, as it is something that got broken in 3.3.
If we strip the 3rd layer of tunnel from my unit test, it would be
enough to reproduce without hitting af_xdp headroom limit.

Or do you think we can extend an existing test?
At least, fixes should be isolated from new features like the one
introduced in patch 1 of this series.


-- 
David Marchand



Re: [ovs-dev] [PATCH v2 2/4] netdev-linux: Favour inner packet for multi-encapsulated TSO.

2024-02-14 Thread David Marchand
Hello Mike,

On Mon, Feb 12, 2024 at 8:50 PM Mike Pattrick  wrote:
>
> Previously if an OVS configuration nested multiple layers of UDP tunnels
> like VXLAN or GENEVE on top of each other through netdev-linux
> interfaces, the vnet header would be incorrectly set to the outermost
> UDP tunnel layer instead of the intermediary tunnel layer.
>
> This resulted in the middle UDP tunnel not checksum offloading properly.
>
> Fixes: 3337e6d91c5b ("userspace: Enable L4 checksum offloading by default.")
> Reported-by: David Marchand 
> Signed-off-by: Mike Pattrick 

I have some trouble relating this patch to the issue I faced :-).
Could you detail a test that shows the issue you fix here?



After applying (only this patch), I still reproduce an issue with
inner checksums.
As I reported this issue to you offlist, let me put the details in public here.

I wrote a system-traffic.at unit test that stacks 3 vxlan tunnels
(separate topic, but for the context, my goal was to stress DPDK
dp-packets wrt headroom).
If I try this unit test before commit 084c8087292c ("userspace:
Support VXLAN and GENEVE TSO."), I have no issue.

The topology is as follows:
##
#
# at_ns0. init_net
#   .
# at_vxlan1 (10.1.1.1/24)   . br0 (10.1.1.100/24)
# (remote 172.31.1.100) . |
#   . at_vxlan0
#   . (remote 172.31.1.1)
#   .
# at_vxlan3 (172.31.1.1/24) . br-underlay0 (172.31.1.100/24)
# (remote 172.31.2.100) . |
#   . at_vxlan2
#   . (remote 172.31.2.1)
#   .
# at_vxlan5 (172.31.2.1/24) . br-underlay1 (172.31.2.100/24)
# (remote 172.31.3.100) . |
#   . at_vxlan4
#   . (remote 172.31.3.1)
#   .
# p0 (172.31.3.1/24). br-underlay2 (172.31.3.100/24)
# | . |
# \-.-ovs-p0
#
##

(gmail will probably bust this copy/paste, so putting a link to the
actual test: 
https://github.com/david-marchand/ovs/commit/manyvxlan~2#diff-45a77f85f9679bc66ac97300392c0d5d9f5c53264fa8a82d735a553246e71faeR400)

With this setup, I try to ping, from at_ns0 netns, the ip address of
the br tap iface plugged with the other side of each tunnel:

- Outermost level, no encapsulation, all good:
16:24:51.590966 a6:0a:bf:e2:f3:f2 > 82:cf:78:de:ed:46, ethertype IPv4
(0x0800), length 98: (tos 0x0, ttl 64, id 63550, offset 0, flags [DF],
proto ICMP (1), length 84)
172.31.3.1 > 172.31.3.100: ICMP echo request, id 26707, seq 1, length 64

16:24:51.591084 82:cf:78:de:ed:46 > a6:0a:bf:e2:f3:f2, ethertype IPv4
(0x0800), length 98: (tos 0x0, ttl 64, id 28720, offset 0, flags
[none], proto ICMP (1), length 84)
172.31.3.100 > 172.31.3.1: ICMP echo reply, id 26707, seq 1, length 64

- One tunnel encap all good:
16:24:54.140629 a6:0a:bf:e2:f3:f2 > 82:cf:78:de:ed:46, ethertype IPv4
(0x0800), length 148: (tos 0x0, ttl 64, id 61052, offset 0, flags
[none], proto UDP (17), length 134)
172.31.3.1.36831 > 172.31.3.100.vxlan: [udp sum ok] VXLAN, flags
[I] (0x08), vni 0
1e:db:ec:e5:28:6d > 9a:39:be:e8:18:4b, ethertype IPv4 (0x0800), length
98: (tos 0x0, ttl 64, id 54399, offset 0, flags [DF], proto ICMP (1),
length 84)
172.31.2.1 > 172.31.2.100: ICMP echo request, id 51488, seq 1, length 64

16:24:54.140772 82:cf:78:de:ed:46 > a6:0a:bf:e2:f3:f2, ethertype IPv4
(0x0800), length 148: (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
proto UDP (17), length 134)
172.31.3.100.39912 > 172.31.3.1.vxlan: [no cksum] VXLAN, flags [I]
(0x08), vni 0
9a:39:be:e8:18:4b > 1e:db:ec:e5:28:6d, ethertype IPv4 (0x0800), length
98: (tos 0x0, ttl 64, id 29701, offset 0, flags [none], proto ICMP
(1), length 84)
172.31.2.100 > 172.31.2.1: ICMP echo reply, id 51488, seq 1, length 64

- Two tunnels encap:
16:24:58.578900 a6:0a:bf:e2:f3:f2 > 82:cf:78:de:ed:46, ethertype IPv4
(0x0800), length 142: (tos 0x0, ttl 64, id 61719, offset 0, flags
[none], proto UDP (17), length 128)
172.31.3.1.50673 > 172.31.3.100.vxlan: [udp sum ok] VXLAN, flags
[I] (0x08), vni 0
1e:db:ec:e5:28:6d > 9a:39:be:e8:18:4b, ethertype IPv4 (0x0800), length
92: (tos 0x0, ttl 64, id 35175, offset 0, flags [none], proto UDP
(17), length 78)
172.31.2.1.44060 > 172.31.2.100.vxlan: [udp sum ok] VXLAN, flags
[I] (0x08), vni 1
62:53:3f:82:da:56 > Broadcast, ethertype ARP (0x0806), length 42:
Ethernet (len 6), IPv4 (len 4), Request who-has 172.31.1.100 tell
172.31.1.1, length 28

16:24:58.579021 82:cf:78:de:ed:46 > a6:0a:bf:e2:f3:f2, ethertype IPv4
(0x0800), length 142: (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
proto UDP (17), length 128)
172.31.3.100.56325 > 172.31.3.1.vxlan: [n

Re: [ovs-dev] [PATCH v4 3/4] dp-packet: Include inner offsets in adjustments and checks.

2024-02-12 Thread David Marchand
On Mon, Feb 12, 2024 at 7:54 AM Mike Pattrick  wrote:
>
> Include inner offsets in functions where l3 and l4 offsets are either
> modified or checked.
>
> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
> Signed-off-by: Mike Pattrick 

Reviewed-by: David Marchand 


-- 
David Marchand



Re: [ovs-dev] [PATCH v4 2/4] bfd: Set proper offsets and flags in BFD packets.

2024-02-12 Thread David Marchand
On Mon, Feb 12, 2024 at 7:53 AM Mike Pattrick  wrote:
>
> Previously the BFD packet creation code did not appropriately set
> offsets or flags. This contributed to issues involving encapsulation and
> the TSO code.
>
> The transition to using standard functions also means some other
> metadata like packet_type are set appropriately.
>
> Fixes: ccc096898c46 ("bfd: Implement Bidirectional Forwarding Detection.")
> Signed-off-by: Mike Pattrick 

Reviewed-by: David Marchand 


-- 
David Marchand



Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Do not create handler threads.

2024-02-06 Thread David Marchand
On Tue, Feb 6, 2024 at 3:47 PM Eelco Chaudron  wrote:
> On 6 Feb 2024, at 15:17, David Marchand wrote:
>
> > On Tue, Feb 6, 2024 at 2:31 PM Eelco Chaudron  wrote:
> >>
> >> Avoid unnecessary thread creation as no upcalls are generated,
> >> resulting in idle threads waiting for process termination.
> >>
> >> This optimization significantly reduces memory usage, cutting it
> >> by half on a 128 CPU/thread system during testing, with the number
> >> of threads reduced from 95 to 0.
> >>
> >> Signed-off-by: Eelco Chaudron 
> >
> > I find it weird that the dpif layer reports an information on how the
> > ofproto-dpif layer behaves.
> > The handler threads are something ofproto-dpif is responsible for.
> > The upcall receiving loop is something the ofproto-dpif owns.
> > Why should the dpif layer tells how many handlers are needed?
> >
> >
> > I would have seen a different change, where the dpif layer exports a
> > capability, like dpif_can_recv() { return !!dpif->dpif_class->recv; }.
> > ofproto-dpif would then deduce there is no handler to start at all.
>
> That was my first idea also, but then I found there is already an API call to 
> the dpif layer where it can tell the user (ofproto in this case) how many 
> threads it needs to function correctly. Here is the API definition:
>
> 369  /* Queries 'dpif' to see if a certain number of handlers are 
> required by
> 370   * the implementation.
> 371   *
> 372   * If a certain number of handlers are required, returns 'true' and 
> sets
> 373   * 'n_handlers' to that number of handler threads.
> 374   *
> 375   * If not, returns 'false'.
> 376   */
> 377  bool (*number_handlers_required)(struct dpif *dpif, uint32_t 
> *n_handlers);
>
> I guess the ‘If a certain number of handlers are required, returns 'true’’ 
> part fits here, as we need 0.

The fact that it exists does not convince me of its validity :-).
I must be missing something.


-- 
David Marchand



Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Do not create handler threads.

2024-02-06 Thread David Marchand
On Tue, Feb 6, 2024 at 2:31 PM Eelco Chaudron  wrote:
>
> Avoid unnecessary thread creation as no upcalls are generated,
> resulting in idle threads waiting for process termination.
>
> This optimization significantly reduces memory usage, cutting it
> by half on a 128 CPU/thread system during testing, with the number
> of threads reduced from 95 to 0.
>
> Signed-off-by: Eelco Chaudron 

I find it weird that the dpif layer reports information on how the
ofproto-dpif layer behaves.
The handler threads are something ofproto-dpif is responsible for.
The upcall receiving loop is something ofproto-dpif owns.
Why should the dpif layer tell how many handlers are needed?


I would have seen a different change, where the dpif layer exports a
capability, like dpif_can_recv() { return !!dpif->dpif_class->recv; }.
ofproto-dpif would then deduce there is no handler to start at all.
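
As a rough sketch of the alternative suggested above, assuming heavily
simplified stand-ins for the OVS types: only the dpif_can_recv() one-liner
comes from this mail, while the struct layouts, udpif_n_handlers() and
dummy_recv() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dpif;

/* Hypothetical, reduced version of the datapath class: a NULL 'recv'
 * means the datapath never hands upcalls to a userspace handler. */
struct dpif_class {
    int (*recv)(struct dpif *dpif, unsigned int handler_id);
};

struct dpif {
    const struct dpif_class *dpif_class;
};

/* Capability query, as proposed in the mail. */
static bool
dpif_can_recv(const struct dpif *dpif)
{
    return !!dpif->dpif_class->recv;
}

/* Hypothetical ofproto-dpif side: it owns the handler threads and decides
 * by itself how many to start, based on the exported capability. */
static unsigned int
udpif_n_handlers(const struct dpif *dpif, unsigned int n_requested)
{
    return dpif_can_recv(dpif) ? n_requested : 0;
}

/* Placeholder receive function, only used to make 'recv' non-NULL. */
static int
dummy_recv(struct dpif *dpif, unsigned int handler_id)
{
    (void) dpif;
    (void) handler_id;
    return 0;
}

static const struct dpif_class class_with_recv = { .recv = dummy_recv };
static const struct dpif_class class_without_recv = { .recv = NULL };
```

With this split, a datapath that does not implement recv yields zero
handler threads without the dpif layer having to know that handler
threads exist.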


-- 
David Marchand


Re: [ovs-dev] [PATCH v2 4/4] ofproto-dpif-monitor: Remove unneeded calls to clear packets.

2024-01-31 Thread David Marchand
On Tue, Jan 30, 2024 at 11:15 PM Mike Pattrick  wrote:
>
> Currently the monitor will call dp_packet_clear() on the dp_packet that
> is shared amongst BFD, LLDP, and CFM. However, all of these packets are
> created with eth_compose(), which already calls dp_packet_clear().
>
> Signed-off-by: Mike Pattrick 

Reviewed-by: David Marchand 


-- 
David Marchand


Re: [ovs-dev] [PATCH v2 3/4] dp-packet: Include inner offsets in adjustments and checks.

2024-01-31 Thread David Marchand
ruct dp_packet, l4_ofs));
>
> +BUILD_ASSERT_DECL(offsetof(struct dp_packet, l4_ofs) +
> +   MEMBER_SIZEOF(struct dp_packet, l4_ofs) ==
> +   offsetof(struct dp_packet, inner_l3_ofs));
> +
> +BUILD_ASSERT_DECL(offsetof(struct dp_packet, inner_l3_ofs) +
> +   MEMBER_SIZEOF(struct dp_packet, inner_l3_ofs) ==
> +   offsetof(struct dp_packet, inner_l4_ofs));
> +
>  /* The below build assert makes sure it's safe to read/write 128-bits starting
>   * at the l2_pad_size location. */
>  BUILD_ASSERT_DECL(sizeof(struct dp_packet) -
> @@ -125,7 +134,7 @@ avx512_dp_packet_resize_l2(struct dp_packet *b, int resize_by_bytes)
>  /* Each lane represents 16 bits in a 128-bit register. In this case the
>   * first three 16-bit values, which will map to the l2_5_ofs, l3_ofs and
>   * l4_ofs fields. */
> -const uint8_t k_lanes = 0b1110;
> +const uint8_t k_lanes = 0b10;
>
>  /* Set all 16-bit words in the 128-bits v_offset register to the value we
>   * need to add/substract from the l2_5_ofs, l3_ofs, and l4_ofs fields. */

Touching this part scares me.
I think some comments are wrong, and otherwise I hope Intel CI will be
enough to check nothing gets broken here :-).
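
For readers without the AVX512 code at hand, here is a scalar sketch of
what the build asserts and the lane mask protect. It assumes a reduced,
hypothetical struct holding only the offset fields named in the patch;
MEMBER_END() and pkt_offsets_adjust() are illustrative names, and skipping
unset (UINT16_MAX) offsets is an assumption about the intended behaviour,
not a copy of the vectorized implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced layout: only the offsets named in the patch. */
struct pkt_offsets {
    uint16_t l2_5_ofs;
    uint16_t l3_ofs;
    uint16_t l4_ofs;
    uint16_t inner_l3_ofs;
    uint16_t inner_l4_ofs;
};

#define MEMBER_END(TYPE, MEMBER) \
    (offsetof(TYPE, MEMBER) + sizeof(((TYPE *) 0)->MEMBER))

/* Same idea as the BUILD_ASSERT_DECL() lines: a single wide load/store
 * only covers all offsets if the fields are contiguous. */
static_assert(MEMBER_END(struct pkt_offsets, l4_ofs)
              == offsetof(struct pkt_offsets, inner_l3_ofs),
              "inner_l3_ofs must directly follow l4_ofs");
static_assert(MEMBER_END(struct pkt_offsets, inner_l3_ofs)
              == offsetof(struct pkt_offsets, inner_l4_ofs),
              "inner_l4_ofs must directly follow inner_l3_ofs");

/* Scalar equivalent of the masked vector add: every offset that is set
 * moves by 'delta', unset offsets stay untouched. */
static void
pkt_offsets_adjust(struct pkt_offsets *o, int delta)
{
    uint16_t *fields[] = {
        &o->l2_5_ofs, &o->l3_ofs, &o->l4_ofs,
        &o->inner_l3_ofs, &o->inner_l4_ofs,
    };

    for (size_t i = 0; i < sizeof fields / sizeof fields[0]; i++) {
        if (*fields[i] != UINT16_MAX) {
            *fields[i] = (uint16_t) (*fields[i] + delta);
        }
    }
}
```

In the vectorized version each of these fields corresponds to one bit of
the lane mask, which is why the mask and its comment have to change
together when fields are added.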



-- 
David Marchand


Re: [ovs-dev] [PATCH v2 2/4] bfd: Set proper offsets and flags in BFD packets.

2024-01-31 Thread David Marchand
On Tue, Jan 30, 2024 at 11:15 PM Mike Pattrick  wrote:
>
> Previously the BFD packet creation code did not appropriately set
> offsets or flags. This contributed to issues involving encapsulation and
> the TSO code.

I noted that apart from fixing the offsets / flags used for checksum
offloading, this patch also fixes the packet_type used by other
dp_packet helpers.
I see nothing fixed on that latter topic though.


>
> Fixes: ccc096898c46 ("bfd: Implement Bidirectional Forwarding Detection.")
> Signed-off-by: Mike Pattrick 

Reviewed-by: David Marchand 


-- 
David Marchand


Re: [ovs-dev] [PATCH v2 1/4] dp-packet: Validate correct offset for L4 inner size.

2024-01-31 Thread David Marchand
On Tue, Jan 30, 2024 at 11:15 PM Mike Pattrick  wrote:
>
> This patch fixes the correctness of dp_packet_inner_l4_size() when
> checking for the existence of an inner L4 header. Previously it checked
> for the outer L4 header.
>
> This function is currently only used when a packet is already flagged
> for tunneling, so an incorrect determination isn't possible as long as
> the flags of the packet are correct.
>
> Fixes: 85bcbbed839a ("userspace: Enable tunnel tests with TSO.")
> Signed-off-by: Mike Pattrick 

Reviewed-by: David Marchand 
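
The bug described in the quoted commit message can be pictured with a
minimal sketch. It assumes a hypothetical, simplified packet struct where
an unset offset is UINT16_MAX; the struct and both function names are
illustrative, and details of the real helper (such as padding handling)
are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct dp_packet. */
struct pkt {
    size_t size;            /* Total packet length in bytes. */
    uint16_t l4_ofs;        /* Outer L4 offset, UINT16_MAX if unset. */
    uint16_t inner_l4_ofs;  /* Inner L4 offset, UINT16_MAX if unset. */
};

/* Before the fix: existence is checked on the outer offset, but the
 * size is computed from the inner one. */
static size_t
pkt_inner_l4_size_buggy(const struct pkt *p)
{
    return p->l4_ofs != UINT16_MAX ? p->size - p->inner_l4_ofs : 0;
}

/* After the fix: the check and the computation use the same offset. */
static size_t
pkt_inner_l4_size(const struct pkt *p)
{
    return p->inner_l4_ofs != UINT16_MAX ? p->size - p->inner_l4_ofs : 0;
}
```

For a non-tunneled packet (outer L4 set, inner L4 unset) the first variant
returns a wrapped-around value instead of 0, which matches the remark that
the problem stays hidden as long as only correctly flagged tunnel packets
reach the function.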


-- 
David Marchand


Re: [ovs-dev] [PATCH] system-common-macros: Fix conntrack matching.

2024-01-19 Thread David Marchand
On Fri, Jan 19, 2024 at 1:49 PM Ilya Maximets  wrote:
>
> On 1/18/24 14:00, David Marchand wrote:
> > Seen in GHA recently.
> > Unit tests are checking conntracks relating to a destination ip address
> > but the FORMAT_CT macro is not strict enough and would match unrelated
> > conntracks too.
> >
> > Example:
> > 148. system-traffic.at:6432: testing conntrack - DNAT with
> >   additional SNAT ...
> > [...]
> > ./system-traffic.at:6460: ovs-appctl dpctl/dump-conntrack |
> >   grep "dst=10.1.1.1" |
> >   sed -e 's/port=[0-9]*/port=/g'
> >   -e 's/id=[0-9]*/id=/g'
> >   -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq
> > [...]
> > @@ -1,2 +1,7 @@
> >  tcp,orig=(src=10.1.1.1,dst=172.1.1.2,sport=,...
> > +tcp,...,reply=(src=13.107.42.16,dst=10.1.1.10,sport=,...
> > +tcp,...,reply=(src=168.63.129.16,dst=10.1.1.10,sport=,...
> > +tcp,...,reply=(src=20.242.161.191,dst=10.1.1.10,sport=,...
> > +tcp,orig=(src=13.107.42.16,dst=10.1.1.10,sport=,...
> > +tcp,orig=(src=20.242.161.191,dst=10.1.1.10,sport=,...
> >
> > Fixes: 07659514c3c1 ("Add support for connection tracking.")
> > Signed-off-by: David Marchand 
> > ---
> >  tests/system-common-macros.at | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
> > index 01ebe364ee..07be29f673 100644
> > --- a/tests/system-common-macros.at
> > +++ b/tests/system-common-macros.at
> > @@ -256,7 +256,7 @@ m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed 's/csum:.*/csum: /'])
> >  # and limit the output to the rows containing 'ip-addr'.
> >  #
> >  m4_define([FORMAT_CT],
> > -[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq]])
> > +[[grep "dst=$1\>" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq]])
> >
> >  # NETNS_DAEMONIZE([namespace], [command], [pidfile])
> >  #
>
> I remembered why the macro is loose.  We wanted to be able
> to match on "subnets" by supplying only part of the address.
>
> There was at least one test that used this functionality.
> Eelco removed it though here:
>   
> https://github.com/openvswitch/ovs/commit/a80883f7682158c7a6955360ee852e8279f748e9
>
> Did you check if we have any more instances of such tests?

I did not.

> They can be tricky to find, as we can supply 10.1.1.2 in order
> to match 10.1.1.240, for example.

Ok, you can discard my patch.
Thanks.


-- 
David Marchand


Re: [ovs-dev] [PATCH] system-common-macros: Fix conntrack matching.

2024-01-19 Thread David Marchand
On Fri, Jan 19, 2024 at 1:20 PM Simon Horman  wrote:
>
> On Thu, Jan 18, 2024 at 02:00:18PM +0100, David Marchand wrote:
> > Seen in GHA recently.
> > Unit tests are checking conntracks relating to a destination ip address
> > but the FORMAT_CT macro is not strict enough and would match unrelated
> > conntracks too.
> >
> > Example:
> > 148. system-traffic.at:6432: testing conntrack - DNAT with
> >   additional SNAT ...
> > [...]
> > ./system-traffic.at:6460: ovs-appctl dpctl/dump-conntrack |
> >   grep "dst=10.1.1.1" |
> >   sed -e 's/port=[0-9]*/port=/g'
> >   -e 's/id=[0-9]*/id=/g'
> >   -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq
> > [...]
> > @@ -1,2 +1,7 @@
> >  tcp,orig=(src=10.1.1.1,dst=172.1.1.2,sport=,...
> > +tcp,...,reply=(src=13.107.42.16,dst=10.1.1.10,sport=,...
> > +tcp,...,reply=(src=168.63.129.16,dst=10.1.1.10,sport=,...
> > +tcp,...,reply=(src=20.242.161.191,dst=10.1.1.10,sport=,...
> > +tcp,orig=(src=13.107.42.16,dst=10.1.1.10,sport=,...
> > +tcp,orig=(src=20.242.161.191,dst=10.1.1.10,sport=,...
> >
> > Fixes: 07659514c3c1 ("Add support for connection tracking.")
> > Signed-off-by: David Marchand 
> > ---
> >  tests/system-common-macros.at | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
> > index 01ebe364ee..07be29f673 100644
> > --- a/tests/system-common-macros.at
> > +++ b/tests/system-common-macros.at
> > @@ -256,7 +256,7 @@ m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed 's/csum:.*/csum: /'])
> >  # and limit the output to the rows containing 'ip-addr'.
> >  #
> >  m4_define([FORMAT_CT],
> > -[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq]])
> > +[[grep "dst=$1\>" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq]])
> >
> >  # NETNS_DAEMONIZE([namespace], [command], [pidfile])
> >  #
>
> Sorry, I feel I must be missing something very obvious, but
> I'm unsure why the match is on "dst=$1\>". I would have thought
> the match would be "dst=$1," instead.

\> matches the end of a word.
Using , as a delimiter works too in this case.
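
To make the difference between the three patterns concrete, here is a
small sketch that mimics them without a regex library. match_dst() and
the delim enum are hypothetical names; the word-end rule (next character
is neither alphanumeric nor '_') is an approximation of how GNU grep
treats \>, and it assumes the address itself ends in a word character.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

enum delim {
    DELIM_NONE,      /* Like grep "dst=$1". */
    DELIM_WORD_END,  /* Like grep "dst=$1\>". */
    DELIM_COMMA,     /* Like grep "dst=$1,". */
};

/* Returns true if 'line' contains "dst=<ip>" followed by the requested
 * kind of delimiter. */
static bool
match_dst(const char *line, const char *ip, enum delim delim)
{
    size_t ip_len = strlen(ip);
    const char *p = line;

    while ((p = strstr(p, "dst=")) != NULL) {
        p += strlen("dst=");
        if (strncmp(p, ip, ip_len)) {
            continue;
        }

        unsigned char next = (unsigned char) p[ip_len];

        switch (delim) {
        case DELIM_NONE:
            return true;
        case DELIM_WORD_END:
            if (!isalnum(next) && next != '_') {
                return true;
            }
            break;
        case DELIM_COMMA:
            if (next == ',') {
                return true;
            }
            break;
        }
    }
    return false;
}
```

With "10.1.1.1" as the address, a line containing dst=10.1.1.10 matches
only in the DELIM_NONE mode. A partial address that ends right before a
dot, such as "10.1.1", still matches in the DELIM_WORD_END mode but not in
the DELIM_COMMA mode, so the two stricter variants are not fully
equivalent for "subnet" style arguments.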


-- 
David Marchand


[ovs-dev] [PATCH v3] netdev-dpdk: Trigger port reconfiguration in main thread for resets.

2024-01-18 Thread David Marchand
When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
mutex lock.
As part of this configure operation, the net/iavf driver (used with i40e
VF devices) triggers a queue count change. The PF entity (serviced by a
kernel PF driver for example) handles this change and requests back that
the VF driver resets the VF device. The driver then completes the VF reset
operation on its side and waits for completion of the iavf-event thread
responsible for handling various VF device events.

On the other hand, handling of the VF reset request in this iavf-event
thread results in notifying the application with a port reset request
(RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take a hold
of the same netdev_dpdk mutex and blocks the iavf-event thread.

As a result, the net/iavf driver (still running on OVS main thread) is
unable to complete as it is waiting for iavf-event to complete.

To break from this situation, the OVS reset callback now won't take a
netdev_dpdk mutex. Instead, the port reset request is stored in a simple
RTE_MAX_ETHPORTS array associated with a seq object.
This is enough to let the VF driver complete this port initialisation.
The OVS main thread later handles the port reset request.

More details in the DPDK upstream bz as this issue appeared following a
change in DPDK.

Link: https://bugs.dpdk.org/show_bug.cgi?id=1337
Signed-off-by: David Marchand 
---
Changes since v2:
- fixed build with clang,
- fixed indentation,
- updated NEWS,

Changes since v1:
- converted to atomic accesses on netdev_dpdk_pending_reset[],

---
 NEWS  |  7 -
 lib/netdev-dpdk.c | 76 +--
 2 files changed, 61 insertions(+), 22 deletions(-)

diff --git a/NEWS b/NEWS
index 2153b48053..a6617546c6 100644
--- a/NEWS
+++ b/NEWS
@@ -54,13 +54,6 @@ v3.3.0 - xx xxx 
- Support for multicast snooping to show the protocol responsible for
  adding/updating the entry.
 
-Known issues:
-   - DPDK: v23.11 has a change in behavior in handling i40e VF devices. This
- may block and prevent OVS from adding such devices as ports in a netdev
- datapath bridge.
- For the details, see https://bugs.dpdk.org/show_bug.cgi?id=1337 which
- describes the issue first detected in the 21.11 LTS branch.
-
 
 v3.2.0 - 17 Aug 2023
 
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index fb26825ff8..45f61930d4 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -58,6 +58,7 @@
 #include "openvswitch/match.h"
 #include "openvswitch/ofp-parse.h"
 #include "openvswitch/ofp-print.h"
+#include "openvswitch/poll-loop.h"
 #include "openvswitch/shash.h"
 #include "openvswitch/vlog.h"
 #include "ovs-numa.h"
@@ -2101,32 +2102,73 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
 return new_port_id;
 }
 
+static struct seq *netdev_dpdk_reset_seq;
+static uint64_t netdev_dpdk_last_reset_seq;
+static atomic_bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS];
+
+static void
+netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED)
+{
+uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq);
+
+if (netdev_dpdk_last_reset_seq == last_reset_seq) {
+seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq);
+} else {
+poll_immediate_wake();
+}
+}
+
+static void
+netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED)
+{
+uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq);
+
+if (reset_seq != netdev_dpdk_last_reset_seq) {
+dpdk_port_t port_id;
+
+netdev_dpdk_last_reset_seq = reset_seq;
+
+for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+struct netdev_dpdk *dev;
+bool pending_reset;
+
+atomic_read_relaxed(&netdev_dpdk_pending_reset[port_id],
+&pending_reset);
+if (!pending_reset) {
+continue;
+}
+atomic_store_relaxed(&netdev_dpdk_pending_reset[port_id], false);
+
+ovs_mutex_lock(&dpdk_mutex);
+dev = netdev_dpdk_lookup_by_port_id(port_id);
+if (dev) {
+ovs_mutex_lock(&dev->mutex);
+dev->reset_needed = true;
+netdev_request_reconfigure(&dev->up);
+VLOG_DBG_RL(&rl, "%s: Device reset requested.",
+netdev_get_name(&dev->up));
+ovs_mutex_unlock(&dev->mutex);
+}
+ovs_mutex_unlock(&dpdk_mutex);
+}
+}
+}
+
 static int
 dpdk_eth_event_callback(dpdk_port_t port_id, enum rte_eth_event_type type,
 void *param OVS_UNUSED, void *ret_param OVS_UNUSED)
 {
-struct netdev_dpdk *dev;
-
 switch ((int) type) {
 case RTE_ETH_EVENT_INTR_RESET:
-ovs_mutex_lock(&dpdk_mutex);
-dev = netdev_dpdk_lookup_by_port_id(port_id);
-if (dev) {
-ovs_mute

Re: [ovs-dev] [PATCH v2] netdev-dpdk: Trigger port reconfiguration in main thread for resets.

2024-01-18 Thread David Marchand
On Thu, Jan 18, 2024 at 4:43 PM Ilya Maximets  wrote:
>
> On 1/18/24 15:41, David Marchand wrote:
> > When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
> > mutex lock.
> > As part of this configure operation, the net/iavf driver (used with i40e
> > VF devices) triggers a queue count change. The PF entity (serviced by a
> > kernel PF driver for example) handles this change and requests back that
> > the VF driver resets the VF device. The driver then completes the VF reset
> > operation on its side and waits for completion of the iavf-event thread
> > responsible for handling various VF device events.
> >
> > On the other hand, handling of the VF reset request in this iavf-event
> > thread results in notifying the application with a port reset request
> > (RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take a hold
> > of the same netdev_dpdk mutex and blocks the iavf-event thread.
> >
> > As a result, the net/iavf driver (still running on OVS main thread) is
> > unable to complete as it is waiting for iavf-event to complete.
> >
> > To break from this situation, the OVS reset callback now won't take a
> > netdev_dpdk mutex. Instead, the port reset request is stored in a simple
> > RTE_MAX_ETHPORTS array associated with a seq object.
> > This is enough to let the VF driver complete this port initialisation.
> > The OVS main thread later handles the port reset request.
> >
> > More details in the DPDK upstream bz as this issue appeared following a
> > change in DPDK.
> >
> > Link: https://bugs.dpdk.org/show_bug.cgi?id=1337
> > Signed-off-by: David Marchand 
> > ---
> > Changes since v1:
> > - converted to atomic accesses on netdev_dpdk_pending_reset[],
> >
> >
> > ---
> >  lib/netdev-dpdk.c | 76 +--
> >  1 file changed, 61 insertions(+), 15 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index fb26825ff8..6b15e4c03a 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -58,6 +58,7 @@
> >  #include "openvswitch/match.h"
> >  #include "openvswitch/ofp-parse.h"
> >  #include "openvswitch/ofp-print.h"
> > +#include "openvswitch/poll-loop.h"
> >  #include "openvswitch/shash.h"
> >  #include "openvswitch/vlog.h"
> >  #include "ovs-numa.h"
> > @@ -2101,32 +2102,73 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
> >  return new_port_id;
> >  }
> >
> > +static struct seq *netdev_dpdk_reset_seq;
> > +static uint64_t netdev_dpdk_last_reset_seq;
> > +static bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS];
>
> Should be an atomic_bool, I suppose.

Yes, already fixed.


>
> > +
> > +static void
> > +netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED)
> > +{
> > +uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq);
> > +
> > +if (netdev_dpdk_last_reset_seq == last_reset_seq) {
> > +seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq);
> > +} else {
> > +poll_immediate_wake();
> > +}
> > +}
> > +
> > +static void
> > +netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED)
> > +{
> > +uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq);
> > +
> > +if (reset_seq != netdev_dpdk_last_reset_seq) {
> > +dpdk_port_t port_id;
> > +
> > +netdev_dpdk_last_reset_seq = reset_seq;
> > +
> > +for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
> > +struct netdev_dpdk *dev;
> > +bool pending_reset;
> > +
> > +atomic_read_relaxed(&netdev_dpdk_pending_reset[port_id],
> > + &pending_reset);
>
> Indentation.

Indeed.


>
> > +if (!pending_reset) {
> > +continue;
> > +}
> > +atomic_store_relaxed(&netdev_dpdk_pending_reset[port_id], false);
> > +
> > +ovs_mutex_lock(&dpdk_mutex);
> > +dev = netdev_dpdk_lookup_by_port_id(port_id);
> > +if (dev) {
> > +ovs_mutex_lock(&dev->mutex);
> > +dev->reset_needed = true;
> > +netdev_request_reconfigure(&dev->up);
> > +VLOG_DBG_RL(&rl, "%s: Device reset requested.",
> > +netdev_get_name(&dev->up));
> > +ovs_mutex_unloc

Re: [ovs-dev] [PATCH v2] netdev-dpdk: Trigger port reconfiguration in main thread for resets.

2024-01-18 Thread David Marchand
On Thu, Jan 18, 2024 at 3:41 PM David Marchand
 wrote:
> Changes since v1:
> - converted to atomic accesses on netdev_dpdk_pending_reset[],

I did not wait for the clang results during my tests... cooking v3.


-- 
David Marchand


[ovs-dev] [PATCH v2] netdev-dpdk: Trigger port reconfiguration in main thread for resets.

2024-01-18 Thread David Marchand
When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
mutex lock.
As part of this configure operation, the net/iavf driver (used with i40e
VF devices) triggers a queue count change. The PF entity (serviced by a
kernel PF driver for example) handles this change and requests back that
the VF driver resets the VF device. The driver then completes the VF reset
operation on its side and waits for completion of the iavf-event thread
responsible for handling various VF device events.

On the other hand, handling of the VF reset request in this iavf-event
thread results in notifying the application with a port reset request
(RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take a hold
of the same netdev_dpdk mutex and blocks the iavf-event thread.

As a result, the net/iavf driver (still running on OVS main thread) is
unable to complete as it is waiting for iavf-event to complete.

To break from this situation, the OVS reset callback now won't take a
netdev_dpdk mutex. Instead, the port reset request is stored in a simple
RTE_MAX_ETHPORTS array associated with a seq object.
This is enough to let the VF driver complete this port initialisation.
The OVS main thread later handles the port reset request.

More details in the DPDK upstream bz as this issue appeared following a
change in DPDK.

Link: https://bugs.dpdk.org/show_bug.cgi?id=1337
Signed-off-by: David Marchand 
---
Changes since v1:
- converted to atomic accesses on netdev_dpdk_pending_reset[],


---
 lib/netdev-dpdk.c | 76 +--
 1 file changed, 61 insertions(+), 15 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index fb26825ff8..6b15e4c03a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -58,6 +58,7 @@
 #include "openvswitch/match.h"
 #include "openvswitch/ofp-parse.h"
 #include "openvswitch/ofp-print.h"
+#include "openvswitch/poll-loop.h"
 #include "openvswitch/shash.h"
 #include "openvswitch/vlog.h"
 #include "ovs-numa.h"
@@ -2101,32 +2102,73 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
 return new_port_id;
 }
 
+static struct seq *netdev_dpdk_reset_seq;
+static uint64_t netdev_dpdk_last_reset_seq;
+static bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS];
+
+static void
+netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED)
+{
+uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq);
+
+if (netdev_dpdk_last_reset_seq == last_reset_seq) {
+seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq);
+} else {
+poll_immediate_wake();
+}
+}
+
+static void
+netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED)
+{
+uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq);
+
+if (reset_seq != netdev_dpdk_last_reset_seq) {
+dpdk_port_t port_id;
+
+netdev_dpdk_last_reset_seq = reset_seq;
+
+for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+struct netdev_dpdk *dev;
+bool pending_reset;
+
+atomic_read_relaxed(&netdev_dpdk_pending_reset[port_id],
+ &pending_reset);
+if (!pending_reset) {
+continue;
+}
+atomic_store_relaxed(&netdev_dpdk_pending_reset[port_id], false);
+
+ovs_mutex_lock(&dpdk_mutex);
+dev = netdev_dpdk_lookup_by_port_id(port_id);
+if (dev) {
+ovs_mutex_lock(&dev->mutex);
+dev->reset_needed = true;
+netdev_request_reconfigure(&dev->up);
+VLOG_DBG_RL(&rl, "%s: Device reset requested.",
+netdev_get_name(&dev->up));
+ovs_mutex_unlock(&dev->mutex);
+}
+ovs_mutex_unlock(&dpdk_mutex);
+}
+}
+}
+
 static int
 dpdk_eth_event_callback(dpdk_port_t port_id, enum rte_eth_event_type type,
 void *param OVS_UNUSED, void *ret_param OVS_UNUSED)
 {
-struct netdev_dpdk *dev;
-
 switch ((int) type) {
 case RTE_ETH_EVENT_INTR_RESET:
-ovs_mutex_lock(&dpdk_mutex);
-dev = netdev_dpdk_lookup_by_port_id(port_id);
-if (dev) {
-ovs_mutex_lock(&dev->mutex);
-dev->reset_needed = true;
-netdev_request_reconfigure(&dev->up);
-VLOG_DBG_RL(&rl, "%s: Device reset requested.",
-netdev_get_name(&dev->up));
-ovs_mutex_unlock(&dev->mutex);
-}
-ovs_mutex_unlock(&dpdk_mutex);
+atomic_store_relaxed(&netdev_dpdk_pending_reset[port_id], true);
+seq_change(netdev_dpdk_reset_seq);
 break;
 
 default:
 /* Ignore all other types. */
 break;
-   }
-   return 0;
+}
+return 0;
 }
 
 static void
@@ -5001,6 +5043,8 @@ netdev_dpdk_class_init(void)
  "[netdev]", 0, 1,
  netdev_dpdk_get_mempo

Re: [ovs-dev] [PATCH] netdev-dpdk: Trigger port reconfiguration in main thread for resets.

2024-01-18 Thread David Marchand
On Thu, Jan 18, 2024 at 2:23 PM Ilya Maximets  wrote:
>
> On 1/18/24 14:16, David Marchand wrote:
> > When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
> > mutex lock.
> > As part of this configure operation, the net/iavf driver (used with i40e
> > VF devices) triggers a queue count change. The PF entity (serviced by a
> > kernel PF driver for example) handles this change and requests back that
> > the VF driver resets the VF device. The driver then completes the VF reset
> > operation on its side and waits for completion of the iavf-event thread
> > responsible for handling various VF device events.
> >
> > On the other hand, handling of the VF reset request in this iavf-event
> > thread results in notifying the application with a port reset request
> > (RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take a hold
> > of the same netdev_dpdk mutex and blocks the iavf-event thread.
> >
> > As a result, the net/iavf driver (still running on OVS main thread) is
> > unable to complete as it is waiting for iavf-event to complete.
> >
> > To break from this situation, the OVS reset callback now won't take a
> > netdev_dpdk mutex. Instead, the port reset request is stored in a simple
> > RTE_MAX_ETHPORTS array associated with a seq object.
> > This is enough to let the VF driver complete this port initialisation.
> > The OVS main thread later handles the port reset request.
> >
> > Link: https://bugs.dpdk.org/show_bug.cgi?id=1337
> > Signed-off-by: David Marchand 
> > ---
> >  lib/netdev-dpdk.c | 73 +--
> >  1 file changed, 58 insertions(+), 15 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index fb26825ff8..528850971a 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -58,6 +58,7 @@
> >  #include "openvswitch/match.h"
> >  #include "openvswitch/ofp-parse.h"
> >  #include "openvswitch/ofp-print.h"
> > +#include "openvswitch/poll-loop.h"
> >  #include "openvswitch/shash.h"
> >  #include "openvswitch/vlog.h"
> >  #include "ovs-numa.h"
> > @@ -2101,32 +2102,70 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
> >  return new_port_id;
> >  }
> >
> > +static struct seq *netdev_dpdk_reset_seq;
> > +static uint64_t netdev_dpdk_last_reset_seq;
> > +static bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS];
> > +
> > +static void
> > +netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED)
> > +{
> > +uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq);
> > +
> > +if (netdev_dpdk_last_reset_seq == last_reset_seq) {
> > +seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq);
> > +} else {
> > +poll_immediate_wake();
> > +}
> > +}
> > +
> > +static void
> > +netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED)
> > +{
> > +uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq);
> > +
> > +if (reset_seq != netdev_dpdk_last_reset_seq) {
> > +dpdk_port_t port_id;
> > +
> > +netdev_dpdk_last_reset_seq = reset_seq;
> > +
> > +for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
> > +struct netdev_dpdk *dev;
> > +
> > +if (!netdev_dpdk_pending_reset[port_id]) {
> > +continue;
> > +}
> > +netdev_dpdk_pending_reset[port_id] = false;
> > +
> > +ovs_mutex_lock(&dpdk_mutex);
> > +dev = netdev_dpdk_lookup_by_port_id(port_id);
> > +if (dev) {
> > +ovs_mutex_lock(&dev->mutex);
> > +dev->reset_needed = true;
> > +netdev_request_reconfigure(&dev->up);
> > +VLOG_DBG_RL(&rl, "%s: Device reset requested.",
> > +netdev_get_name(&dev->up));
> > +ovs_mutex_unlock(&dev->mutex);
> > +}
> > +ovs_mutex_unlock(&dpdk_mutex);
> > +}
> > +}
> > +}
> > +
> >  static int
> >  dpdk_eth_event_callback(dpdk_port_t port_id, enum rte_eth_event_type type,
> >  void *param OVS_UNUSED, void *ret_param OVS_UNUSED)
>
> Can we rely on this callback to always be called from a main thread?
> Otherwise, we should use atomics for an array or a separate lock.

I assumed it was not necessary after reading seq.h.

 * Thread-safety
 * =
 *
 * Fully thread safe.  seq_change() synchronizes with seq_read() and
 * seq_wait() on the same variable in release-acquire fashion.  That
 * is, all effects of the memory accesses performed by a thread prior
 * to seq_change() are visible to the threads returning from
 * seq_read() or seq_wait() observing that change.
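
The deferral pattern under discussion can be modelled in isolation. The
sketch below is an illustration under stated assumptions, not the OVS
code: a C11 atomic counter stands in for struct seq (incremented with
release ordering, read with acquire ordering), MAX_PORTS stands in for
RTE_MAX_ETHPORTS, and on_reset_event(), run_pending_resets() and
n_handled[] are hypothetical names. It also uses an atomic exchange to
test-and-clear the flag, which is a swapped-in simplification compared to
the separate read and store in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PORTS 8  /* Stand-in for RTE_MAX_ETHPORTS. */

static atomic_bool pending_reset[MAX_PORTS];
static atomic_uint_fast64_t reset_seq;   /* Stand-in for struct seq. */
static uint64_t last_reset_seq;          /* Only touched by the consumer. */
static unsigned int n_handled[MAX_PORTS];

/* Producer side, meant for the driver event thread: it records the
 * request and bumps the sequence number, without taking any lock. */
static void
on_reset_event(unsigned int port_id)
{
    atomic_store_explicit(&pending_reset[port_id], true,
                          memory_order_relaxed);
    atomic_fetch_add_explicit(&reset_seq, 1, memory_order_release);
}

/* Consumer side, meant for the main thread: returns the number of ports
 * for which a reset was picked up in this iteration. */
static unsigned int
run_pending_resets(void)
{
    uint64_t seq = atomic_load_explicit(&reset_seq, memory_order_acquire);
    unsigned int n = 0;

    if (seq == last_reset_seq) {
        return 0;
    }
    last_reset_seq = seq;

    for (unsigned int port_id = 0; port_id < MAX_PORTS; port_id++) {
        if (!atomic_exchange_explicit(&pending_reset[port_id], false,
                                      memory_order_relaxed)) {
            continue;
        }
        /* The real handler would take its locks and request a port
         * reconfiguration here, safely outside of the driver callback. */
        n_handled[port_id]++;
        n++;
    }
    return n;
}
```

The release/acquire pair on the counter is what makes a flag written
before the increment visible to a reader that observes the new counter
value. It says nothing about a producer setting a flag while the consumer
is clearing it, which is one way to read the request for atomic accesses
on the array itself.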


-- 
David Marchand


[ovs-dev] [PATCH] netdev-dpdk: Trigger port reconfiguration in main thread for resets.

2024-01-18 Thread David Marchand
When OVS (main thread) configures a DPDK netdev, it holds a netdev_dpdk
mutex lock.
As part of this configure operation, the net/iavf driver (used with i40e
VF devices) triggers a queue count change. The PF entity (serviced by a
kernel PF driver for example) handles this change and requests back that
the VF driver resets the VF device. The driver then completes the VF reset
operation on its side and waits for completion of the iavf-event thread
responsible for handling various VF device events.

On the other hand, handling of the VF reset request in this iavf-event
thread results in notifying the application with a port reset request
(RTE_ETH_EVENT_INTR_RESET). The OVS reset callback tries to take a hold
of the same netdev_dpdk mutex and blocks the iavf-event thread.

As a result, the net/iavf driver (still running on OVS main thread) is
unable to complete as it is waiting for iavf-event to complete.

To break from this situation, the OVS reset callback now won't take a
netdev_dpdk mutex. Instead, the port reset request is stored in a simple
RTE_MAX_ETHPORTS array associated with a seq object.
This is enough to let the VF driver complete this port initialisation.
The OVS main thread later handles the port reset request.

Link: https://bugs.dpdk.org/show_bug.cgi?id=1337
Signed-off-by: David Marchand 
---
 lib/netdev-dpdk.c | 73 +--
 1 file changed, 58 insertions(+), 15 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index fb26825ff8..528850971a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -58,6 +58,7 @@
 #include "openvswitch/match.h"
 #include "openvswitch/ofp-parse.h"
 #include "openvswitch/ofp-print.h"
+#include "openvswitch/poll-loop.h"
 #include "openvswitch/shash.h"
 #include "openvswitch/vlog.h"
 #include "ovs-numa.h"
@@ -2101,32 +2102,70 @@ netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
 return new_port_id;
 }
 
+static struct seq *netdev_dpdk_reset_seq;
+static uint64_t netdev_dpdk_last_reset_seq;
+static bool netdev_dpdk_pending_reset[RTE_MAX_ETHPORTS];
+
+static void
+netdev_dpdk_wait(const struct netdev_class *netdev_class OVS_UNUSED)
+{
+uint64_t last_reset_seq = seq_read(netdev_dpdk_reset_seq);
+
+if (netdev_dpdk_last_reset_seq == last_reset_seq) {
+seq_wait(netdev_dpdk_reset_seq, netdev_dpdk_last_reset_seq);
+} else {
+poll_immediate_wake();
+}
+}
+
+static void
+netdev_dpdk_run(const struct netdev_class *netdev_class OVS_UNUSED)
+{
+uint64_t reset_seq = seq_read(netdev_dpdk_reset_seq);
+
+if (reset_seq != netdev_dpdk_last_reset_seq) {
+dpdk_port_t port_id;
+
+netdev_dpdk_last_reset_seq = reset_seq;
+
+for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
+struct netdev_dpdk *dev;
+
+if (!netdev_dpdk_pending_reset[port_id]) {
+continue;
+}
+netdev_dpdk_pending_reset[port_id] = false;
+
+ovs_mutex_lock(&dpdk_mutex);
+dev = netdev_dpdk_lookup_by_port_id(port_id);
+if (dev) {
+ovs_mutex_lock(&dev->mutex);
+dev->reset_needed = true;
+netdev_request_reconfigure(&dev->up);
+VLOG_DBG_RL(&rl, "%s: Device reset requested.",
+netdev_get_name(&dev->up));
+ovs_mutex_unlock(&dev->mutex);
+}
+ovs_mutex_unlock(&dpdk_mutex);
+}
+}
+}
+
 static int
 dpdk_eth_event_callback(dpdk_port_t port_id, enum rte_eth_event_type type,
 void *param OVS_UNUSED, void *ret_param OVS_UNUSED)
 {
-struct netdev_dpdk *dev;
-
 switch ((int) type) {
 case RTE_ETH_EVENT_INTR_RESET:
-ovs_mutex_lock(&dpdk_mutex);
-dev = netdev_dpdk_lookup_by_port_id(port_id);
-if (dev) {
-ovs_mutex_lock(&dev->mutex);
-dev->reset_needed = true;
-netdev_request_reconfigure(&dev->up);
-VLOG_DBG_RL(&rl, "%s: Device reset requested.",
-netdev_get_name(&dev->up));
-ovs_mutex_unlock(&dev->mutex);
-}
-ovs_mutex_unlock(&dpdk_mutex);
+netdev_dpdk_pending_reset[port_id] = true;
+seq_change(netdev_dpdk_reset_seq);
 break;
 
 default:
 /* Ignore all other types. */
 break;
-   }
-   return 0;
+}
+return 0;
 }
 
 static void
@@ -5001,6 +5040,8 @@ netdev_dpdk_class_init(void)
  "[netdev]", 0, 1,
  netdev_dpdk_get_mempool_info, NULL);
 
+netdev_dpdk_reset_seq = seq_create();
+netdev_dpdk_last_reset_seq = seq_read(netdev_dpdk_reset_seq);
 ret = rte_eth_dev_callback_register(RTE_ETH_ALL,
 RTE_ETH_EVENT_INTR_RESET,
 dpdk_eth

[ovs-dev] [PATCH] system-common-macros: Fix conntrack matching.

2024-01-18 Thread David Marchand
Seen in GHA recently.
Unit tests are checking conntracks relating to a destination ip address
but the FORMAT_CT macro is not strict enough and would match unrelated
conntracks too.

Example:
148. system-traffic.at:6432: testing conntrack - DNAT with
additional SNAT ...
[...]
./system-traffic.at:6460: ovs-appctl dpctl/dump-conntrack |
grep "dst=10.1.1.1" |
sed -e 's/port=[0-9]*/port=/g'
-e 's/id=[0-9]*/id=/g'
-e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq
[...]
@@ -1,2 +1,7 @@
 tcp,orig=(src=10.1.1.1,dst=172.1.1.2,sport=,...
+tcp,...,reply=(src=13.107.42.16,dst=10.1.1.10,sport=,...
+tcp,...,reply=(src=168.63.129.16,dst=10.1.1.10,sport=,...
+tcp,...,reply=(src=20.242.161.191,dst=10.1.1.10,sport=,...
+tcp,orig=(src=13.107.42.16,dst=10.1.1.10,sport=,...
+tcp,orig=(src=20.242.161.191,dst=10.1.1.10,sport=,...

Fixes: 07659514c3c1 ("Add support for connection tracking.")
Signed-off-by: David Marchand 
---
 tests/system-common-macros.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 01ebe364ee..07be29f673 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -256,7 +256,7 @@ m4_define([STRIP_MONITOR_CSUM], [grep "csum:" | sed 
's/csum:.*/csum: /'])
 # and limit the output to the rows containing 'ip-addr'.
 #
 m4_define([FORMAT_CT],
-[[grep "dst=$1" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq]])
+[[grep "dst=$1\>" | sed -e 's/port=[0-9]*/port=/g' -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq]])
 
 # NETNS_DAEMONIZE([namespace], [command], [pidfile])
 #
-- 
2.43.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpdk: Update to use v23.11.

2024-01-15 Thread David Marchand
Hello,

On Mon, Jan 15, 2024 at 12:58 PM Ilya Maximets  wrote:
> OK.  I think today we have no real choice but to go with the option 1.
> We'll need a NEWS entry for that in the patch.  I'll make sure to include
> a variant of it in the release announce in February if nothing changes
> until then.
>
> But I think we should still pursue the option 2 in case the solution will
> be found before the final release in February.
>
> Though if there will be no conclusion on the long term problem until autumn,
> we should go with 3 and move 24.11 adoption to summer of 2025.  And follow
> that strategy going forward, as the current approach is not sustainable.
>
> >
> > David, let us know if you agree ? If so, maybe you can send a new
> > version of the patch with the added documentation. I can help with docs
> > or discussing further.
>
> David, could you, please, add a note in the NEWS file and send a new version
> of the patch?

I just sent a v2.
I'll continue investigating the iavf issue.


-- 
David Marchand



[ovs-dev] [PATCH v2] dpdk: Update to use v23.11.

2024-01-15 Thread David Marchand
This commit adds support for DPDK v23.11.
It updates the CI script and documentation and includes the following
changes coming from the dpdk-latest branch:

- sparse: Add some compiler intrinsics for DPDK build.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=371129&state=*

- ci: Cache DPDK installed libraries only.
- ci: Reduce optional libraries in DPDK.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=383367&state=*

- system-dpdk: Ignore net/ice error log about QinQ offloading.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=385259&state=*

There is a known issue with i40e VF devices where OVS main thread may
block when adding such devices as dpif-netdev dpdk ports.

Signed-off-by: David Marchand 
---
 .ci/dpdk-build.sh| 28 +++-
 .ci/linux-build.sh   |  9 
 .github/workflows/build-and-test.yml |  4 ++--
 Documentation/faq/releases.rst   |  2 +-
 Documentation/intro/install/dpdk.rst | 16 +++---
 Documentation/topics/dpdk/phy.rst| 12 +-
 Documentation/topics/dpdk/vdev.rst   |  2 +-
 Documentation/topics/dpdk/vhost-user.rst |  2 +-
 Documentation/topics/testing.rst |  2 +-
 Documentation/topics/userspace-tso.rst   |  2 +-
 NEWS |  9 
 debian/control.in|  2 +-
 include/sparse/automake.mk   |  1 +
 include/sparse/ia32intrin.h  | 23 +++
 rhel/openvswitch-fedora.spec.in  |  2 +-
 tests/system-dpdk-macros.at  |  1 +
 16 files changed, 80 insertions(+), 37 deletions(-)
 create mode 100644 include/sparse/ia32intrin.h

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index d4c178ee0d..23f3166a54 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -5,25 +5,27 @@ set -x
 
 function build_dpdk()
 {
-local VERSION_FILE="dpdk-dir/cached-version"
 local DPDK_VER=$1
 local DPDK_OPTS=""
+local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir"
+local VERSION_FILE="$DPDK_INSTALL_DIR/cached-version"
 
-rm -rf dpdk-dir
+rm -rf dpdk-src
+rm -rf $DPDK_INSTALL_DIR
 
 if [ "${DPDK_VER##refs/*/}" != "${DPDK_VER}" ]; then
-git clone --single-branch $DPDK_GIT dpdk-dir -b "${DPDK_VER##refs/*/}"
-pushd dpdk-dir
+git clone --single-branch $DPDK_GIT dpdk-src -b "${DPDK_VER##refs/*/}"
+pushd dpdk-src
 git log -1 --oneline
 else
 wget https://fast.dpdk.org/rel/dpdk-$1.tar.xz
 tar xvf dpdk-$1.tar.xz > /dev/null
 DIR_NAME=$(tar -tf dpdk-$1.tar.xz | head -1 | cut -f1 -d"/")
-mv ${DIR_NAME} dpdk-dir
-pushd dpdk-dir
+mv ${DIR_NAME} dpdk-src
+pushd dpdk-src
 fi
 
-# Switching to 'default' machine to make dpdk-dir cache usable on
+# Switching to 'default' machine to make the dpdk cache usable on
 # different CPUs. We can't be sure that all CI machines are exactly same.
 DPDK_OPTS="$DPDK_OPTS -Dmachine=default"
 
@@ -40,16 +42,22 @@ function build_dpdk()
 DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd"
 enable_drivers="net/null,net/af_xdp,net/tap,net/virtio,net/pcap"
 DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers"
+# OVS depends on the vhost library (and its dependencies).
+# net/tap depends on the gso library.
+DPDK_OPTS="$DPDK_OPTS -Denable_libs=cryptodev,dmadev,gso,vhost"
 
 # Install DPDK using prefix.
-DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build"
+DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR"
 
 meson $DPDK_OPTS build
 ninja -C build
 ninja -C build install
-
-echo "Installed DPDK in $(pwd)"
 popd
+
+# Remove examples sources.
+rm -rf $DPDK_INSTALL_DIR/share/dpdk/examples
+
+echo "Installed DPDK in $DPDK_INSTALL_DIR"
 echo "${DPDK_VER}" > ${VERSION_FILE}
 }
 
diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 90581c10b7..cf1462a0c4 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -10,8 +10,9 @@ JOBS=${JOBS:-"-j4"}
 
 function install_dpdk()
 {
-local VERSION_FILE="dpdk-dir/cached-version"
-local DPDK_LIB=$(pwd)/dpdk-dir/build/lib/x86_64-linux-gnu
+local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir"
+local VERSION_FILE="${DPDK_INSTALL_DIR}/cached-version"
+local DPDK_LIB=${DPDK_INSTALL_DIR}/lib/x86_64-linux-gnu
 
 if [ "$DPDK_SHARED" ]; then
 EXTRA_OPTS="$EXTRA_OPTS --with-dpdk=shared"
@@ -27,13 +28,13 @@ function install_dpdk()
 export PATH=$(pwd)/dpdk-dir/build/bin:$PATH
 
 if [ ! -f "${VERSION_FILE}" ]; then
-echo "Could not find DPDK in $(pwd)/dpdk-dir"
+echo "Could not find DPDK in $DPDK_INSTA

[ovs-dev] [PATCH v6] system-dpdk: Test with mlx5 devices.

2024-01-10 Thread David Marchand
The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded:
on systems with only mlx5, this test is always skipped.

Besides, the test tries to grab the first device listed by dpdk-devbind.py,
regardless of the PCI device status regarding kmod binding.

Remove dependency on this DPDK script and use a minimal script that
reads PCI sysfs.

This script is not perfect, as one can imagine PCI devices bound to
vfio-pci for virtual machines.
Plus, this script only tries to take over vfio-pci devices. mlx5 devices
can't be taken over blindly as it could mean losing connectivity to the
machine if the netdev was in use for this system.

For those two reasons, add a new environment variable DPDK_PCI_ADDR for
testers to select the PCI device of their liking.
For consistency and grep, the temporary file PCI_ADDR is renamed
to DPDK_PCI_ADDR.
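
The selection logic described above can be sketched as a function that is testable without real hardware. This is only an illustration under stated assumptions: the name `find_dpdk_device` and the `sysfs_root` and `environ` parameters are hypothetical and exist so the logic can be pointed at a fake directory tree. It relies on sysfs reporting the PCI class as a six-digit value such as 0x020000 for an Ethernet controller.

```python
import os
from pathlib import Path

# Kernel modules whose devices may be taken over for the test.
USABLE_KMODS = ("vfio_pci", "igb_uio")


def find_dpdk_device(sysfs_root="/sys/bus/pci/devices", environ=None):
    """Return a PCI address usable for the phy test, or None.

    sysfs_root and environ are hypothetical hooks added for testing.
    """
    environ = os.environ if environ is None else environ

    # The tester's explicit choice always wins and is not validated.
    if "DPDK_PCI_ADDR" in environ:
        return environ["DPDK_PCI_ADDR"]

    root = Path(sysfs_root)
    if not root.is_dir():
        return None

    for device in sorted(root.iterdir()):
        class_path = device / "class"
        # Only consider Ethernet controllers (PCI class 0x020000).
        if not class_path.is_file():
            continue
        if class_path.read_text().strip() != "0x020000":
            continue
        # An unbound device has no driver link, so the name will not match.
        kmod_name = (device / "driver" / "module").resolve().name
        if kmod_name in USABLE_KMODS:
            return device.name
    return None
```

A device bound to another driver (mlx5_core, for instance) is never picked automatically in this sketch, which mirrors the reasoning above: such a device is only used when the tester names it through DPDK_PCI_ADDR.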

Reviewed-by: Maxime Coquelin 
Acked-by: Eelco Chaudron 
Signed-off-by: David Marchand 
---
Changes since v5:
- rebased,
- moved the script to the python scripts list in automake.mk,
- bumped copyright date,

Changes since v4:
- separated from the original series,
- rebased,
- dropped mlx5 devices from the discovery script,
- documented DPDK_PCI_ADDR env variable,

Changes since v3:
- fixed nit from Maxime,

Changes since v2:
- sorted logs alphabetically,

---
 Documentation/topics/testing.rst | 11 ++---
 tests/automake.mk|  1 +
 tests/system-dpdk-find-device.py | 39 
 tests/system-dpdk-macros.at  | 10 ++--
 tests/system-dpdk.at | 14 ++--
 5 files changed, 57 insertions(+), 18 deletions(-)
 create mode 100755 tests/system-dpdk-find-device.py

diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst
index 5f6940b84d..fb9b3e77b1 100644
--- a/Documentation/topics/testing.rst
+++ b/Documentation/topics/testing.rst
@@ -343,15 +343,20 @@ To see a list of all the available tests, run::
 
 These tests support a `DPDK supported NIC`_. The tests operate on a wider set 
of
 environments, for instance, when a virtual port is used.
-They do require proper DPDK variables (``DPDK_DIR`` and ``DPDK_BUILD``).
 Moreover you need to have root privileges to load the required modules and to 
bind
-the NIC to the DPDK-compatible driver.
+a PCI device to the DPDK-compatible driver.
 
 .. _DPDK supported NIC: https://core.dpdk.org/supported/#nics
 
+The phy test will skip if no suitable PCI device is found.
+It is possible to select which PCI device is used for this test by setting the
+DPDK_PCI_ADDR environment variable, which is especially useful when testing
+with a mlx5 device::
+
+# DPDK_PCI_ADDR=0000:82:00.0 make check-dpdk
+
 All tests are skipped if no hugepages are configured. User must look into the 
DPDK
 manual to figure out how to `Configure hugepages`_.
-The phy test will skip if no compatible physical device is available.
 
 .. _Configure hugepages: 
https://doc.dpdk.org/guides-22.11/linux_gsg/sys_reqs.html
 
diff --git a/tests/automake.mk b/tests/automake.mk
index 2ae0aeecaf..10c9fbb01f 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -520,6 +520,7 @@ CHECK_PYFILES = \
tests/flowgen.py \
tests/genpkts.py \
tests/ovsdb-monitor-sort.py \
+   tests/system-dpdk-find-device.py \
tests/test-daemon.py \
tests/test-dpparse.py \
tests/test-json.py \
diff --git a/tests/system-dpdk-find-device.py b/tests/system-dpdk-find-device.py
new file mode 100755
index 0000000000..ced74e7f31
--- /dev/null
+++ b/tests/system-dpdk-find-device.py
@@ -0,0 +1,39 @@
+#!/usr/bin/env python3
+# Copyright (c) 2024 Red Hat, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at:
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from pathlib import Path
+import os
+import sys
+
+# The tester might want to select a PCI device, if so, trust it.
+if 'DPDK_PCI_ADDR' in os.environ:
+print(os.environ['DPDK_PCI_ADDR'])
+sys.exit(0)
+
+for device in sorted(Path('/sys/bus/pci/devices').iterdir()):
+class_path = device / 'class'
+# Only consider Network class devices
+if class_path.read_text().strip() != '0x020000':
+continue
+kmod_path = device / 'driver' / 'module'
+kmod_name = kmod_path.resolve().name
+# Only care about devices bound to vfio_pci or igb_uio.
+if kmod_name not in ['vfio_pci', 'igb_uio']:
+continue
+print(device.resolve().name)
+sys.exit(0)
+
+sys.exit(1)
diff --git a/tests/system-dpdk-macros.at

[ovs-dev] [PATCH v5 2/2] tests: Move MFEX tests to dpif-netdev.

2023-12-21 Thread David Marchand
The MFEX code and tests do not depend on DPDK anymore.
We can move the unit tests to dpif-netdev.

Reviewed-by: Maxime Coquelin 
Acked-by: Eelco Chaudron 
Acked-by: Kumar Amber 
Signed-off-by: David Marchand 
---
Changes since v4:
- rebased,
- added dummy numa configuration,

Changes since v3:
- removed documentation update,

---
 tests/dpif-netdev.at | 165 
 tests/system-dpdk.at | 197 ---
 2 files changed, 165 insertions(+), 197 deletions(-)

diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
index d0359b5eab..c9474af0ad 100644
--- a/tests/dpif-netdev.at
+++ b/tests/dpif-netdev.at
@@ -852,3 +852,168 @@ OVS_VSWITCHD_STOP(["dnl
   /.*failed to put.*$/d
   /.*failed to flow_del.*$/d"])
 AT_CLEANUP
+
+AT_SETUP([dpif-netdev - MFEX Autovalidator])
+AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], [])
+AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 2000 > packets])
+OVS_VSWITCHD_START(
+  [add-port br0 p1 \
+   -- set Interface p1 type=dummy-pmd], [], [], 
[--dummy-numa="0,0,0,0,1,1,1,1"])
+
+AT_SKIP_IF([! ovs-appctl dpif-netdev/miniflow-parser-get | sed 1,4d | grep 
"True"], [], [dnl
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/dpif-impl-set dpif_avx512], [0], [dnl
+DPIF implementation set to dpif_avx512.
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set autovalidator], [0], [dnl
+Miniflow extract implementation set to autovalidator.
+])
+
+cat packets | while read line; do
+  AT_CHECK([ovs-appctl netdev-dummy/receive p1 $line], [0], [ignore])
+done
+
+OVS_WAIT_UNTIL([test `ovs-vsctl get interface p1 statistics | grep -oP 
'rx_packets=\s*\K\d+'` -ge 16000])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([dpif-netdev - MFEX Autovalidator Fuzzy])
+AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], [])
+AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 2000 fuzzy > packets])
+OVS_VSWITCHD_START(
+  [add-port br0 p1 \
+   -- set Interface p1 type=dummy-pmd], [], [], 
[--dummy-numa="0,0,0,0,1,1,1,1"])
+
+AT_SKIP_IF([! ovs-appctl dpif-netdev/miniflow-parser-get | sed 1,4d | grep 
"True"], [], [dnl
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/dpif-impl-set dpif_avx512], [0], [dnl
+DPIF implementation set to dpif_avx512.
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set autovalidator], [0], [dnl
+Miniflow extract implementation set to autovalidator.
+])
+
+cat packets | while read line; do
+  AT_CHECK([ovs-appctl netdev-dummy/receive p1 $line], [0], [ignore])
+done
+
+OVS_WAIT_UNTIL([test `ovs-vsctl get interface p1 statistics | grep -oP 
'rx_packets=\s*\K\d+'` -ge 16000])
+
+OVS_VSWITCHD_STOP(["dnl
+/upcall: datapath reached the dynamic limit of .* flows./d"])
+AT_CLEANUP
+
+AT_SETUP([dpif-netdev - MFEX Configuration])
+OVS_VSWITCHD_START(
+  [set Open_vSwitch . other_config:pmd-cpu-mask=0x1 \
+   -- add-port br0 p1 \
+   -- set Interface p1 type=dummy-pmd], [], [], 
[--dummy-numa="0,0,0,0,1,1,1,1"])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set scalar 1], [2],
+[], [dnl
+Error: unknown argument 1.
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 6 study 300 xyz], 
[2],
+[], [dnl
+Error: invalid study_pkt_cnt value: xyz.
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set scalar abcd], [2],
+[], [dnl
+Error: unknown argument abcd.
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 0 scalar abcd], [2],
+[], [dnl
+Error: unknown argument abcd.
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd], [2],
+[], [dnl
+Error: -pmd option requires a thread id argument.
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set tudy abcd], [2],
+[], [dnl
+Error: unknown argument abcd.
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 7 study abcd], [2],
+[], [dnl
+Error: invalid study_pkt_cnt value: abcd.
+ovs-appctl: ovs-vswitchd: server returned an error
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 0 study], [0], [dnl
+Miniflow extract implementation set to study, on pmd thread 0, studying 128 
packets.
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 0 study 512], [0], 
[dnl
+Miniflow extract implementation set to study, on pmd thread 0, studying 512 
packets.
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set study 512], [0], [dnl
+Miniflow extract implementation set to study, studying 512 packets.
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set study], [0], [dnl
+Miniflow extract implementation set to study, studying 128 packets.
+])
+
+AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set -pmd 0 autovalidator], 
[0], [

[ovs-dev] [PATCH v5 1/2] system-dpdk: Use dummy-pmd port for packet injection.

2023-12-21 Thread David Marchand
net_pcap is not always available in DPDK (for example, in a dev
environment where libpcap-devel was not installed).
On the other hand, OVS already has its own way to inject packets into a
bridge. Let's make use of it.

The generating script outputs a bulk of 8 packets per line (to save some
cpu spent calling ovs-appctl).
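
The batching idea can be sketched independently of scapy. This is a hedged illustration, not the code of genpkts.py: `batch_lines` is a hypothetical helper, and the frames are placeholder byte strings rather than real packets.

```python
def batch_lines(packets, per_line=8):
    """Yield one line per batch: per_line packets as space-separated hex."""
    for start in range(0, len(packets), per_line):
        chunk = packets[start:start + per_line]
        yield " ".join(pkt.hex() for pkt in chunk)


# Placeholder 60-byte frames standing in for scapy-built packets.
packets = [bytes([i]) * 60 for i in range(20)]
lines = list(batch_lines(packets))

for line in lines:
    print(line)
```

Each printed line can then be handed to a single netdev-dummy/receive call, so 20 packets cost three ovs-appctl invocations instead of 20.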

Suggested-by: Ilya Maximets 
Reviewed-by: Maxime Coquelin 
Acked-by: Eelco Chaudron 
Signed-off-by: David Marchand 
---
Changes since v4:
- rebased,
- updated genpkts.py so it outputs 8 packets per line and has
  netdev-dummy/receive called for 8 packets at a time (this little
  optimisation seems enough to avoid hitting OVS_CTL_TIMEOUT),
- stopped calling genpkts.py asynchronously,

Changes since v3:
- dropped documentation update following rebase,
- fixed regression in fuzzy packets generation,

Changes since v2:
- updated documentation,
- cleaned tests/automake.mk,
- fixed shebang in python script,
- added missing check for scapy availability,

Changes since v1:
- renamed generator script,
- decreased packet count for fuzzy test,
- simplified wait expression for packet count,

---
 tests/automake.mk   |  6 +---
 tests/{mfex_fuzzy.py => genpkts.py} | 56 ++---
 tests/system-dpdk.at| 24 -
 3 files changed, 43 insertions(+), 43 deletions(-)
 rename tests/{mfex_fuzzy.py => genpkts.py} (66%)

diff --git a/tests/automake.mk b/tests/automake.mk
index f8a925012d..2ae0aeecaf 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -146,10 +146,6 @@ $(srcdir)/tests/fuzz-regression-list.at: tests/automake.mk
echo "TEST_FUZZ_REGRESSION([$$basename])"; \
done > $@.tmp && mv $@.tmp $@
 
-EXTRA_DIST += $(MFEX_AUTOVALIDATOR_TESTS)
-MFEX_AUTOVALIDATOR_TESTS = \
-   tests/mfex_fuzzy.py
-
 OVSDB_CLUSTER_TESTSUITE_AT = \
tests/ovsdb-cluster-testsuite.at \
tests/ovsdb-execution.at \
@@ -522,7 +518,7 @@ tests_test_type_props_SOURCES = tests/test-type-props.c
 CHECK_PYFILES = \
tests/appctl.py \
tests/flowgen.py \
-   tests/mfex_fuzzy.py \
+   tests/genpkts.py \
tests/ovsdb-monitor-sort.py \
tests/test-daemon.py \
tests/test-dpparse.py \
diff --git a/tests/mfex_fuzzy.py b/tests/genpkts.py
similarity index 66%
rename from tests/mfex_fuzzy.py
rename to tests/genpkts.py
index 50b9870641..3354e116d0 100755
--- a/tests/mfex_fuzzy.py
+++ b/tests/genpkts.py
@@ -17,51 +17,44 @@ except ModuleNotFoundError:
 from scapy.all import RandMAC, RandIP, PcapWriter, RandIP6, RandShort, fuzz
 from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP, random
 
-# Path for the pcap file location.
-path = str(sys.argv[1])
 # The number of packets generated will be size * 8.
-size = int(sys.argv[2])
+size = int(sys.argv[1])
 # Traffic option is used to choose between fuzzy or simple packet type.
-if len(sys.argv) > 3:
-traffic_opt = str(sys.argv[3])
+if len(sys.argv) > 2:
+traffic_opt = str(sys.argv[2])
 else:
 traffic_opt = ""
 
-pktdump = PcapWriter(path, append=False, sync=True)
-
-pkt = []
-
 for i in range(0, size):
+pkt = []
+
 if traffic_opt == "fuzzy":
 
 eth = Ether(src=RandMAC(), dst=RandMAC())
 vlan = Dot1Q()
-udp = UDP(dport=RandShort(), sport=RandShort())
 ipv4 = IP(src=RandIP(), dst=RandIP(), len=random.randint(0, 100))
 ipv6 = IPv6(src=RandIP6(), dst=RandIP6(), plen=random.randint(0, 100))
+udp = UDP(dport=RandShort(), sport=RandShort())
 tcp = TCP(dport=RandShort(), sport=RandShort(), flags='S',
   dataofs=random.randint(0, 15))
 
 # IPv4 packets with fuzzing
-pkt.append(fuzz(eth / ipv4 / udp))
-pkt.append(fuzz(eth / ipv4 / tcp))
-pkt.append(fuzz(eth / vlan / ipv4 / udp))
-pkt.append(fuzz(eth / vlan / ipv4 / tcp))
+pkt.append(fuzz(eth / ipv4 / udp).build().hex())
+pkt.append(fuzz(eth / ipv4 / tcp).build().hex())
+pkt.append(fuzz(eth / vlan / ipv4 / udp).build().hex())
+pkt.append(fuzz(eth / vlan / ipv4 / tcp).build().hex())
 
 # IPv6 packets with fuzzing
-pkt.append(fuzz(eth / ipv6 / udp))
-pkt.append(fuzz(eth / ipv6 / tcp))
-pkt.append(fuzz(eth / vlan / ipv6 / udp))
-pkt.append(fuzz(eth / vlan / ipv6 / tcp))
+pkt.append(fuzz(eth / ipv6 / udp).build().hex())
+pkt.append(fuzz(eth / ipv6 / tcp).build().hex())
+pkt.append(fuzz(eth / vlan / ipv6 / udp).build().hex())
+pkt.append(fuzz(eth / vlan / ipv6 / tcp).build().hex())
 
 else:
 mac_addr_src = "52:54:00:FF:FF:{:02X}".format(i % 0xff)
 mac_addr_dst = "80:FF:FF:FF:FF:{:02X}".format(i % 0xff)
-src_port = 200 + (i % 20)
-dst_port = 1000 + (i % 20)
 eth = Ether(src=mac_addr_src, dst=mac_addr_dst)
 vlan = Dot1Q(vlan=(i % 10))
-udp = UDP(dport=src_port, s

Re: [ovs-dev] [PATCH v3 09/11] ci: Fix dpdk build cache key generation.

2023-12-15 Thread David Marchand
On Tue, Dec 5, 2023 at 4:00 PM Eelco Chaudron  wrote:
>
> When new drivers are introduced, the cache key is not accurately computed.
>
> Before the commit 1a1b3106d90e ("ci: Separate DPDK from OVS build."), the
> DPDK build process was integrated in .ci/linux-{setup,build}.sh scripts,
> where specific lines were employed to generate the key. Since it is now
> separated in .ci/dpdk-{setup,build}.sh, this patch computes the key based
> on the content of those dedicated scripts.
>
> Fixes: 4e90baca89f0 ("system-dpdk: Run traffic tests.")
> Signed-off-by: Eelco Chaudron 

Thanks for the fix.

Reviewed-by: David Marchand 

-- 
David Marchand



[ovs-dev] [PATCH] dpdk: Update to use v23.11.

2023-12-13 Thread David Marchand
This commit adds support for DPDK v23.11.
It updates the CI script and documentation and includes the following
changes coming from the dpdk-latest branch:

- sparse: Add some compiler intrinsics for DPDK build.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=371129&state=*

- ci: Cache DPDK installed libraries only.
- ci: Reduce optional libraries in DPDK.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=383367&state=*

- system-dpdk: Ignore net/ice error log about QinQ offloading.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=385259&state=*

Signed-off-by: David Marchand 
---
 .ci/dpdk-build.sh| 28 +++-
 .ci/linux-build.sh   |  9 
 .github/workflows/build-and-test.yml |  4 ++--
 Documentation/faq/releases.rst   |  2 +-
 Documentation/intro/install/dpdk.rst | 16 +++---
 Documentation/topics/dpdk/phy.rst| 12 +-
 Documentation/topics/dpdk/vdev.rst   |  2 +-
 Documentation/topics/dpdk/vhost-user.rst |  2 +-
 Documentation/topics/testing.rst |  2 +-
 Documentation/topics/userspace-tso.rst   |  2 +-
 NEWS |  2 ++
 debian/control.in|  2 +-
 include/sparse/automake.mk   |  1 +
 include/sparse/ia32intrin.h  | 23 +++
 rhel/openvswitch-fedora.spec.in  |  2 +-
 tests/system-dpdk-macros.at  |  1 +
 16 files changed, 73 insertions(+), 37 deletions(-)
 create mode 100644 include/sparse/ia32intrin.h

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index aa83e44643..698b9e1b14 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -5,25 +5,27 @@ set -x
 
 function build_dpdk()
 {
-local VERSION_FILE="dpdk-dir/cached-version"
 local DPDK_VER=$1
 local DPDK_OPTS=""
+local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir"
+local VERSION_FILE="$DPDK_INSTALL_DIR/cached-version"
 
-rm -rf dpdk-dir
+rm -rf dpdk-src
+rm -rf $DPDK_INSTALL_DIR
 
 if [ "${DPDK_VER##refs/*/}" != "${DPDK_VER}" ]; then
-git clone --single-branch $DPDK_GIT dpdk-dir -b "${DPDK_VER##refs/*/}"
-pushd dpdk-dir
+git clone --single-branch $DPDK_GIT dpdk-src -b "${DPDK_VER##refs/*/}"
+pushd dpdk-src
 git log -1 --oneline
 else
 wget https://fast.dpdk.org/rel/dpdk-$1.tar.xz
 tar xvf dpdk-$1.tar.xz > /dev/null
 DIR_NAME=$(tar -tf dpdk-$1.tar.xz | head -1 | cut -f1 -d"/")
-mv ${DIR_NAME} dpdk-dir
-pushd dpdk-dir
+mv ${DIR_NAME} dpdk-src
+pushd dpdk-src
 fi
 
-# Switching to 'default' machine to make dpdk-dir cache usable on
+# Switching to 'default' machine to make the dpdk cache usable on
 # different CPUs. We can't be sure that all CI machines are exactly same.
 DPDK_OPTS="$DPDK_OPTS -Dmachine=default"
 
@@ -40,16 +42,22 @@ function build_dpdk()
 DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd"
 enable_drivers="net/null,net/af_xdp,net/tap,net/virtio"
 DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers"
+# OVS depends on the vhost library (and its dependencies).
+# net/tap depends on the gso library.
+DPDK_OPTS="$DPDK_OPTS -Denable_libs=cryptodev,dmadev,gso,vhost"
 
 # Install DPDK using prefix.
-DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build"
+DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR"
 
 meson $DPDK_OPTS build
 ninja -C build
 ninja -C build install
-
-echo "Installed DPDK in $(pwd)"
 popd
+
+# Remove examples sources.
+rm -rf $DPDK_INSTALL_DIR/share/dpdk/examples
+
+echo "Installed DPDK in $DPDK_INSTALL_DIR"
 echo "${DPDK_VER}" > ${VERSION_FILE}
 }
 
diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index aa2ecc5050..2536d57d0e 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -9,8 +9,9 @@ EXTRA_OPTS="--enable-Werror"
 
 function install_dpdk()
 {
-local VERSION_FILE="dpdk-dir/cached-version"
-local DPDK_LIB=$(pwd)/dpdk-dir/build/lib/x86_64-linux-gnu
+local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir"
+local VERSION_FILE="${DPDK_INSTALL_DIR}/cached-version"
+local DPDK_LIB=${DPDK_INSTALL_DIR}/lib/x86_64-linux-gnu
 
 if [ "$DPDK_SHARED" ]; then
 EXTRA_OPTS="$EXTRA_OPTS --with-dpdk=shared"
@@ -26,13 +27,13 @@ function install_dpdk()
 export PATH=$(pwd)/dpdk-dir/build/bin:$PATH
 
 if [ ! -f "${VERSION_FILE}" ]; then
-echo "Could not find DPDK in $(pwd)/dpdk-dir"
+echo "Could not find DPDK in $DPDK_INSTALL_DIR"
 return 1
 fi
 
 # Update the library paths.
 sudo ldconfig
-echo "Found cached DPDK $(cat ${

[ovs-dev] [dpdk-latest] system-dpdk: Ignore net/ice error log about QinQ offloading.

2023-12-05 Thread David Marchand
The net/ice DPDK driver complains with an ERROR level log message if the
hw firmware only supports SVM (Single Vlan Mode).
DVM (Dual Vlan mode) seems required when using QinQ offloading.
OVS does not care about this offloading feature and configures nothing
on that topic.

While seeing this error log, some manual tests show that
untagged/tagged/"double" tagged packets (with 0x8100 ethertype)
are still received/transmitted fine.

Ignore this log waiting for a fix on the DPDK side.

Link: https://bugs.dpdk.org/show_bug.cgi?id=1331
Signed-off-by: David Marchand 
---
 tests/system-dpdk-macros.at | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index dcdfa55741..c011487541 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -86,6 +86,7 @@ $1";/does not exist. The Open vSwitch kernel module is 
probably not loaded./d
 /does not support MTU configuration,/d
 /EAL: No \(available\|free\) .*hugepages reported/d
 /Failed to enable flow control/d
+/ice_vsi_config_outer_vlan_stripping(): Single VLAN mode (SVM) does not 
support qinq/d
 /Rx checksum offload is not supported on/d
 /TELEMETRY: No legacy callbacks, legacy socket not created/d"])
 ])
-- 
2.42.0



Re: [ovs-dev] [PATCH v5] system-dpdk: Test with mlx5 devices.

2023-12-01 Thread David Marchand
On Tue, Nov 28, 2023 at 9:40 AM David Marchand
 wrote:
>
> On Wed, Nov 22, 2023 at 5:34 PM David Marchand
>  wrote:
> >
> > The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded:
> > on systems with only mlx5, this test is always skipped.
> >
> > Besides, the test tries to grab the first device listed by dpdk-devbind.py,
> > regardless of the PCI device status regarding kmod binding.
> >
> > Remove dependency on this DPDK script and use a minimal script that
> > reads PCI sysfs.
> >
> > This script is not perfect, as one can imagine PCI devices bound to
> > vfio-pci for virtual machines.
> > Plus, this script only tries to take over vfio-pci devices. mlx5 devices
> > can't be taken over blindly as it could mean losing connectivity to the
> > machine if the netdev was in use for this system.
> >
> > For those two reasons, add a new environment variable DPDK_PCI_ADDR for
> > testers to select the PCI device of their liking.
> > For consistency and grep, the temporary file PCI_ADDR is renamed
> > to DPDK_PCI_ADDR.
> >
> > Reviewed-by: Maxime Coquelin 
> > Acked-by: Eelco Chaudron 
> > Signed-off-by: David Marchand 
>
> This patch can't be merged as is.
> I am preparing some fixes for the system-dpdk MTU tests that got
> merged since my v4.

Coming back on this.
I sent a fix 
https://patchwork.ozlabs.org/project/openvswitch/patch/20231201142931.1782046-1-david.march...@redhat.com/.
The current patch, which makes it possible to select a PCI device to
test, is orthogonal to this fix.
Yet, people will likely want to test with both applied.


-- 
David Marchand



[ovs-dev] [PATCH v2] system-dpdk: Wait for MTU changes to be applied.

2023-12-01 Thread David Marchand
Because the configuration of a DPDK-backed netdev is done asynchronously,
and an MTU change requires a reconfiguration, directly checking
ovs-vswitchd logs or querying ovsdb for the interface's current MTU value
is racy.

Add synchronisation points on the interface MTU value in ovsdb as it
ensures that a netdev (re)configuration did happen.
With those synchronisation points in place, error messages may be checked
in logs afterward.
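
The difference between a fixed sleep and a synchronisation point can be sketched as a small polling loop. This is not how OVS_WAIT_UNTIL_EQUAL is implemented; `wait_until_equal`, its parameters, and the fake readings are assumptions made purely for illustration.

```python
import time


def wait_until_equal(fetch, expected, timeout=30.0, interval=0.1):
    """Poll fetch() until it returns expected; False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if fetch() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for "ovs-vsctl get Interface phy0 mtu": the new value only
# shows up once the (asynchronous) reconfiguration has happened.
readings = iter(["1500", "1500", "9000"])
ok = wait_until_equal(lambda: next(readings), "9000",
                      timeout=5.0, interval=0.01)
```

Once the wait has succeeded, the reconfiguration is known to have run, so checking the logs for errors is no longer a race against it.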

Fixes: bf47829116a8 ("tests: Add OVS-DPDK MTU unit tests.")
Signed-off-by: David Marchand 
---
Changes since v1:
- dropped test output,


---
 tests/system-dpdk.at | 42 --
 1 file changed, 12 insertions(+), 30 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 17742d20a0..af092a2000 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -511,15 +511,13 @@ dnl Add userspace bridge and attach it to OVS with 
default MTU value
 AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
 AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk 
options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr])
 AT_CHECK([ovs-vsctl show], [], [stdout])
-sleep 2
 
 dnl Check default MTU value in the datapath
-AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl
-1500
-])
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [1500])
 
 dnl Increase MTU value and check in the datapath
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9000])
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9000])
 
 dnl Fail if MTU is not supported
 AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" 
ovs-vswitchd.log], [], [stdout])
@@ -527,10 +525,6 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU 
configuration" ovs-vswitch
 dnl Fail if error is encountered during MTU setup
 AT_FAIL_IF([grep "Interface phy0 MTU (9000) setup error" ovs-vswitchd.log], 
[], [stdout])
 
-AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl
-9000
-])
-
 
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 phy0], [], [stdout], [stderr])
@@ -555,7 +549,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 
datapath_type=netdev])
 AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk 
options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr])
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9000])
 AT_CHECK([ovs-vsctl show], [], [stdout])
-sleep 2
+
+dnl Check MTU value in the datapath
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9000])
 
 dnl Fail if MTU is not supported
 AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" 
ovs-vswitchd.log], [], [stdout])
@@ -563,17 +559,9 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU 
configuration" ovs-vswitch
 dnl Fail if error is encountered during MTU setup
 AT_FAIL_IF([grep "Interface phy0 MTU (9000) setup error" ovs-vswitchd.log], 
[], [stdout])
 
-dnl Check MTU value in the datapath
-AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl
-9000
-])
-
 dnl Decrease MTU value and check in the datapath
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=2000])
-
-AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl
-2000
-])
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [2000])
 
 
 dnl Clean up
@@ -680,7 +668,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 
datapath_type=netdev])
 AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk 
options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr])
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9702])
 AT_CHECK([ovs-vsctl show], [], [stdout])
-sleep 2
+
+dnl Check MTU value in the datapath
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9702])
 
 dnl Fail if MTU is not supported
 AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" 
ovs-vswitchd.log], [], [stdout])
@@ -688,11 +678,6 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU 
configuration" ovs-vswitch
 dnl Fail if error is encountered during MTU setup
 AT_FAIL_IF([grep "Interface phy0 MTU (9702) setup error" ovs-vswitchd.log], 
[], [stdout])
 
-dnl Check MTU value in the datapath
-AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl
-9702
-])
-
 dnl Set MTU value above upper bound and check for error
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9711])
 AT_CHECK([grep "phy0: unsupported MTU 9711" ovs-vswitchd.log], [], [stdout])
@@ -721,7 +706,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 
datapath_type=netdev])
 AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk 
options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr])
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=68])
 AT_CHECK([ovs-vsctl show], [], [stdout])
-sleep 2
+
+dnl Check MTU value in the datapath
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [68])
 
 dnl Fail i

[ovs-dev] [PATCH] system-dpdk: Wait for MTU changes to be applied.

2023-12-01 Thread David Marchand
Because a DPDK-backed netdev is configured asynchronously, and an MTU
change requires a reconfiguration, directly checking ovs-vswitchd logs
or querying ovsdb for the interface's current MTU value is racy.

$ DPDK_DIR=/root/ovs/dpdk-dir/v22.11 make -C build check-dpdk \
TESTSUITEFLAGS="-d 15"
...
 15: OVS-DPDK - MTU decrease phy port                FAILED (system-dpdk.at:574)

Test log:
...
../../tests/system-dpdk.at:572: ovs-vsctl set Interface phy0
mtu_request=2000
../../tests/system-dpdk.at:574: ovs-vsctl get Interface phy0
mtu
--- -   2023-12-01 08:55:46.896454338 -0500
+++ .../tests/system-dpdk-testsuite.dir/at-groups/15/stdout
2023-12-01 08:55:46.894532711 -0500
@@ -1,2 +1,2 @@
-2000
+9000

ovs-vswitchd log:
2023-12-01T13:55:44.666Z|00098|netdev_dpdk|INFO|Port 0:
50:7c:6f:3c:0c:26
2023-12-01T13:55:44.667Z|00099|netdev_dpdk|INFO|phy0: rx-steering:
default rss
...
2023-12-01T13:55:44.686Z|00102|timeval|WARN|Unreasonably long 1070ms
poll interval (339ms user, 728ms system)
2023-12-01T13:55:44.686Z|00103|timeval|WARN|faults: 1 minor, 0 major
2023-12-01T13:55:44.686Z|00104|timeval|WARN|context switches: 0 voluntary,
9 involuntary
...
2023-12-01T13:55:45.692Z|00150|poll_loop|INFO|wakeup due to [POLLIN] on
fd 54 (FIFO pipe:[1781849]) at ../vswitchd/bridge.c:421
(68% CPU usage)
2023-12-01T13:55:46.883Z|00151|netdev_dpdk|INFO|Port 0:
50:7c:6f:3c:0c:26
2023-12-01T13:55:46.884Z|00152|netdev_dpdk|INFO|phy0: rx-steering:
default rss

Add synchronisation points on the interface MTU value in ovsdb as it
ensures that a netdev (re)configuration did happen.
With those synchronisation points in place, error messages may be checked
in logs afterward.
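
To illustrate the difference between a one-shot check and a synchronisation
point, here is a minimal Python sketch. It is not the implementation of
OVS_WAIT_UNTIL_EQUAL; wait_until_equal() and FakeInterface are hypothetical
names that only model "poll until the value matches or a timeout expires"
against a value that changes asynchronously.

```python
import time


def wait_until_equal(read_value, expected, timeout=30.0, interval=0.1):
    """Poll read_value() until it returns expected or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if read_value() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeInterface:
    """Hypothetical stand-in for an interface being reconfigured.

    The new MTU only becomes visible after a few reads, which models the
    asynchronous netdev reconfiguration.
    """

    def __init__(self, old_mtu, new_mtu, polls_before_change):
        self.old_mtu = old_mtu
        self.new_mtu = new_mtu
        self.remaining = polls_before_change

    def get_mtu(self):
        if self.remaining > 0:
            self.remaining -= 1
            return self.old_mtu
        return self.new_mtu


iface = FakeInterface(old_mtu=9000, new_mtu=2000, polls_before_change=3)

# A direct check right after the request still sees the stale value.
first_read = iface.get_mtu()

# Polling converges once the (simulated) reconfiguration has happened.
synced = wait_until_equal(iface.get_mtu, 2000, timeout=5.0, interval=0.01)
```

In this model, any log check placed after the successful wait is guaranteed
to run after the reconfiguration, which is the ordering the test needs.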

Fixes: bf47829116a8 ("tests: Add OVS-DPDK MTU unit tests.")
Signed-off-by: David Marchand 
---
 tests/system-dpdk.at | 42 --
 1 file changed, 12 insertions(+), 30 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 17742d20a0..af092a2000 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -511,15 +511,13 @@ dnl Add userspace bridge and attach it to OVS with 
default MTU value
 AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
 AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk 
options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr])
 AT_CHECK([ovs-vsctl show], [], [stdout])
-sleep 2
 
 dnl Check default MTU value in the datapath
-AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl
-1500
-])
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [1500])
 
 dnl Increase MTU value and check in the datapath
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9000])
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9000])
 
 dnl Fail if MTU is not supported
 AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" 
ovs-vswitchd.log], [], [stdout])
@@ -527,10 +525,6 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU 
configuration" ovs-vswitch
 dnl Fail if error is encountered during MTU setup
 AT_FAIL_IF([grep "Interface phy0 MTU (9000) setup error" ovs-vswitchd.log], 
[], [stdout])
 
-AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl
-9000
-])
-
 
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 phy0], [], [stdout], [stderr])
@@ -555,7 +549,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 
datapath_type=netdev])
 AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk 
options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr])
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9000])
 AT_CHECK([ovs-vsctl show], [], [stdout])
-sleep 2
+
+dnl Check MTU value in the datapath
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [9000])
 
 dnl Fail if MTU is not supported
 AT_FAIL_IF([grep "Interface phy0 does not support MTU configuration" 
ovs-vswitchd.log], [], [stdout])
@@ -563,17 +559,9 @@ AT_FAIL_IF([grep "Interface phy0 does not support MTU 
configuration" ovs-vswitch
 dnl Fail if error is encountered during MTU setup
 AT_FAIL_IF([grep "Interface phy0 MTU (9000) setup error" ovs-vswitchd.log], 
[], [stdout])
 
-dnl Check MTU value in the datapath
-AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl
-9000
-])
-
 dnl Decrease MTU value and check in the datapath
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=2000])
-
-AT_CHECK([ovs-vsctl get Interface phy0 mtu], [0], [dnl
-2000
-])
+OVS_WAIT_UNTIL_EQUAL([ovs-vsctl get Interface phy0 mtu], [2000])
 
 
 dnl Clean up
@@ -680,7 +668,9 @@ AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 
datapath_type=netdev])
 AT_CHECK([ovs-vsctl add-port br10 phy0 -- set Interface phy0 type=dpdk 
options:dpdk-devargs=$(cat PCI_ADDR)], [], [stdout], [stderr])
 AT_CHECK([ovs-vsctl set Interface phy0 mtu_request=9702])
 AT_CHECK([ovs-vsctl show], [],

Re: [ovs-dev] [DPDK 23.11 ? ][ovs-build] |fail| pw1869766 cirrus: Update from FreeBSD 12 to 14.

2023-11-30 Thread David Marchand
On Wed, Nov 29, 2023 at 8:06 PM Ilya Maximets  wrote:
>
> On 11/29/23 18:51, ovs_jenk...@intel.com wrote:
> > Test-Label: intel-ovs-compilation
> > Test-Status: fail
> > http://patchwork.ozlabs.org/api/patches/1869766/
> >
> > AVX-512_compilation: failed
> > DPLCS Test: fail
> > DPIF Test: fail
> > MFEX Test: fail
> > Actions Test: fail
> > Errors in DPCLS test:
> > make check-dpdk
> > make  all-am
> > make[1]: Entering directory '/root/ovs-dev'
> > make[1]: Leaving directory '/root/ovs-dev'
> > set /bin/bash './tests/system-dpdk-testsuite' -C tests  
> > AUTOTEST_PATH='utilities:vswitchd:ovsdb:vtep:tests:ipsec::'; \
> > "$@"  -j1 || (test X'' = Xyes && "$@" --recheck)
>
> 
>
> > ## -- ##
> > ## system-dpdk-testsuite.log was created. ##
> > ## -- ##
> >
> > Please send `tests/system-dpdk-testsuite.log' and all information you think 
> > might help:
> >
> >To: 
> >Subject: [openvswitch 3.2.90] system-dpdk-testsuite: 2 6 10 14 15 18 19 
> > failed
> >
> > You may investigate any problem if you feel able to do so, in which
> > case the test suite provides a good starting point.  Its output may
> > be found below `tests/system-dpdk-testsuite.dir'.
> >
> > make: *** [Makefile:7061: check-dpdk] Error 1
> >> 2023-11-29T17:03:47.661Z|00089|dpdk|INFO|Device with port_id=0 already 
> >> stopped
> >> 2023-11-29T17:03:47.661Z|1|dpdk(pmd-c88/id:103)|INFO|PMD thread uses 
> >> DPDK lcore 2.
> >> 2023-11-29T17:03:47.778Z|00090|dpdk|ERR|ice_vsi_config_outer_vlan_stripping():
> >>  Single VLAN mode (SVM) does not support qinq
>
> Hi, Michael, others.
>
> This is a strange error in the log.  This log message was introduced
> in DPDK commit de5da9d16430 ("net/ice: support double VLAN"), which
> is only in DPDK 23.07+, while we should still be using 22.11 here.
>
> Could you, please, check?
>
> But also, does something need to be changed for the support of 23.11
> so the tests will not fail once we upgrade?  Looks like some driver
> configuration error that should not happen.  Driver bug?
> CC: David, Kevin.

As far as I can see, this is "just" an error level log message with no impact.

My guess is that offloading of vlan is now dependent on some firmware
level feature.
Such firmware stuff is probably related to DDP files or some other black box.
The net/ice driver spews this warning but does not fail to initialise the port.

OVS does not rely on vlan offloading (and it is a good thing as I
think Intel broke it a number of times with the E810 driver...).
I tested 23.11 on dpdk-latest, with untagged packets, single
encapsulated vlan packet and double vlan, and all of them are received
correctly.
Sending is also functional.


A fix is needed on DPDK side.
For now, OVS can ignore this stupid (sorry I don't have better words
and I am a bit tired of such stuff) log message.


Cc: Intel maintainers.


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5] system-dpdk: Test with mlx5 devices.

2023-11-28 Thread David Marchand
On Wed, Nov 22, 2023 at 5:34 PM David Marchand
 wrote:
>
> The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded:
> on systems with only mlx5, this test is always skipped.
>
> Besides, the test tries to grab the first device listed by dpdk-devbind.py,
> regardless of the PCI device status regarding kmod binding.
>
> Remove dependency on this DPDK script and use a minimal script that
> reads PCI sysfs.
>
> This script is not perfect, as one can imagine PCI devices bound to
> vfio-pci for virtual machines.
> Plus, this script only tries to take over vfio-pci devices. mlx5 devices
> can't be taken over blindly as it could mean losing connectivity to the
> machine if the netdev was in use for this system.
>
> For those two reasons, add a new environment variable DPDK_PCI_ADDR for
> testers to select the PCI device of their liking.
> For consistency and grep, the temporary file PCI_ADDR is renamed
> to DPDK_PCI_ADDR.
>
> Reviewed-by: Maxime Coquelin 
> Acked-by: Eelco Chaudron 
> Signed-off-by: David Marchand 

This patch can't be merged as is.
I am preparing some fixes for the system-dpdk MTU tests that got
merged since my v4.


-- 
David Marchand



Re: [ovs-dev] [PATCH v2 6/9] ci: Fix dpdk build cache key generation.

2023-11-28 Thread David Marchand
On Mon, Nov 27, 2023 at 3:26 PM Eelco Chaudron  wrote:
> On 27 Nov 2023, at 13:53, David Marchand wrote:
> > On Mon, Nov 27, 2023 at 1:39 PM Eelco Chaudron  wrote:
> >>
> >> When new drivers are introduced, the cache key is not accurately computed.
> >> Previously, the dpdk build process was integrated into the main Linux
> >> build script, where specific lines were employed to generate the key.
> >> Since it is now separated into two distinct files, this patch will
> >> compute the key based on the content of these two files.
> >
> > I would rephrase this last sentence, as "two distinct files" and
> > "these two files" are a bit unclear.
> > Afaiu, the former refers to .ci/{linux,dpdk}-build.sh, while the
> > latter refers to .ci/dpdk-{setup,build}.sh
> >
> > Otherwise the fix lgtm.
>
> So what about changing the commit message as follows:
>
>
>   Previously, the dpdk build process was integrated into the
>   .ci/{linux,dpdk}-build.sh scripts, where specific lines were employed to
>   generate the key. Since it is now separated into two distinct files,
>   .ci/dpdk-{setup,build}.sh, this patch will compute the key based on the
>   content of these two files.

"""
When new drivers are introduced, the cache key is not accurately computed.

Before the commit 1a1b3106d90e ("ci: Separate DPDK from OVS build."),
the DPDK build
process was integrated in .ci/linux-{setup,build}.sh scripts, where
specific lines were
employed to generate the key.
Since it is now separated in .ci/dpdk-{setup,build}.sh, this patch
computes the key based on the
content of those dedicated scripts.
"""

Deal?
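
To make the problem concrete, here is a small Python sketch of why a
signature built from keyword-filtered lines can miss a relevant change,
while one built from the whole file content cannot. It is only a model of
the grep vs cat approaches in the workflow: the function names are
hypothetical and the file contents are made-up examples.

```python
import hashlib
import re

# Same keywords as the grep expression in the workflow, case-insensitive.
PATTERN = re.compile(r"RTE_|DPDK|meson|ninja", re.IGNORECASE)


def filtered_signature(files):
    """Hash only the lines matching PATTERN (grep-like approach)."""
    digest = hashlib.sha256()
    for name in sorted(files):
        for line in files[name].splitlines():
            if PATTERN.search(line):
                digest.update(f"{name}:{line}\n".encode())
    return digest.hexdigest()


def full_signature(files):
    """Hash the whole content of every file (cat-like approach)."""
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode())
        digest.update(files[name].encode())
    return digest.hexdigest()


# Made-up script content: the driver list sits on a line that contains
# none of the keywords, so the filter never sees it.
before = {
    ".ci/dpdk-build.sh":
        'enable_drivers="net/null,net/tap,net/virtio"\n'
        'meson $DPDK_OPTS build\n',
}
after = {
    ".ci/dpdk-build.sh":
        'enable_drivers="net/null,net/af_xdp,net/tap,net/virtio"\n'
        'meson $DPDK_OPTS build\n',
}

same_filtered = filtered_signature(before) == filtered_signature(after)
same_full = full_signature(before) == full_signature(after)
```

Here same_filtered is True and same_full is False: adding a driver leaves
the filtered signature untouched, so a stale cache entry would be reused.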


-- 
David Marchand



Re: [ovs-dev] [PATCH v2 6/9] ci: Fix dpdk build cache key generation.

2023-11-27 Thread David Marchand
Hello Eelco,

On Mon, Nov 27, 2023 at 1:39 PM Eelco Chaudron  wrote:
>
> When new drivers are introduced, the cache key is not accurately computed.
> Previously, the dpdk build process was integrated into the main Linux
> build script, where specific lines were employed to generate the key.
> Since it is now separated into two distinct files, this patch will
> compute the key based on the content of these two files.

I would rephrase this last sentence, as "two distinct files" and
"these two files" are a bit unclear.
Afaiu, the former refers to .ci/{linux,dpdk}-build.sh, while the
latter refers to .ci/dpdk-{setup,build}.sh

Otherwise the fix lgtm.

>
> Fixes: 4e90baca89f0 ("system-dpdk: Run traffic tests.")
> Signed-off-by: Eelco Chaudron 
> ---
>  .github/workflows/build-and-test.yml |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/.github/workflows/build-and-test.yml 
> b/.github/workflows/build-and-test.yml
> index d74668f61..e9a2714fb 100644
> --- a/.github/workflows/build-and-test.yml
> +++ b/.github/workflows/build-and-test.yml
> @@ -30,7 +30,7 @@ jobs:
># This also allows us to use cache from any branch as long as version
># and a way we're building DPDK stays the same.
>run:  |
> -grep -irE 'RTE_|DPDK|meson|ninja' .ci/dpdk-* > dpdk-ci-signature
> +cat .ci/dpdk-* > dpdk-ci-signature
>  grep -rwE 'DPDK_GIT|DPDK_VER' .github/ >> dpdk-ci-signature
>  if [ "${DPDK_VER##refs/*/}" != "${DPDK_VER}" ]; then
>  git ls-remote --heads $DPDK_GIT $DPDK_VER >> dpdk-ci-signature
>

Reviewed-by: David Marchand 


-- 
David Marchand



Re: [ovs-dev] [PATCH branch-3.2] dpdk: Use DPDK 22.11.3 release for OVS 3.2.

2023-11-24 Thread David Marchand
On Thu, Nov 23, 2023 at 12:51 PM Kevin Traynor  wrote:
>
> Update the CI and docs to use DPDK 22.11.3.
>
> Signed-off-by: Kevin Traynor 

Reviewed-by: David Marchand 


-- 
David Marchand



Re: [ovs-dev] [PATCH branch-3.1] dpdk: Use DPDK 22.11.3 release for OVS 3.1.

2023-11-24 Thread David Marchand
On Thu, Nov 23, 2023 at 12:51 PM Kevin Traynor  wrote:
>
> Update the CI and docs to use DPDK 22.11.3.
>
> Signed-off-by: Kevin Traynor 

Reviewed-by: David Marchand 


-- 
David Marchand



Re: [ovs-dev] [PATCH branch-3.0] dpdk: Use DPDK 21.11.5 release for OVS 3.0.

2023-11-24 Thread David Marchand
On Thu, Nov 23, 2023 at 12:51 PM Kevin Traynor  wrote:
>
> Update the CI and docs to use DPDK 21.11.5.
>
> Signed-off-by: Kevin Traynor 

Reviewed-by: David Marchand 


-- 
David Marchand



Re: [ovs-dev] [PATCH branch-2.17] dpdk: Use DPDK 21.11.5 release for OVS 2.17.

2023-11-24 Thread David Marchand
On Thu, Nov 23, 2023 at 12:50 PM Kevin Traynor  wrote:
>
> Update the CI and docs to use DPDK 21.11.5.
>
> Signed-off-by: Kevin Traynor 
> ---
>  .ci/linux-build.sh   | 2 +-
>  Documentation/faq/releases.rst   | 2 +-
>  Documentation/intro/install/dpdk.rst | 8 
>  NEWS | 3 +++
>  4 files changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> index f5021e1a8..9464ea49c 100755
> --- a/.ci/linux-build.sh
> +++ b/.ci/linux-build.sh
> @@ -221,5 +221,5 @@ fi
>  if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then
>  if [ -z "$DPDK_VER" ]; then
> -DPDK_VER="21.11.2"
> +DPDK_VER="21.11.5"
>  fi
>  install_dpdk $DPDK_VER
> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
> index 49895c595..0e0c589a3 100644
> --- a/Documentation/faq/releases.rst
> +++ b/Documentation/faq/releases.rst
> @@ -211,5 +211,5 @@ Q: What DPDK version does each Open vSwitch release work 
> with?
>  2.15.x   20.11.6
>  2.16.x   20.11.6
> -2.17.x   21.11.2
> +2.17.x   21.11.5
>   
>
> diff --git a/Documentation/intro/install/dpdk.rst 
> b/Documentation/intro/install/dpdk.rst
> index a284e6851..559e8eb1f 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -43,5 +43,5 @@ In addition to the requirements described in 
> :doc:`general`, building Open
>  vSwitch with DPDK will require the following:
>
> -- DPDK 21.11.2
> +- DPDK 21.11.5
>
>  - A `DPDK supported NIC`_
> @@ -74,7 +74,7 @@ Install DPDK
>
> $ cd /usr/src/
> -   $ wget https://fast.dpdk.org/rel/dpdk-21.11.2.tar.xz
> -   $ tar xf dpdk-21.11.2.tar.xz
> -   $ export DPDK_DIR=/usr/src/dpdk-stable-21.11.2
> +   $ wget https://fast.dpdk.org/rel/dpdk-21.11.5.tar.xz
> +   $ tar xf dpdk-21.11.5.tar.xz
> +   $ export DPDK_DIR=/usr/src/dpdk-stable-21.11.5
> $ cd $DPDK_DIR
>
> diff --git a/NEWS b/NEWS
> index 7d4a8c081..642beb45b 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -1,4 +1,7 @@
>  v2.17.9 - xx xxx 
>  -
> +   - Bug fixes

I see in the history that the "Bug fixes" characterization is usually
added when releasing a version.
So I am not sure it should be added in this patch.

> +   - DPDK:
> + * OVS validated with DPDK 21.11.5
>
>  v2.17.8 - 17 Oct 2023

Otherwise, it lgtm.

Reviewed-by: David Marchand 



-- 
David Marchand



[ovs-dev] [PATCH v5] system-dpdk: Test with mlx5 devices.

2023-11-22 Thread David Marchand
The DPDK unit test only runs if vfio or igb_uio kernel modules are loaded:
on systems with only mlx5, this test is always skipped.

Besides, the test tries to grab the first device listed by dpdk-devbind.py,
regardless of the PCI device status regarding kmod binding.

Remove dependency on this DPDK script and use a minimal script that
reads PCI sysfs.

This script is not perfect, as one can imagine PCI devices bound to
vfio-pci for virtual machines.
Plus, this script only tries to take over vfio-pci devices. mlx5 devices
can't be taken over blindly as it could mean losing connectivity to the
machine if the netdev was in use for this system.

For those two reasons, add a new environment variable DPDK_PCI_ADDR for
testers to select the PCI device of their liking.
For consistency and grep, the temporary file PCI_ADDR is renamed
to DPDK_PCI_ADDR.

Reviewed-by: Maxime Coquelin 
Acked-by: Eelco Chaudron 
Signed-off-by: David Marchand 
---
Changes since v4:
- separated from the original series,
- rebased,
- dropped mlx5 devices from the discovery script,
- documented DPDK_PCI_ADDR env variable,

Changes since v3:
- fixed nit from Maxime,

Changes since v2:
- sorted logs alphabetically,

---
 Documentation/topics/testing.rst | 11 ++---
 tests/automake.mk|  1 +
 tests/system-dpdk-find-device.py | 39 
 tests/system-dpdk-macros.at  | 10 ++--
 tests/system-dpdk.at | 14 ++--
 5 files changed, 57 insertions(+), 18 deletions(-)
 create mode 100755 tests/system-dpdk-find-device.py

diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst
index 5f6940b84d..fb9b3e77b1 100644
--- a/Documentation/topics/testing.rst
+++ b/Documentation/topics/testing.rst
@@ -343,15 +343,20 @@ To see a list of all the available tests, run::
 
 These tests support a `DPDK supported NIC`_. The tests operate on a wider set 
of
 environments, for instance, when a virtual port is used.
-They do require proper DPDK variables (``DPDK_DIR`` and ``DPDK_BUILD``).
 Moreover you need to have root privileges to load the required modules and to 
bind
-the NIC to the DPDK-compatible driver.
+a PCI device to the DPDK-compatible driver.
 
 .. _DPDK supported NIC: https://core.dpdk.org/supported/#nics
 
+The phy test will skip if no suitable PCI device is found.
+It is possible to select which PCI device is used for this test by setting the
+DPDK_PCI_ADDR environment variable, which is especially useful when testing
+with a mlx5 device::
+
+    # DPDK_PCI_ADDR=0000:82:00.0 make check-dpdk
+
 All tests are skipped if no hugepages are configured. User must look into the 
DPDK
 manual to figure out how to `Configure hugepages`_.
-The phy test will skip if no compatible physical device is available.
 
 .. _Configure hugepages: 
https://doc.dpdk.org/guides-22.11/linux_gsg/sys_reqs.html
 
diff --git a/tests/automake.mk b/tests/automake.mk
index f8a925012d..69117514cc 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -192,6 +192,7 @@ SYSTEM_OFFLOADS_TESTSUITE_AT = \
 
 SYSTEM_DPDK_TESTSUITE_AT = \
tests/system-common-macros.at \
+   tests/system-dpdk-find-device.py \
tests/system-dpdk-macros.at \
tests/system-dpdk-testsuite.at \
tests/system-dpdk.at
diff --git a/tests/system-dpdk-find-device.py b/tests/system-dpdk-find-device.py
new file mode 100755
index 00..b0997380bc
--- /dev/null
+++ b/tests/system-dpdk-find-device.py
@@ -0,0 +1,39 @@
+#!/usr/bin/env python3
+# Copyright (c) 2021 Red Hat, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at:
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+from pathlib import Path
+import os
+import sys
+
+# The tester might want to select a PCI device, if so, trust it.
+if 'DPDK_PCI_ADDR' in os.environ:
+    print(os.environ['DPDK_PCI_ADDR'])
+    sys.exit(0)
+
+for device in sorted(Path('/sys/bus/pci/devices').iterdir()):
+    class_path = device / 'class'
+    # Only consider Network class devices
+    if class_path.read_text().strip() != '0x020000':
+        continue
+    kmod_path = device / 'driver' / 'module'
+    kmod_name = kmod_path.resolve().name
+    # Only care about devices bound to vfio_pci or igb_uio.
+    if kmod_name not in ['vfio_pci', 'igb_uio']:
+        continue
+    print(device.resolve().name)
+    sys.exit(0)
+
+sys.exit(1)
diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index dcdfa55741..3b5a3512d4 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/s

[ovs-dev] [dpdk-latest 2/2] ci: Reduce optional libraries in DPDK.

2023-11-22 Thread David Marchand
Since DPDK v23.11, it is easier to select which optional libraries are
enabled.

OVS needs the vhost library (and its dependencies).
The net/tap DPDK driver needs the gso library.
Other optional libraries can be disabled.

This reduces the cache entry for DPDK from ~7MB to ~4MB.

Signed-off-by: David Marchand 
---
 .ci/dpdk-build.sh | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index 8380c48637..698b9e1b14 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -42,6 +42,9 @@ function build_dpdk()
 DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd"
 enable_drivers="net/null,net/af_xdp,net/tap,net/virtio"
 DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers"
+# OVS depends on the vhost library (and its dependencies).
+# net/tap depends on the gso library.
+DPDK_OPTS="$DPDK_OPTS -Denable_libs=cryptodev,dmadev,gso,vhost"
 
 # Install DPDK using prefix.
 DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR"
-- 
2.41.0



[ovs-dev] [dpdk-latest 1/2] ci: Cache DPDK installed libraries only.

2023-11-22 Thread David Marchand
Rather than save the whole DPDK sources and build artefacts, check out
the sources in a separate directory and build DPDK there.
Only the installed artefacts then go into the cache.
Example sources in the share/dpdk installed directory can be pruned too.

This makes a (v23.11-rc3) DPDK cache entry size go from ~120MB to ~7MB.

Signed-off-by: David Marchand 
---
 .ci/dpdk-build.sh  | 25 +++--
 .ci/linux-build.sh |  9 +
 2 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index aa83e44643..8380c48637 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -5,25 +5,27 @@ set -x
 
 function build_dpdk()
 {
-local VERSION_FILE="dpdk-dir/cached-version"
 local DPDK_VER=$1
 local DPDK_OPTS=""
+local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir"
+local VERSION_FILE="$DPDK_INSTALL_DIR/cached-version"
 
-rm -rf dpdk-dir
+rm -rf dpdk-src
+rm -rf $DPDK_INSTALL_DIR
 
 if [ "${DPDK_VER##refs/*/}" != "${DPDK_VER}" ]; then
-git clone --single-branch $DPDK_GIT dpdk-dir -b "${DPDK_VER##refs/*/}"
-pushd dpdk-dir
+git clone --single-branch $DPDK_GIT dpdk-src -b "${DPDK_VER##refs/*/}"
+pushd dpdk-src
 git log -1 --oneline
 else
 wget https://fast.dpdk.org/rel/dpdk-$1.tar.xz
 tar xvf dpdk-$1.tar.xz > /dev/null
 DIR_NAME=$(tar -tf dpdk-$1.tar.xz | head -1 | cut -f1 -d"/")
-mv ${DIR_NAME} dpdk-dir
-pushd dpdk-dir
+mv ${DIR_NAME} dpdk-src
+pushd dpdk-src
 fi
 
-# Switching to 'default' machine to make dpdk-dir cache usable on
+# Switching to 'default' machine to make the dpdk cache usable on
 # different CPUs. We can't be sure that all CI machines are exactly same.
 DPDK_OPTS="$DPDK_OPTS -Dmachine=default"
 
@@ -42,14 +44,17 @@ function build_dpdk()
 DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers"
 
 # Install DPDK using prefix.
-DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build"
+DPDK_OPTS="$DPDK_OPTS --prefix=$DPDK_INSTALL_DIR"
 
 meson $DPDK_OPTS build
 ninja -C build
 ninja -C build install
-
-echo "Installed DPDK in $(pwd)"
 popd
+
+# Remove examples sources.
+rm -rf $DPDK_INSTALL_DIR/share/dpdk/examples
+
+echo "Installed DPDK in $DPDK_INSTALL_DIR"
 echo "${DPDK_VER}" > ${VERSION_FILE}
 }
 
diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 170f4730f3..ea1e6599f4 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -12,8 +12,9 @@ EXTRA_OPTS="--enable-Werror"
 
 function install_dpdk()
 {
-local VERSION_FILE="dpdk-dir/cached-version"
-local DPDK_LIB=$(pwd)/dpdk-dir/build/lib/x86_64-linux-gnu
+local DPDK_INSTALL_DIR="$(pwd)/dpdk-dir"
+local VERSION_FILE="${DPDK_INSTALL_DIR}/cached-version"
+local DPDK_LIB=${DPDK_INSTALL_DIR}/lib/x86_64-linux-gnu
 
 if [ "$DPDK_SHARED" ]; then
 EXTRA_OPTS="$EXTRA_OPTS --with-dpdk=shared"
@@ -29,13 +30,13 @@ function install_dpdk()
 export PATH=$(pwd)/dpdk-dir/build/bin:$PATH
 
 if [ ! -f "${VERSION_FILE}" ]; then
-echo "Could not find DPDK in $(pwd)/dpdk-dir"
+echo "Could not find DPDK in $DPDK_INSTALL_DIR"
 return 1
 fi
 
 # Update the library paths.
 sudo ldconfig
-echo "Found cached DPDK $(cat ${VERSION_FILE}) build in $(pwd)/dpdk-dir"
+echo "Found cached DPDK $(cat ${VERSION_FILE}) build in $DPDK_INSTALL_DIR"
 }
 
 function configure_ovs()
-- 
2.41.0



Re: [ovs-dev] [PATCH v9 11/11] system-dpdk: Run traffic tests.

2023-11-21 Thread David Marchand
On Mon, Nov 20, 2023 at 4:58 PM David Marchand
 wrote:
>
> Integrate system-traffic.at tests as part of check-dpdk.
>
> Some tests that can't work with the userspace datapath are skipped by
> overriding some OVS_CHECK_* macros.
>
> ADD_VETH is implemented using the net/af_xdp DPDK driver.
>
> Signed-off-by: David Marchand 
> Acked-by: Eelco Chaudron 

I ran this series 10 times with no failure in GHA.
https://github.com/david-marchand/ovs/actions/runs/6882128624 (you may
browse through the 10 attempts through the "Latest #10" button in this
page though you need to be logged in)

Intel CI seems happy this time for every patch.

The robot also reported no issue when testing per patch.

SHIP IT!


Notes:
- there is an issue with "recent" glibc which will be looked at
separately from this series.
  For the details on this topic, starting with glibc 2.37 (i.e. Fedora 38 /
Ubuntu 23.04), DPDK logs are split at a 128-byte boundary, which
breaks log matching in the unit tests.
- this series runs fine on the dpdk-latest branch too.
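
As an illustration of the glibc point, here is a simplified Python model of
how splitting a long log message at a 128-byte boundary defeats a
line-based match. The exact mechanism in glibc/DPDK is not modelled here;
split_log() and grep() are hypothetical helpers and the message is made up.

```python
def split_log(message, limit=128):
    """Cut a message into consecutive chunks of at most `limit` bytes,
    each chunk ending up as its own log line."""
    data = message.encode()
    return [data[i:i + limit].decode(errors="replace")
            for i in range(0, len(data), limit)]


def grep(pattern, lines):
    """Line-based substring match, like a plain grep on a log file."""
    return any(pattern in line for line in lines)


pattern = "Interface phy0 MTU (9000) setup error"

# Made-up message: 110 bytes of prefix push the pattern across byte 128.
message = "x" * 110 + " " + pattern

unsplit_match = grep(pattern, [message])
lines = split_log(message)
split_match = grep(pattern, lines)
```

The unsplit message matches, but once it is cut in two lines neither line
contains the full pattern, so a test grepping the log for it fails.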


-- 
David Marchand



[ovs-dev] [PATCH v9 11/11] system-dpdk: Run traffic tests.

2023-11-20 Thread David Marchand
Integrate system-traffic.at tests as part of check-dpdk.

Some tests that can't work with the userspace datapath are skipped by
overriding some OVS_CHECK_* macros.

ADD_VETH is implemented using the net/af_xdp DPDK driver.

Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
Changes since v6:
- fixed some checkpatch warning,

Changes since v4:
- switched to net/af_xdp, this removes the tweaking needed for net/tap,
  and it lets existing tools keep relying on kernel netdevs. veth
  offloading still needs some tweaking,

Changes since v3:
- reverted --dummy-numa and opted for configuring a number of rxqs
  relevant to the number of NUMA sockets,

Changes since v2:
- added ADD_VETH_IGNORE_LOGS and moved ignored error logs to
  OVS_TRAFFIC_VSWITCHD_STOP,
- added --no-pci to DPDK options to avoid failing the tests when
  running in a vm with a virtio-net device,
- faked a mono numa/mono core so that OVS requests at max 2 txqs on
  the net/tap port,

---
 .ci/dpdk-build.sh|  3 +-
 .github/workflows/build-and-test.yml |  2 +-
 tests/system-dpdk-macros.at  | 77 
 tests/system-dpdk-testsuite.at   |  2 +
 tests/system-dpdk.at |  3 --
 5 files changed, 82 insertions(+), 5 deletions(-)

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index 35540f0694..aa83e44643 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -38,7 +38,8 @@ function build_dpdk()
 # any DPDK driver.
 # check-dpdk unit tests requires testpmd and some net/ driver.
 DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd"
-DPDK_OPTS="$DPDK_OPTS -Denable_drivers=net/null,net/tap,net/virtio"
+enable_drivers="net/null,net/af_xdp,net/tap,net/virtio"
+DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers"
 
 # Install DPDK using prefix.
 DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build"
diff --git a/.github/workflows/build-and-test.yml 
b/.github/workflows/build-and-test.yml
index 4f62efb7c3..09654205e7 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -5,7 +5,7 @@ on: [push, pull_request]
 jobs:
   build-dpdk:
 env:
-  dependencies: gcc libnuma-dev ninja-build
+  dependencies: gcc libbpf-dev libnuma-dev ninja-build pkgconf
   CC: gcc
   DPDK_GIT: https://dpdk.org/git/dpdk-stable
   DPDK_VER: 22.11.1
diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 7fedfd6515..dcdfa55741 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -127,3 +127,80 @@ m4_define([OVS_DPDK_STOP_TESTPMD],
   [AT_CHECK([kill `cat testpmd.pid`])
OVS_WAIT([kill -0 `cat testpmd.pid`], [kill -9 `cat testpmd.pid`])
 ])
+
+
+# OVS_TRAFFIC_VSWITCHD_START([vsctl-args], [vsctl-output], [dbinit-aux-args])
+#
+# Creates a database and starts ovsdb-server, starts ovs-vswitchd
+# connected to that database, calls ovs-vsctl to create a bridge named
+# br0 with predictable settings, passing 'vsctl-args' as additional
+# commands to ovs-vsctl.  If 'vsctl-args' causes ovs-vsctl to provide
+# output (e.g. because it includes "create" commands) then 'vsctl-output'
+# specifies the expected output after filtering through uuidfilt.
+# 'dbinit-aux-args' are passed as additional commands to 'ovs-vsctl init'
+# before starting ovs-vswitchd.
+m4_define([OVS_TRAFFIC_VSWITCHD_START],
+  [
+   OVS_DPDK_PRE_CHECK()
+   OVS_WAIT_WHILE([ip link show ovs-netdev])
+   dnl For functional tests, no need for DPDK PCI probing.
+   OVS_DPDK_START([--no-pci], [--disable-system], [$3])
+   dnl Add bridges, ports, etc.
+   OVS_WAIT_WHILE([ip link show br0])
+   AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| 
uuidfilt])], [0], [$2])
+])
+
+
+# OVS_TRAFFIC_VSWITCHD_STOP([ALLOWLIST], [extra_cmds])
+#
+# Gracefully stops ovs-vswitchd and ovsdb-server, checking their log files
+# for messages with severity WARN or higher and signaling an error if any
+# is present.  The optional ALLOWLIST may contain shell-quoted "sed"
+# commands to delete any warnings that are actually expected, e.g.:
+#
+#   OVS_TRAFFIC_VSWITCHD_STOP(["/expected error/d"])
+#
+# 'extra_cmds' are shell commands to be executed after OVS_VSWITCHD_STOP() is
+# invoked. They can be used to perform additional cleanups such as name space
+# removal.
+m4_define([OVS_TRAFFIC_VSWITCHD_STOP],
+  [OVS_DPDK_STOP_VSWITCHD([$1])
+   AT_CHECK([:; $2])
+])
+
+
+# Plug a veth into OVS via DPDK net/af_xdp.
+m4_define([ADD_VETH],
+[ AT_CHECK([ip link add $1 type veth peer name ovs-$1 || return 77])
+  CONFIGURE_VETH_OFFLOADS([$1])
+  AT_CHECK([ip link set $1 netns $2])
+  AT_CHECK([ip link set dev ovs-$1 up])
+  AT_CHECK([ovs-vsctl add-port $3 ovs-$1 -- \
+set interface ovs-$1 external-ids:iface-id="$1" -- \
+set interface ovs-$1 type=dpdk -- \
+set interface ovs-$1 
options:dpd

[ovs-dev] [PATCH v9 09/11] system-dpdk: Refactor tests using vhost-user ports.

2023-11-20 Thread David Marchand
Introduce macros responsible for adding a vhost-user / vhost-user client
port to a userspace datapath bridge and check associated logs.
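
As an illustration only (not part of this patch), checking that a port is
ready "by looking at the logs" amounts to polling the log file until the
expected message shows up. The sketch below assumes a hypothetical
wait_for_log shell helper; the macros in the patch rely on OVS_WAIT_UNTIL
from the test suite instead.

```shell
#!/bin/sh
# Hypothetical helper (not part of the OVS test suite): poll a log file
# until a fixed string shows up, or give up after a number of attempts.
wait_for_log() {
    pattern=$1
    logfile=$2
    attempts=${3:-10}
    while :; do
        # -F: treat the pattern as a fixed string; -q: print nothing.
        grep -F -q -- "$pattern" "$logfile" 2>/dev/null && return 0
        attempts=$((attempts - 1))
        [ "$attempts" -gt 0 ] || return 1
        sleep 1
    done
}
```

For example, wait_for_log "binding succeeded" ovs-vswitchd.log 30 would
retry for roughly 30 seconds before reporting a failure.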

Signed-off-by: David Marchand 
---
Changes since v8:
- added this patch to avoid issues like what is fixed by the previous
  patch in this series,

---
 tests/system-dpdk.at | 123 +--
 1 file changed, 36 insertions(+), 87 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index b7ae4af406..4da2afd683 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -18,6 +18,29 @@ m4_define([CHECK_MEMPOOL_PARAM], [
| grep "User configured shared mempool set for: MTU $1, 
NUMA $2."])
 ])
 
+dnl ADD_VHOST_USER_CLIENT_PORT([bridge], [port], [socket])
+dnl Add a dpdk vhost-user client port to a bridge and check this port is ready
+dnl to be used by looking at the logs.
+m4_define([ADD_VHOST_USER_CLIENT_PORT], [
+AT_CHECK([ovs-vsctl add-port $1 $2 -- \
+  set Interface $2 type=dpdkvhostuserclient 
options:vhost-server-path=$3],
+ [], [stdout], [stderr])
+OVS_WAIT_UNTIL([grep "VHOST_CONFIG: ($3) vhost-user client: socket 
created" ovs-vswitchd.log])
+OVS_WAIT_UNTIL([grep "vHost User device '$2' created in 'client' mode, 
using client socket" ovs-vswitchd.log])
+OVS_WAIT_UNTIL([grep "VHOST_CONFIG: ($3) reconnecting..." 
ovs-vswitchd.log])
+])
+
+dnl ADD_VHOST_USER_PORT([bridge], [port], [socket])
+dnl Add a dpdk vhost-user port to a bridge and check this port is ready
+dnl to be used by looking at the logs.
+m4_define([ADD_VHOST_USER_PORT], [
+AT_CHECK([ovs-vsctl add-port $1 $2 -- set Interface $2 
type=dpdkvhostuser], [],
+ [stdout], [stderr])
+OVS_WAIT_UNTIL([grep "VHOST_CONFIG: ($3) vhost-user server: socket 
created" ovs-vswitchd.log])
+OVS_WAIT_UNTIL([grep "Socket $3 created for vhost-user port $2" 
ovs-vswitchd.log])
+OVS_WAIT_UNTIL([grep "VHOST_CONFIG: ($3) binding succeeded" 
ovs-vswitchd.log])
+])
+
 dnl --
 dnl Check if EAL init is successful
 AT_SETUP([OVS-DPDK - EAL init])
@@ -64,15 +87,10 @@ OVS_DPDK_START([--no-pci])
 
 dnl Add userspace bridge and attach it to OVS
 AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
-AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuserclient0 -- set Interface 
dpdkvhostuserclient0 type=dpdkvhostuserclient 
options:vhost-server-path=$OVS_RUNDIR/dpdkvhostclient0], [], [stdout], [stderr])
+ADD_VHOST_USER_CLIENT_PORT([br10], [dpdkvhostuserclient0], 
[$OVS_RUNDIR/dpdkvhostclient0])
 AT_CHECK([ovs-vsctl show], [], [stdout])
 sleep 2
 
-dnl Parse log file
-AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user 
client: socket created" ovs-vswitchd.log], [], [stdout])
-AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' 
mode, using client socket" ovs-vswitchd.log], [], [stdout])
-AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout])
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
 OVS_DPDK_STOP_VSWITCHD(["dnl
@@ -92,19 +110,9 @@ OVS_DPDK_START([--no-pci])
 
 dnl Add userspace bridge and attach it to OVS
 AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
-AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuser0 -- set Interface 
dpdkvhostuser0 \
-  type=dpdkvhostuser], [],
- [stdout], [stderr])
+ADD_VHOST_USER_PORT([br10], [dpdkvhostuser0], [$OVS_RUNDIR/dpdkvhostuser0])
 AT_CHECK([ovs-vsctl show], [], [stdout])
 
-dnl Parse log file
-AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostuser0) vhost-user server: 
socket created" \
-  ovs-vswitchd.log], [], [stdout])
-AT_CHECK([grep "Socket $OVS_RUNDIR/dpdkvhostuser0 created for vhost-user port 
dpdkvhostuser0" \
-  ovs-vswitchd.log], [], [stdout])
-AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostuser0) binding succeeded" 
ovs-vswitchd.log], [],
- [stdout])
-
 dnl Set up namespaces
 ADD_NAMESPACES(ns1, ns2)
 
@@ -158,18 +166,9 @@ OVS_DPDK_START([--no-pci])
 
 dnl Add userspace bridge and attach it to OVS
 AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
-AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuserclient0 -- set Interface \
-  dpdkvhostuserclient0 \
-  type=dpdkvhostuserclient \
-  options:vhost-server-path=$OVS_RUNDIR/dpdkvhostclient0], [],
- [stdout], [stderr])
+ADD_VHOST_USER_CLIENT_PORT([br10], [dpdkvhostuserclient0], 
[$OVS_RUNDIR/dpdkvhostclient0])
 AT_CHECK([ovs-vsctl show], [], [stdout])
 
-dnl Parse log file
-AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user 
client: socket created" ovs-vswitchd.log], [], [stdout])
-AT

[ovs-dev] [PATCH v9 08/11] system-dpdk: Fix race in some vhost-user client MTU test.

2023-11-20 Thread David Marchand
Adding those grep checks gives OVS enough time to be ready to connect with
the testpmd virtio-user port instantiated afterward.

Fixes: bf47829116a8 ("tests: Add OVS-DPDK MTU unit tests.")
Signed-off-by: David Marchand 
---
Changes since v8:
- added this patch for random failures observed on my laptop,

---
 tests/system-dpdk.at | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index c125d8051f..b7ae4af406 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -813,6 +813,10 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 
mtu_request=9702])
 AT_CHECK([ovs-vsctl show], [], [stdout])
 sleep 2
 
+AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) vhost-user 
client: socket created" ovs-vswitchd.log], [], [stdout])
+AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' 
mode, using client socket" ovs-vswitchd.log], [], [stdout])
+AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout])
+
 
OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v9 10/11] system-dpdk: Rework cleanup for vhost-user client tests.

2023-11-20 Thread David Marchand
Those tests are subject to a race when a testpmd hosting the vhost-user
server is stopped and OVS has enough time to detect the vhost-user socket
drop and tries to reconnect to this socket.

In such a situation, the tests can fail as the OVS process with the
vhost-user client port complains with a warning log:

2023-09-08T13:15:18.160Z|00163|dpdk|INFO|VHOST_CONFIG:
(.../005/dpdkvhostclient0) vhost peer closed
2023-09-08T13:15:18.160Z|00164|netdev_dpdk|INFO|vHost Device
'.../005/dpdkvhostclient0' connection has been destroyed
2023-09-08T13:15:18.160Z|00165|dpdk|INFO|VHOST_CONFIG:
(.../005/dpdkvhostclient0) vhost-user client: socket created, fd: 24
2023-09-08T13:15:18.160Z|00166|dpdk|WARN|VHOST_CONFIG:
(.../005/dpdkvhostclient0) failed to connect: Connection refused
2023-09-08T13:15:18.160Z|00167|dpdk|INFO|VHOST_CONFIG:
(.../005/dpdkvhostclient0) reconnecting...

Invert the order of the cleanup steps: delete the vhost-user client port
first, then stop testpmd.

Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
Changes since v6:
- added this fix for spurious failures hit by Eelco,

---
 tests/system-dpdk.at | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 4da2afd683..5e486d1f47 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -221,10 +221,9 @@ AT_CHECK([test `ovs-vsctl get interface 
dpdkvhostuserclient0 statistics:tx_bytes
$((`ovs-vsctl get interface dpdkvhostuserclient0 
statistics:tx_q0_good_bytes` + dnl
   `ovs-vsctl get interface dpdkvhostuserclient0 
statistics:tx_q1_good_bytes`))])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) recvmsg failed/d
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d
@@ -619,10 +618,9 @@ AT_CHECK([ovs-vsctl get Interface dpdkvhostuserclient0 
mtu], [0], [dnl
 9000
 ])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d"])
 AT_CLEANUP
@@ -662,10 +660,9 @@ AT_CHECK([ovs-vsctl get Interface dpdkvhostuserclient0 
mtu], [0], [dnl
 2000
 ])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d"])
 AT_CLEANUP
@@ -785,10 +782,9 @@ dnl Set MTU value above upper bound and check for error
 AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=9711])
 AT_CHECK([grep "dpdkvhostuserclient0: unsupported MTU 9711" ovs-vswitchd.log], 
[], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d
 /dpdkvhostuserclient0: unsupported MTU 9711/d
@@ -828,10 +824,9 @@ dnl Set MTU value below lower bound and check for error
 AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=67])
 AT_CHECK([grep "dpdkvhostuserclient0: unsupported MTU 67" ovs-vswitchd.log], 
[], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d
 /dpdkvhostuserclient0: unsupported MTU 67/d
-- 
2.41.0


[ovs-dev] [PATCH v9 07/11] system-dpdk: Remove tap interfaces from vport MTU tests.

2023-11-20 Thread David Marchand
The unit tests for changing MTU with vhost-user ports are not using
those tap interfaces.

Signed-off-by: David Marchand 
---
Changes since v7:
- added this patch after getting regressions in Intel CI because of the
  next patch,

---
 tests/system-dpdk.at | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index f635d7f925..c125d8051f 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -640,9 +640,7 @@ AT_CHECK([grep "VHOST_CONFIG: 
($OVS_RUNDIR/dpdkvhostclient0) vhost-user client:
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' 
mode, using client socket" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"
 \
---vdev="net_tap0,iface=tap0"])
-
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 OVS_WAIT_UNTIL([ovs-vsctl get Interface dpdkvhostuserclient0 link_state | grep 
-w up])
 
@@ -690,9 +688,7 @@ AT_CHECK([grep "VHOST_CONFIG: 
($OVS_RUNDIR/dpdkvhostclient0) vhost-user client:
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' 
mode, using client socket" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"
 \
---vdev="net_tap0,iface=tap0"])
-
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 OVS_WAIT_UNTIL([ovs-vsctl get Interface dpdkvhostuserclient0 link_state | grep 
-w up])
 
@@ -817,8 +813,7 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 
mtu_request=9702])
 AT_CHECK([ovs-vsctl show], [], [stdout])
 sleep 2
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"
 \
---vdev="net_tap0,iface=tap0"])
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 OVS_WAIT_UNTIL([ovs-vsctl get Interface dpdkvhostuserclient0 link_state | grep 
-w up])
@@ -866,8 +861,7 @@ AT_CHECK([grep "VHOST_CONFIG: 
($OVS_RUNDIR/dpdkvhostclient0) vhost-user client:
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' 
mode, using client socket" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"
 \
---vdev="net_tap0,iface=tap0"])
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 OVS_WAIT_UNTIL([ovs-vsctl get Interface dpdkvhostuserclient0 link_state | grep 
-w up])
-- 
2.41.0


[ovs-dev] [PATCH v9 05/11] system-dpdk: Refactor OVS daemons helpers.

2023-11-20 Thread David Marchand
Align system-dpdk existing helpers to other common OVS helpers so they
can accept some optional arguments.

Introduce an OVS_DPDK_STOP_VSWITCHD wrapper around OVS_VSWITCHD_STOP to
catch dpdk related logs in a centralised fashion.
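
To make the idea concrete, here is a rough sketch of what such a
centralised filter does: expected warnings are deleted with sed "d"
commands, and anything of severity WARN or higher that is left counts as
a failure. The unexpected_warnings function is a hypothetical name used
for illustration, and the log line layout it greps for is an assumption;
only the two common patterns are taken from the patch.

```shell
#!/bin/sh
# Hypothetical sketch, not the OVS macro: delete expected warnings with
# sed 'd' commands, then print whatever WARN-or-worse lines remain.
# Returns 0 when unexpected lines are left, 1 when the log is clean.
unexpected_warnings() {
    logfile=$1
    extra=$2    # caller-specific sed commands, may be empty

    # Patterns shared by all tests (two entries taken from the patch).
    common='/Failed to enable flow control/d
/Rx checksum offload is not supported on/d'

    if [ -n "$extra" ]; then
        script="$extra
$common"
    else
        script=$common
    fi

    # Assumed log layout: fields separated by '|', severity in one field.
    sed -e "$script" "$logfile" | grep -E '\|(WARN|ERR|EMER)\|'
}
```

A test that expects an extra warning passes its own "d" command as the
second argument, the same way the tests pass a pattern to
OVS_DPDK_STOP_VSWITCHD.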

Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
Changes since v6:
- did a minor cleanup on vhost-user client test log pattern (one entry
  concerned vhost-user ports, not vhost-user *client* ports),

---
 tests/system-dpdk-macros.at |  21 -
 tests/system-dpdk.at| 158 +++-
 2 files changed, 82 insertions(+), 97 deletions(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 35d14bee8f..7fedfd6515 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -36,12 +36,13 @@ m4_define([OVS_DPDK_PRE_PHY_SKIP],
 #
 m4_define([OVS_DPDK_START],
   [dnl start ovs dpdk
-   OVS_DPDK_START_OVSDB()
+   OVS_DPDK_START_OVSDB($3)
dnl Enable DPDK functionality
AT_CHECK([ovs-vsctl --no-wait set Open_vSwitch . 
other_config:dpdk-init=true])
-   OVS_DPDK_START_VSWITCHD($1)
+   OVS_DPDK_START_VSWITCHD([$1], [$2])
 ])
 
+
 # OVS_DPDK_START_OVSDB()
 #
 # Create an empty database and start ovsdb-server.
@@ -60,9 +61,10 @@ m4_define([OVS_DPDK_START_OVSDB],
AT_CAPTURE_FILE([ovsdb-server.log])
 
dnl Initialize database.
-   AT_CHECK([ovs-vsctl --no-wait init])
+   AT_CHECK([ovs-vsctl --no-wait init $1])
 ])
 
+
 # OVS_DPDK_START_VSWITCHD()
 #
 # Add special configuration for dpdk-init. Start ovs-vswitchd.
@@ -72,12 +74,23 @@ m4_define([OVS_DPDK_START_VSWITCHD],
AT_CHECK([ovs-vsctl --no-wait set Open_vSwitch . 
other_config:dpdk-extra="--log-level=pmd.*:error $1"])
 
dnl Start ovs-vswitchd.
-   AT_CHECK([ovs-vswitchd --detach --no-chdir --pidfile --log-file -vvconn 
-vofproto_dpif -vunixctl], [0], [stdout], [stderr])
+   AT_CHECK([ovs-vswitchd $2 --detach --no-chdir --pidfile --log-file -vvconn 
-vofproto_dpif -vunixctl], [0], [stdout], [stderr])
AT_CAPTURE_FILE([ovs-vswitchd.log])
on_exit "kill_ovs_vswitchd `cat ovs-vswitchd.pid`"
 ])
 
 
+m4_define([OVS_DPDK_STOP_VSWITCHD],
+  [OVS_VSWITCHD_STOP([dnl
+$1";/does not exist. The Open vSwitch kernel module is probably not loaded./d
+/does not support MTU configuration,/d
+/EAL: No \(available\|free\) .*hugepages reported/d
+/Failed to enable flow control/d
+/Rx checksum offload is not supported on/d
+/TELEMETRY: No legacy callbacks, legacy socket not created/d"])
+])
+
+
 # OVS_DPDK_CHECK_TESTPMD()
 #
 # Check dpdk-testpmd availability.
diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index ab232e06da..f635d7f925 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -3,15 +3,6 @@ m4_define([CONFIGURE_VETH_OFFLOADS],
 
 AT_BANNER([OVS-DPDK unit tests])
 
-m4_define([SYSTEM_DPDK_ALLOWED_LOGS],[
-\@does not exist. The Open vSwitch kernel module is probably not loaded.@d
-\@does not support MTU configuration,@d
-\@EAL: No \(available\|free\) .*hugepages reported@d
-\@Failed to enable flow control@d
-\@Rx checksum offload is not supported on@d
-\@TELEMETRY: No legacy callbacks, legacy socket not created@d
-])
-
 dnl CHECK_MEMPOOL_PARAM([mtu], [numa], [+line])
 dnl
 dnl Waits for logs to indicate that the user has configured a mempool
@@ -36,7 +27,7 @@ OVS_DPDK_START([--no-pci])
 AT_CHECK([grep "DPDK Enabled - initializing..." ovs-vswitchd.log], [], 
[stdout])
 AT_CHECK([grep "EAL" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "DPDK Enabled - initialized" ovs-vswitchd.log], [], [stdout])
-OVS_VSWITCHD_STOP("[SYSTEM_DPDK_ALLOWED_LOGS]")
+OVS_DPDK_STOP_VSWITCHD
 AT_CLEANUP
 dnl --
 
@@ -58,7 +49,7 @@ sleep 2
 
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 phy0], [], [stdout], [stderr])
-OVS_VSWITCHD_STOP("[SYSTEM_DPDK_ALLOWED_LOGS]")
+OVS_DPDK_STOP_VSWITCHD
 AT_CLEANUP
 dnl --
 
@@ -84,9 +75,8 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) 
reconnecting..." ov
 
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
-OVS_VSWITCHD_STOP("m4_join([], [SYSTEM_DPDK_ALLOWED_LOGS], [
-\@VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) failed to connect: No such file 
or directory@d
-])")
+OVS_DPDK_STOP_VSWITCHD(["dnl
+/VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d"])
 AT_CLEANUP
 dnl --
 
@@ -150,12 +140,11 @@ OVS_WAIT_UNTIL([grep "vHost Device 
'$OVS_RUNDIR/dpdkvhostuser0' has been removed
 
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuser0], [], [stdout], [stderr])
-OVS_VSWITCHD_STOP("m4_join([], [SYSTEM_DPDK_ALLOWED_LOGS], [
-\@VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostuser0) recvmsg

[ovs-dev] [PATCH v9 06/11] netdev-afxdp: Postpone libbpf logging helper registration.

2023-11-20 Thread David Marchand
When using net/af_xdp DPDK driver along OVS native AF_XDP support,
confusing logs are reported, like:

netdev_dpdk|INFO|Device 'net_af_xdpp0,iface=ovs-p0' attached to DPDK
dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 11 created.
dpif_netdev|INFO|There are 1 pmd threads on numa node 0
dpdk|INFO|Device with port_id=0 already stopped
dpdk(pmd-c11/id:22)|INFO|PMD thread uses DPDK lcore 1.
netdev_dpdk|WARN|Rx checksum offload is not supported on port 0
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(6)
.xdp_run_config
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7)
xdp_metadata
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7)
xdp_metadata
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7)
xdp_metadata

This comes from the fact that netdev-afxdp unconditionally registers a
helper for logging libbpf messages.
Making both net/af_xdp and netdev-afxdp work at the same time seems
difficult, so at least, ensure that netdev-afxdp won't register this
helper unless a netdev is actually allocated.

Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
 lib/netdev-afxdp.c | 12 ++--
 lib/netdev-afxdp.h |  1 -
 lib/netdev-linux.c |  1 -
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index b680a14798..54029722e0 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -1193,18 +1193,18 @@ libbpf_print(enum libbpf_print_level level,
 return 0;
 }
 
-int netdev_afxdp_init(void)
-{
-libbpf_set_print(libbpf_print);
-return 0;
-}
-
 int
 netdev_afxdp_construct(struct netdev *netdev)
 {
+static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
 struct netdev_linux *dev = netdev_linux_cast(netdev);
 int ret;
 
+if (ovsthread_once_start(&once)) {
+libbpf_set_print(libbpf_print);
+ovsthread_once_done(&once);
+}
+
 /* Configure common netdev-linux first. */
 ret = netdev_linux_construct(netdev);
 if (ret) {
diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h
index bd3b9dfbea..236a37cc84 100644
--- a/lib/netdev-afxdp.h
+++ b/lib/netdev-afxdp.h
@@ -47,7 +47,6 @@ struct xsk_socket_info;
 
 int netdev_afxdp_rxq_construct(struct netdev_rxq *rxq_);
 void netdev_afxdp_rxq_destruct(struct netdev_rxq *rxq_);
-int netdev_afxdp_init(void);
 int netdev_afxdp_construct(struct netdev *netdev_);
 void netdev_afxdp_destruct(struct netdev *netdev_);
 int netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int mtu);
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 70521e3c7f..18b62d90c2 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -3754,7 +3754,6 @@ const struct netdev_class netdev_internal_class = {
 
 #ifdef HAVE_AF_XDP
 #define NETDEV_AFXDP_CLASS_COMMON   \
-.init = netdev_afxdp_init,  \
 .construct = netdev_afxdp_construct,\
 .destruct = netdev_afxdp_destruct,  \
 .get_stats = netdev_afxdp_get_stats,\
-- 
2.41.0


[ovs-dev] [PATCH v9 04/11] tests: Define a macro to skip tc relying tests.

2023-11-20 Thread David Marchand
Some unit tests expect that an OVS port has an associated netdevice on
which they can hook tc.
This will not be possible when testing the userspace datapath with DPDK.
Introduce a helper (which will be overridden in system-dpdk tests) and
use it in the existing tests.

Acked-by: Aaron Conole 
Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
Changes since v4:
- as the traffic tests now use net/af_xdp, it is not required to skip
  tests relying on tcpdump anymore,

---
 tests/system-common-macros.at| 6 ++
 tests/system-offloads-traffic.at | 6 +++---
 tests/system-traffic.at  | 6 +++---
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 0077a8609c..0113aae8bd 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -297,6 +297,12 @@ m4_define([OVS_START_L7],
 #
 m4_define([OFPROTO_CLEAR_DURATION_IDLE], [[sed -e 
's/duration=.*s,/duration=,/g' -e 
's/idle_age=[0-9]*,/idle_age=,/g']])
 
+# OVS_CHECK_TC_QDISC()
+#
+# Macro to skip tests when tc qdisc can't be applied on a OVS port.
+m4_define([OVS_CHECK_TC_QDISC],
+[AT_SKIP_IF([test $HAVE_TC = no])])
+
 # OVS_CHECK_TUNNEL_TSO()
 #
 # Macro to be used in general tunneling tests that could be also
diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index 5ad6b4bfdf..0bedee7530 100644
--- a/tests/system-offloads-traffic.at
+++ b/tests/system-offloads-traffic.at
@@ -20,7 +20,7 @@ m4_define([OVS_CHECK_ACTIONS], [
 
 m4_define([CHECK_TC_INGRESS_PPS],
 [
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 AT_CHECK([ip link add ovs_tc_pps0 type veth peer name ovs_tc_pps1 dnl
   || exit 77])
 on_exit 'ip link del ovs_tc_pps0'
@@ -95,7 +95,7 @@ AT_CLEANUP
 
 AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - 
offloads disabled])
 AT_KEYWORDS([ingress_policing])
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START()
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-offload=false])
 AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
@@ -118,7 +118,7 @@ AT_CLEANUP
 
 AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - 
offloads enabled])
 AT_KEYWORDS([ingress_policing])
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . 
other_config:hw-offload=true])
 AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
 ADD_NAMESPACES(at_ns0)
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 7ea4502028..a7d4ed83bd 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2321,7 +2321,7 @@ AT_CLEANUP
 AT_BANNER([QoS])
 
 AT_SETUP([QoS - basic configuration])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START()
 
 ADD_NAMESPACES(at_ns0, at_ns1)
@@ -2355,7 +2355,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
 AT_SETUP([QoS - 64bit])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 AT_SKIP_IF([test $HAVE_TCA_HTB_RATE64 = no])
 OVS_TRAFFIC_VSWITCHD_START()
 
@@ -2383,7 +2383,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
 AT_SETUP([Ingress Policing - 64-bit])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 AT_SKIP_IF([test $HAVE_TCA_POLICE_PKTRATE64 = no])
 OVS_TRAFFIC_VSWITCHD_START()
 ADD_NAMESPACES(ns0)
-- 
2.41.0


[ovs-dev] [PATCH v9 02/11] system-dpdk: Don't require hugetlbfs.

2023-11-20 Thread David Marchand
dpdk-testpmd does not need hugetlbfs backing as we don't require
multiprocess support in OVS unit tests.

Switch to --in-memory and remove the (then unneeded) check on
hugetlbfs presence.
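
For readers unfamiliar with the context lines of the macro: the initial
memory reservation is passed as one --socket-mem entry per NUMA node. The
sketch below rebuilds that comma-separated value in plain shell; the
socket_mem function name is hypothetical, and the 512 per node figure is
simply copied from the existing awk one-liner.

```shell
#!/bin/sh
# Hypothetical sketch: build the --socket-mem value for a given number
# of NUMA nodes, reserving 512 (MB) on each node.
socket_mem() {
    nodes=$1
    mem=512
    i=1
    # Append one more ",512" for every node past the first one.
    while [ "$i" -lt "$nodes" ]; do
        mem="$mem,512"
        i=$((i + 1))
    done
    printf '%s\n' "$mem"
}
```

The result would then be used as --socket-mem="$(socket_mem 2)" next to
--in-memory.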

Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
Signed-off-by: David Marchand 
---
Changes since v7:
- reverted use of dynamic allocations and kept initial memory
  reservation: this avoids random failures when testpmd and ovs-dpdk
  try to allocate memory at the same time,

---
 tests/system-dpdk-macros.at | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 2cfd26d840..a176a57a4b 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -7,9 +7,6 @@ m4_define([OVS_DPDK_PRE_CHECK],
   [dnl Check Hugepages
AT_CHECK([cat /proc/meminfo], [], [stdout])
AT_SKIP_IF([grep -E 'HugePages_Free: *0' stdout], [], [stdout])
-   AT_CHECK([mount], [], [stdout])
-   AT_CHECK([grep 'hugetlbfs' stdout], [], [stdout], [])
-
 ])
 
 
@@ -97,7 +94,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
 m4_define([OVS_DPDK_START_TESTPMD],
   [AT_CHECK([lscpu], [], [stdout])
AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
-   eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 
--single-file-segments --no-pci"
+   eal_options="--in-memory --socket-mem="$(cat NUMA_NODE)" 
--single-file-segments --no-pci"
options="$1"
test "$options" != "${options%% -- *}" || options="$options -- "
eal_options="$eal_options ${options%% -- *}"
-- 
2.41.0


[ovs-dev] [PATCH v9 03/11] ci: Run DPDK tests in GitHub Actions.

2023-11-20 Thread David Marchand
Let's enhance our coverage in the CI and run DPDK system tests.

A few DPDK drivers are enabled in DPDK compilation.

Put DPDK build in $PATH for dpdk-testpmd to be available.
sudo drops PATH= updates and -E alone does not seem to preserve this
variable.
Pass PATH=$PATH when running the tests, as a workaround.
Since those tests are run as root, the collection of logs is updated
accordingly.

In GHA, only two cores are available but some tests rely on testpmd using
three lcores.
Add a DPDK_EAL_OPTIONS environment variable and use it to map all
testpmd lcores to core 1 (and leave core 0 alone for OVS main and PMD
threads).
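
The build script decides per testsuite whether hugepages, the lcore
mapping and root privileges are needed, using a parameter expansion. A
minimal sketch of that check, with needs_dpdk_setup being a hypothetical
name that this patch does not add:

```shell
#!/bin/sh
# Hypothetical sketch of the per-testsuite check.
needs_dpdk_setup() {
    testsuite=$1
    # Stripping the longest prefix that ends in "dpdk" changes the
    # string only when "dpdk" appears somewhere in the name.
    [ "${testsuite##*dpdk}" != "$testsuite" ]
}
```

With this, needs_dpdk_setup check-dpdk succeeds while needs_dpdk_setup
check fails, so the "normal" checks keep running unprivileged.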

Signed-off-by: David Marchand 
Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
---
Changes since v1:
- rebased after DPDK build has been moved out of linux-build.sh,
- restored running "normal" checks in the DPDK jobs,

---
 .ci/dpdk-build.sh|  7 ---
 .ci/linux-build.sh   | 15 ++-
 .github/workflows/build-and-test.yml |  7 ---
 tests/system-dpdk-macros.at  |  2 +-
 4 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index 02dcefef61..35540f0694 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -35,9 +35,10 @@ function build_dpdk()
 DPDK_OPTS="$DPDK_OPTS -Ddeveloper_mode=disabled"
 
 # OVS compilation and "normal" unit tests (run in the CI) do not depend on
-# any DPDK driver being present.
-# We can disable all drivers to save compilation time.
-DPDK_OPTS="$DPDK_OPTS -Ddisable_drivers=*/*"
+# any DPDK driver.
+# check-dpdk unit tests requires testpmd and some net/ driver.
+DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd"
+DPDK_OPTS="$DPDK_OPTS -Denable_drivers=net/null,net/tap,net/virtio"
 
 # Install DPDK using prefix.
 DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build"
diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 8227a57487..aa2ecc5050 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -22,6 +22,9 @@ function install_dpdk()
 # Export the following path for pkg-config to find the .pc file.
 export PKG_CONFIG_PATH=$DPDK_LIB/pkgconfig/:$PKG_CONFIG_PATH
 
+# Expose dpdk binaries.
+export PATH=$(pwd)/dpdk-dir/build/bin:$PATH
+
 if [ ! -f "${VERSION_FILE}" ]; then
 echo "Could not find DPDK in $(pwd)/dpdk-dir"
 return 1
@@ -113,7 +116,7 @@ fi
 
 OPTS="${EXTRA_OPTS} ${OPTS} $*"
 
-if [ "$TESTSUITE" ]; then
+if [ "$TESTSUITE" = 'test' ]; then
 # 'distcheck' will reconfigure with required options.
 # Now we only need to prepare the Makefile without sparse-wrapped CC.
 configure_ovs
@@ -123,6 +126,16 @@ if [ "$TESTSUITE" ]; then
 TESTSUITEFLAGS=-j4 RECHECK=yes
 else
 build_ovs
+for testsuite in $TESTSUITE; do
+run_as_root=
+if [ "${testsuite##*dpdk}" != "$testsuite" ]; then
+sudo sh -c 'echo 1024 > /proc/sys/vm/nr_hugepages' || true
+[ "$(cat /proc/sys/vm/nr_hugepages)" = '1024' ]
+export DPDK_EAL_OPTIONS="--lcores 0@1,1@1,2@1"
+run_as_root="sudo -E PATH=$PATH"
+fi
+$run_as_root make $testsuite TESTSUITEFLAGS=-j4 RECHECK=yes
+done
 fi
 
 exit 0
diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
index bc5494e863..4f62efb7c3 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -123,10 +123,10 @@ jobs:
 opts: --enable-shared
 
   - compiler: gcc
-testsuite:test
+testsuite:check check-dpdk
 dpdk: dpdk
   - compiler: clang
-testsuite:test
+testsuite:check check-dpdk
 dpdk: dpdk
 
   - compiler: gcc
@@ -213,7 +213,8 @@ jobs:
 mkdir logs
 cp config.log ./logs/
 cp -r ./*/_build/sub/tests/testsuite.* ./logs/ || true
-tar -czvf logs.tgz logs/
+sudo cp -r ./tests/*testsuite.* ./logs/ || true
+sudo tar -czvf logs.tgz logs/
 
 - name: upload logs on failure
   if: failure() || cancelled()
diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index a176a57a4b..35d14bee8f 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -94,7 +94,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
 m4_define([OVS_DPDK_START_TESTPMD],
   [AT_CHECK([lscpu], [], [stdout])
AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
-   eal_options="--in-memory --socket-mem="$(cat NUMA_NODE)" 
--single-file-segments --no-pci"
+   eal_options=&qu

[ovs-dev] [PATCH v9 01/11] system-dpdk: Introduce helpers for testpmd.

2023-11-20 Thread David Marchand
Rather than copy/paste everywhere, introduce helpers to control
testpmd runs.
Rely on --stats-period (which outputs port stats every n seconds) so that
testpmd keeps running without expecting any user input.
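
To illustrate how the start helper separates EAL options from testpmd options around " -- ", here is a small standalone sketch in plain shell. The function name split_testpmd_args, the variable names eal_part and testpmd_part, and the sample arguments (including the placeholder --some-testpmd-opt) are hypothetical; only the parameter expansions are taken from OVS_DPDK_START_TESTPMD in this patch.

```shell
# Hypothetical standalone helper, not part of the patch.
split_testpmd_args() {
    options="$1"
    # No " -- " separator present: append one so both expansions below work.
    test "$options" != "${options%% -- *}" || options="$options -- "
    # Everything before the first " -- " is meant for the EAL.
    eal_part="${options%% -- *}"
    # Everything after the first " -- " is meant for testpmd itself.
    testpmd_part="${options#* -- }"
}

# EAL options only: the testpmd part ends up empty.
split_testpmd_args '--vdev=net_tap0,iface=tap0'
printf 'eal=[%s] testpmd=[%s]\n' "$eal_part" "$testpmd_part"

# Both parts given, separated by " -- ".
split_testpmd_args '--vdev=net_tap0,iface=tap0 -- --some-testpmd-opt'
printf 'eal=[%s] testpmd=[%s]\n' "$eal_part" "$testpmd_part"
```

Appending the separator when it is missing keeps the two expansions unconditional, so a caller that passes only EAL options does not have its arguments duplicated into the testpmd part.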

Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
Signed-off-by: David Marchand 
---
Changes since v7:
- fixed -- detection in arguments passed to OVS_DPDK_START_TESTPMD,
- wrote the testpmd command to a debug file,

Changes since v1:
- fixed evaluation of the arguments passed to OVS_DPDK_START_TESTPMD: $@ -> $1,

---
 tests/system-dpdk-macros.at |  38 +
 tests/system-dpdk.at| 103 +---
 2 files changed, 62 insertions(+), 79 deletions(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 3920f08a5e..2cfd26d840 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -79,3 +79,41 @@ m4_define([OVS_DPDK_START_VSWITCHD],
AT_CAPTURE_FILE([ovs-vswitchd.log])
on_exit "kill_ovs_vswitchd `cat ovs-vswitchd.pid`"
 ])
+
+
+# OVS_DPDK_CHECK_TESTPMD()
+#
+# Check dpdk-testpmd availability.
+#
+m4_define([OVS_DPDK_CHECK_TESTPMD],
+  [AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
+])
+
+
+# OVS_DPDK_START_TESTPMD()
+#
+# Start dpdk-testpmd in background.
+#
+m4_define([OVS_DPDK_START_TESTPMD],
+  [AT_CHECK([lscpu], [], [stdout])
+   AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
+   eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 
--single-file-segments --no-pci"
+   options="$1"
+   test "$options" != "${options%% -- *}" || options="$options -- "
+   eal_options="$eal_options ${options%% -- *}"
+   testpmd_options="-a --stats-period 2 ${options#* -- }"
+   echo "dpdk-testpmd $eal_options -- $testpmd_options" >testpmd.cmd
+   dpdk-testpmd $eal_options -- $testpmd_options >testpmd.log 2>&1 & \
+   echo $! > testpmd.pid
+   on_exit "kill -9 `cat testpmd.pid`"
+])
+
+
+# OVS_DPDK_STOP_TESTPMD()
+#
+# Stop background dpdk-testpmd.
+#
+m4_define([OVS_DPDK_STOP_TESTPMD],
+  [AT_CHECK([kill `cat testpmd.pid`])
+   OVS_WAIT([kill -0 `cat testpmd.pid`], [kill -9 `cat testpmd.pid`])
+])
diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index fd42aed0b3..ab232e06da 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -97,13 +97,9 @@ dnl Ping vhost-user port
 AT_SETUP([OVS-DPDK - ping vhost-user ports])
 AT_KEYWORDS([dpdk])
 OVS_DPDK_PRE_CHECK()
-AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
+OVS_DPDK_CHECK_TESTPMD()
 OVS_DPDK_START([--no-pci])
 
-dnl Find number of sockets
-AT_CHECK([lscpu], [], [stdout])
-AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
-
 dnl Add userspace bridge and attach it to OVS
 AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
 AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuser0 -- set Interface 
dpdkvhostuser0 \
@@ -125,12 +121,8 @@ ADD_NAMESPACES(ns1, ns2)
 dnl Add veth device
 ADD_VETH(tap1, ns2, br10, "172.31.110.12/24")
 
-dnl Execute testpmd in background
-on_exit "pkill -f -x -9 'tail -f /dev/null'"
-tail -f /dev/null | dpdk-testpmd --socket-mem="$(cat NUMA_NODE)" --no-pci\
-   --vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostuser0" \
-   --vdev="net_tap0,iface=tap0" --file-prefix page0 \
-   --single-file-segments -- -a 
>$OVS_RUNDIR/testpmd-dpdkvhostuser0.log 2>&1 &
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostuser0"
 \
+--vdev="net_tap0,iface=tap0"])
 
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 OVS_WAIT_UNTIL([ip link show dev tap0 | grep -qw LOWER_UP])
@@ -151,8 +143,7 @@ AT_CHECK([ip netns exec ns2 ip link show], [], [stdout], 
[stderr])
 AT_CHECK([ip netns exec ns1 ping -c 4 -I tap0 172.31.110.12], [], [stdout],
  [stderr])
 
-dnl Clean up the testpmd now
-pkill -f -x -9 'tail -f /dev/null'
+OVS_DPDK_STOP_TESTPMD()
 
 dnl Wait for vhost-user handling the socket disconnect.
 OVS_WAIT_UNTIL([grep "vHost Device '$OVS_RUNDIR/dpdkvhostuser0' has been 
removed" ovs-vswitchd.log])
@@ -173,13 +164,9 @@ dnl Ping vhost-user-client port
 AT_SETUP([OVS-DPDK - ping vhost-user-client ports])
 AT_KEYWORDS([dpdk])
 OVS_DPDK_PRE_CHECK()
-AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
+OVS_DPDK_CHECK_TESTPMD()
 OVS_DPDK_START([--no-pci])
 
-dnl Find number of sockets
-AT_CHECK([lscpu], [], [stdout])
-AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' >

Re: [ovs-dev] [PATCH v2 1/2] mcast-snooping: Test per port explicit flooding.

2023-11-16 Thread David Marchand
On Thu, Nov 16, 2023 at 12:32 PM Ilya Maximets  wrote:
> > +AT_CHECK([
> > +ovs-vsctl set bridge br0 \
> > +datapath_type=dummy \
> > +mcast_snooping_enable=true \
> > +other-config:mcast-snooping-disable-flood-unregistered=false
>
> Nit:
> Not a full review, but in case you're sending a new version for Eelco's
> comments, please, add more indentation to the 3 lines above, so they
> are not on the same level with ovs-vsctl.

I did not see this comment.
Well, I'll wait for a full review before sending a new revision...


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 3/3] mcast-snooping: Fix comments format.

2023-11-16 Thread David Marchand
Capitalize comments and end them with a period when needed.

Signed-off-by: David Marchand 
---
 tests/mcast-snooping.at | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index faeb7890d9..890e6aca00 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -31,13 +31,13 @@ dummy@ovs-dummy: hit:0 missed:0
 
 ovs-appctl time/stop
 
-# Send IGMPv3 query on p2 with vlan 1725
+# Send IGMPv3 query on p2 with vlan 1725.
 # 5c:8a:38:55:25:52 > 01:00:5e:00:00:01, ethertype 802.1Q (0x8100), length 64: 
vlan 1725, p 0, ethertype IPv4,
 # 172.17.25.1 > 224.0.0.1: igmp query v3
 AT_CHECK([ovs-appctl netdev-dummy/receive p2 \
 
'01005e015c8a38552552810006bd080046c0002401027f00ac111901e00194041164ec1e027d'])
 
-# Send IGMPv3 query on p2 with vlan 1728
+# Send IGMPv3 query on p2 with vlan 1728.
 # 5c:8a:38:55:25:52 > 01:00:5e:00:00:01, ethertype 802.1Q (0x8100), length 64: 
vlan 1728, p 0, ethertype IPv4,
 # 172.17.28.1 > 224.0.0.1: igmp query v3
 AT_CHECK([ovs-appctl netdev-dummy/receive p2 \
@@ -51,13 +51,13 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
 
 AT_CHECK([ovs-vsctl set Interface p2 options:tx_pcap=p2.pcap])
 
-# Send a multicast packet on p1
+# Send a multicast packet on p1.
 AT_CHECK([
 ovs-appctl netdev-dummy/receive p1 \
 
'in_port(1),eth(src=aa:55:aa:55:00:01,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=239.94.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)'
 ])
 
-# Check this packet was forwarded exactly once to p2 and has vlan tag 1725
+# Check this packet was forwarded exactly once to p2 and has vlan tag 1725.
 # aa:55:aa:55:00:01 > 01:00:5e:5e:01:01, ethertype 802.1Q (0x8100), length 46: 
vlan 1725, p 0, ethertype IPv4,
 # 10.0.0.1.0 > 239.94.1.1.8000: UDP, length 0
 AT_CHECK([ovs-pcap p2.pcap > p2.pcap.txt 2>&1])
@@ -450,7 +450,7 @@ AT_CHECK([
 
 ovs-appctl time/stop
 
-# send report packets
+# Send report packets.
 AT_CHECK([
 ovs-appctl netdev-dummy/receive p1  \
 
'01005E010101000C29A027A181010800451C00014002CBAEAC10221EE001010112140CE9E0010101'
@@ -458,7 +458,7 @@ AT_CHECK([
 
'01005E010101000C29A027A281020800451C00014002CBAEAC10221EE001010112140CE9E0010101'
 ], [0])
 
-# send query packets
+# Send query packets.
 AT_CHECK([
 ovs-appctl netdev-dummy/receive p3  \

'01005E010101000C29A027D181010800451C00014002CBCBAC102201E0010104EEEB'
@@ -505,7 +505,7 @@ AT_CHECK([
 
 ovs-appctl time/stop
 
-# send report packets
+# Send report packets.
 AT_CHECK([
 ovs-appctl netdev-dummy/receive p1  \
 
'01005E010101000C29A027A181010800451C00014002CBAEAC10221EE001010112140CE9E0010101'
@@ -513,7 +513,7 @@ AT_CHECK([
 
'01005E010101000C29A027A281020800451C00014002CBAEAC10221EE001010112140CE9E0010101'
 ], [0])
 
-# send query packets
+# Send query packets.
 AT_CHECK([
 ovs-appctl netdev-dummy/receive p2  \

'01005E010101000C29A027D181010800451C00014002CBCBAC102201E0010104EEEB'
-- 
2.41.0


[ovs-dev] [PATCH v3 1/3] mcast-snooping: Test per port explicit flooding.

2023-11-16 Thread David Marchand
Various options affect how the mcast snooping module works.

When multicast snooping is enabled and a reporter is known, it is still
possible to flood associated packets to some other port via the
mcast-snooping-flood option.

If flooding unregistered traffic is disabled, it is still possible to
flood multicast traffic with the mcast-snooping-flood option.

IGMP reports may have to be flooded to some ports explicitly with the
mcast-snooping-flood-reports option.

Test those parameters.

Acked-by: Simon Horman 
Acked-by: Paolo Valerio 
Signed-off-by: David Marchand 
---
Changes since v2:
- fixed comment,

Changes since v1:
- fixed dest mac address,
- added tests for mcast-snooping-disable-flood-unregistered=true and
  mcast-snooping-flood-reports,

---
 tests/mcast-snooping.at | 280 
 1 file changed, 280 insertions(+)

diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index d5b7c4774c..9797bca531 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -105,6 +105,286 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
+
+AT_SETUP([mcast - check multicast per port flooding])
+OVS_VSWITCHD_START([])
+
+AT_CHECK([
+ovs-vsctl set bridge br0 \
+datapath_type=dummy \
+mcast_snooping_enable=true \
+other-config:mcast-snooping-disable-flood-unregistered=false
+], [0])
+
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+
+AT_CHECK([
+ovs-vsctl add-port br0 p1 \
+-- set Interface p1 type=dummy other-config:hwaddr=aa:55:aa:55:00:01 
ofport_request=1 \
+-- add-port br0 p2 \
+-- set Interface p2 type=dummy other-config:hwaddr=aa:55:aa:55:00:02 
ofport_request=2 \
+-- add-port br0 p3 \
+-- set Interface p3 type=dummy other-config:hwaddr=aa:55:aa:55:00:03 
ofport_request=3 \
+], [0])
+
+ovs-appctl time/stop
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [stdout])
+AT_CHECK([grep -v 'Datapath actions:' stdout], [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> unregistered multicast, flooding
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+])
+AT_CHECK([sed -ne 's/^Datapath actions: \(.*\)$/\1/p' stdout | tr "," "\n" | 
sort -n], [0], [dnl
+1
+2
+100
+])
+
+# Send report packets.
+AT_CHECK([
+ovs-appctl netdev-dummy/receive p1  \
+
'01005E010101000C29A027A10800451C00014002CBAEAC10221EE001010112140CE9E0010101'
+], [0])
+AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
+ port  VLAN  GROUPAge
+1 0  224.1.1.1   0
+])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1
+])
+
+AT_CHECK([ovs-vsctl set port p2 other_config:mcast-snooping-flood=true])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+ -> forwarding to mcast flood port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
+AT_CHECK([ovs-vsctl set port p3 other_config:mcast-snooping-flood=true])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_

[ovs-dev] [PATCH v3 2/3] mcast-snooping: Flush flood and report ports when deleting interfaces.

2023-11-16 Thread David Marchand
When a configuration change triggers an interface destruction/creation
(for example, setting ofport_request), a port object may still be
referenced as an fport or an rport in the mdb.

Before the fix, when flooding multicast traffic:
bridge("br0")
-
 0. priority 32768
NORMAL
 -> forwarding to mcast group port
 >> mcast flood port is unknown, dropping
 -> mcast flood port is input port, dropping
 -> forwarding to mcast flood port

Before the fix, when flooding IGMP report traffic:
bridge("br0")
-
 0. priority 32768
NORMAL
 >> mcast port is unknown, dropping the report
 -> forwarding report to mcast flagged port
 -> mcast port is input port, dropping the Report
 -> forwarding report to mcast flagged port

Add relevant cleanup and update unit tests.

Fixes: 4fbbf8624868 ("mcast-snooping: Flush ports mdb when VLAN configuration changed.")
Acked-by: Simon Horman 
Acked-by: Paolo Valerio 
Signed-off-by: David Marchand 
---
Changes since v2:
- reordered variable declarations (reverse christmas tree),
- added some comments in tests,

Changes since v1:
- updated the test on report flooding,

---
 lib/mcast-snooping.c| 17 -
 tests/mcast-snooping.at | 42 +
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c
index 029ca28558..43805ae4d5 100644
--- a/lib/mcast-snooping.c
+++ b/lib/mcast-snooping.c
@@ -946,8 +946,9 @@ mcast_snooping_wait(struct mcast_snooping *ms)
 void
 mcast_snooping_flush_bundle(struct mcast_snooping *ms, void *port)
 {
-struct mcast_group *g;
 struct mcast_mrouter_bundle *m;
+struct mcast_port_bundle *p;
+struct mcast_group *g;
 
 if (!mcast_snooping_enabled(ms)) {
 return;
@@ -971,5 +972,19 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, 
void *port)
 }
 }
 
+LIST_FOR_EACH_SAFE (p, node, &ms->fport_list) {
+if (p->port == port) {
+mcast_snooping_flush_port(p);
+ms->need_revalidate = true;
+}
+}
+
+LIST_FOR_EACH_SAFE (p, node, &ms->rport_list) {
+if (p->port == port) {
+mcast_snooping_flush_port(p);
+ms->need_revalidate = true;
+}
+}
+
 ovs_rwlock_unlock(&ms->rwlock);
 }
diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index 9797bca531..faeb7890d9 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -207,6 +207,26 @@ Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e
 Datapath actions: 1,2
 ])
 
+# Change p2 ofport to force an ofbundle change and check that the mdb contains
+# no stale port.
+AT_CHECK([ovs-vsctl set interface p2 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+ -> mcast flood port is input port, dropping
+ -> forwarding to mcast flood port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
@@ -381,6 +401,28 @@ This flow is handled by the userspace slow path because it:
   - Uses action(s) not supported by datapath.
 ])
 
+# Change p3 ofport to force an ofbundle change and check that the mdb contains
+# no stale port.
+AT_CHECK([ovs-vsctl set interface p3 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace "in_port(1)" 
'01005E010101000C29A027A10800451C00014002CBAEAC10221EE001010112140CE9E0010101'],
 [0], [dnl
+Flow: 
ip,in_port=1,vlan_tci=0x,dl_src=00:0c:29:a0:27:a1,dl_dst=01:00:5e:01:01:01,nw_src=172.16.34.30,nw_dst=224.1.1.1,nw_proto=2,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=18,tp_dst=20
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> forwarding report to mcast flagged port
+ -> mcast port is input port, dropping the Report
+ -> forwarding report to mcast flagged port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,ip,in_port=1,dl_src=00:0c:29:a0:27:a1,dl_dst=01:00:5e:01:01:01,nw_proto=2,nw_frag=no
+Datapath actions: 2,3
+This flow is handled by the userspace slow path because it:
+  - Uses action(s) not supported by datapath.
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
-- 
2.41.0


Re: [ovs-dev] [PATCH v2 2/2] mcast-snooping: Flush flood and report ports when deleting interfaces.

2023-11-16 Thread David Marchand
On Thu, Nov 16, 2023 at 10:38 AM Eelco Chaudron  wrote:
> > diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
> > index b5474cf392..1ce31168e8 100644
> > --- a/tests/mcast-snooping.at
> > +++ b/tests/mcast-snooping.at
> > @@ -207,6 +207,24 @@ Megaflow: 
> > recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e
> >  Datapath actions: 1,2
> >  ])
> >
> > +AT_CHECK([ovs-vsctl set interface p2 ofport_request=4])
>
> Can we add a comment here (and below) to indicate why we do this? Just to 
> understand what we test here.

Wdyt of:
+# Change p2 ofport to force an ofbundle change and check that the mdb contains
+# no stale port.


-- 
David Marchand


Re: [ovs-dev] [PATCH v2 1/2] mcast-snooping: Test per port explicit flooding.

2023-11-16 Thread David Marchand
On Thu, Nov 16, 2023 at 11:57 AM Eelco Chaudron  wrote:
> On 10 Nov 2023, at 18:52, David Marchand wrote:
> > +Final flow: unchanged
> > +Megaflow: 
> > recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
> > +])
> > +AT_CHECK([sed -ne 's/^Datapath actions: \(.*\)$/\1/p' stdout | tr "," "\n" 
> > | sort -n], [0], [dnl
> > +1
> > +2
> > +100
> > +])
> > +
> > +# send report packets
>
> Please add capital and dots to all comments.

I don't mind, but the rest of this file is not consistent with this convention.

$ git grep \\# origin/master -- tests/mcast-snooping.at
...
origin/master:tests/mcast-snooping.at:# send report packets
origin/master:tests/mcast-snooping.at:# send query packets
origin/master:tests/mcast-snooping.at:# send report packets
origin/master:tests/mcast-snooping.at:# send query packets


-- 
David Marchand


Re: [ovs-dev] [PATCH v2 1/2] mcast-snooping: Test per port explicit flooding.

2023-11-16 Thread David Marchand
Hello Eelco,

On Thu, Nov 16, 2023 at 11:57 AM Eelco Chaudron  wrote:

[snip]

> > +bridge("br0")
> > +-
> > + 0. priority 32768
> > +NORMAL
> > + -> forwarding to mcast group port
> > + -> forwarding to mcast flood port
> > +
> > +Final flow: unchanged
> > +Megaflow: 
> > recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
> > +Datapath actions: 1,2
>
>
> Are we sure the order here is always 1,2 vs the first test you sorted them? 
> Same for all the other multi-port tests below?
>
> I did run the test 200+ times, and it seems ok. Trying to understand this, as 
> I can see the first one reporting 100,1,2 and 100,2,1.

    struct mcast_output out = MCAST_OUTPUT_INIT;
...
    if (grp) {
        xlate_normal_mcast_send_group(ctx, ms, grp, in_xbundle, &out);
        xlate_normal_mcast_send_fports(ctx, ms, in_xbundle, &out);
        xlate_normal_mcast_send_mrouters(ctx, ms, in_xbundle, &xvlan,
                                         &out);
...
    mcast_output_finish(ctx, &out, in_xbundle, &xvlan);

With:
static void
mcast_output_finish(struct xlate_ctx *ctx, struct mcast_output *out,
                    struct xbundle *in_xbundle, struct xvlan *xvlan)
{
    if (out->flood) {
        xlate_normal_flood(ctx, in_xbundle, xvlan);
    } else {
        for (size_t i = 0; i < out->n; i++) {
            output_normal(ctx, out->xbundles[i], xvlan);
        }
    }
...


In this case, contrary to the previous tests, there is no flooding over
all the ports of this bridge.
There is only one "group" port and one "flood" port, and they are output in a
fixed order: the group port first, then the flood port.
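
For contrast, the first test floods over all ports and cannot rely on such an order, which is why it sorts the port list before comparing. A minimal shell sketch of that normalization, reusing the sed/tr/sort pipeline from the test (the function name normalize_actions and the sample input line are only illustrative):

```shell
# Hypothetical wrapper around the pipeline used in the first test.
# Reads ofproto/trace output on stdin and prints the datapath output
# ports one per line, in ascending numeric order.
normalize_actions() {
    sed -ne 's/^Datapath actions: \(.*\)$/\1/p' | tr "," "\n" | sort -n
}

# Both orderings reported in the review normalize to the same list.
printf 'Datapath actions: 100,1,2\n' | normalize_actions
printf 'Datapath actions: 100,2,1\n' | normalize_actions
```

Each call prints 1, 2 and 100 on separate lines, so the expected output in the test does not depend on the order in which the flood visits the ports.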


-- 
David Marchand


[ovs-dev] [PATCH v2 2/2] mcast-snooping: Flush flood and report ports when deleting interfaces.

2023-11-10 Thread David Marchand
When a configuration change triggers an interface destruction/creation
(for example, setting ofport_request), a port object may still be
referenced as an fport or an rport in the mdb.

Before the fix, when flooding multicast traffic:
bridge("br0")
-
 0. priority 32768
NORMAL
 -> forwarding to mcast group port
 >> mcast flood port is unknown, dropping
 -> mcast flood port is input port, dropping
 -> forwarding to mcast flood port

Before the fix, when flooding IGMP report traffic:
bridge("br0")
-
 0. priority 32768
NORMAL
 >> mcast port is unknown, dropping the report
 -> forwarding report to mcast flagged port
 -> mcast port is input port, dropping the Report
 -> forwarding report to mcast flagged port

Add relevant cleanup and update unit tests.

Fixes: 4fbbf8624868 ("mcast-snooping: Flush ports mdb when VLAN configuration changed.")
Signed-off-by: David Marchand 
---
Changes since v1:
- updated the test on report flooding,

---
 lib/mcast-snooping.c| 15 +++
 tests/mcast-snooping.at | 38 ++
 2 files changed, 53 insertions(+)

diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c
index 029ca28558..34755447f8 100644
--- a/lib/mcast-snooping.c
+++ b/lib/mcast-snooping.c
@@ -948,6 +948,7 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, void 
*port)
 {
 struct mcast_group *g;
 struct mcast_mrouter_bundle *m;
+struct mcast_port_bundle *p;
 
 if (!mcast_snooping_enabled(ms)) {
 return;
@@ -971,5 +972,19 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, 
void *port)
 }
 }
 
+LIST_FOR_EACH_SAFE (p, node, &ms->fport_list) {
+if (p->port == port) {
+mcast_snooping_flush_port(p);
+ms->need_revalidate = true;
+}
+}
+
+LIST_FOR_EACH_SAFE (p, node, &ms->rport_list) {
+if (p->port == port) {
+mcast_snooping_flush_port(p);
+ms->need_revalidate = true;
+}
+}
+
 ovs_rwlock_unlock(&ms->rwlock);
 }
diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index b5474cf392..1ce31168e8 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -207,6 +207,24 @@ Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e
 Datapath actions: 1,2
 ])
 
+AT_CHECK([ovs-vsctl set interface p2 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+ -> mcast flood port is input port, dropping
+ -> forwarding to mcast flood port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
@@ -381,6 +399,26 @@ This flow is handled by the userspace slow path because it:
   - Uses action(s) not supported by datapath.
 ])
 
+AT_CHECK([ovs-vsctl set interface p3 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace "in_port(1)" 
'01005E010101000C29A027A10800451C00014002CBAEAC10221EE001010112140CE9E0010101'],
 [0], [dnl
+Flow: 
ip,in_port=1,vlan_tci=0x,dl_src=00:0c:29:a0:27:a1,dl_dst=01:00:5e:01:01:01,nw_src=172.16.34.30,nw_dst=224.1.1.1,nw_proto=2,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=18,tp_dst=20
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> forwarding report to mcast flagged port
+ -> mcast port is input port, dropping the Report
+ -> forwarding report to mcast flagged port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,ip,in_port=1,dl_src=00:0c:29:a0:27:a1,dl_dst=01:00:5e:01:01:01,nw_proto=2,nw_frag=no
+Datapath actions: 2,3
+This flow is handled by the userspace slow path because it:
+  - Uses action(s) not supported by datapath.
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
-- 
2.41.0


[ovs-dev] [PATCH v2 1/2] mcast-snooping: Test per port explicit flooding.

2023-11-10 Thread David Marchand
Various options affect how the mcast snooping module works.

When multicast snooping is enabled and a reporter is known, it is still
possible to flood associated packets to some other port via the
mcast-snooping-flood option.

If flooding unregistered traffic is disabled, it is still possible to
flood multicast traffic with the mcast-snooping-flood option.

IGMP reports may have to be flooded to some ports explicitly with the
mcast-snooping-flood-reports option.

Test those parameters.

Signed-off-by: David Marchand 
---
Changes since v1:
- fixed dest mac address,
- added tests for mcast-snooping-disable-flood-unregistered=true and
  mcast-snooping-flood-reports,

---
 tests/mcast-snooping.at | 280 
 1 file changed, 280 insertions(+)

diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index d5b7c4774c..b5474cf392 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -105,6 +105,286 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
+
+AT_SETUP([mcast - check multicast per port flooding])
+OVS_VSWITCHD_START([])
+
+AT_CHECK([
+ovs-vsctl set bridge br0 \
+datapath_type=dummy \
+mcast_snooping_enable=true \
+other-config:mcast-snooping-disable-flood-unregistered=false
+], [0])
+
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+
+AT_CHECK([
+ovs-vsctl add-port br0 p1 \
+-- set Interface p1 type=dummy other-config:hwaddr=aa:55:aa:55:00:01 
ofport_request=1 \
+-- add-port br0 p2 \
+-- set Interface p2 type=dummy other-config:hwaddr=aa:55:aa:55:00:02 
ofport_request=2 \
+-- add-port br0 p3 \
+-- set Interface p3 type=dummy other-config:hwaddr=aa:55:aa:55:00:03 
ofport_request=3 \
+], [0])
+
+ovs-appctl time/stop
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [stdout])
+AT_CHECK([grep -v 'Datapath actions:' stdout], [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> unregistered multicast, flooding
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+])
+AT_CHECK([sed -ne 's/^Datapath actions: \(.*\)$/\1/p' stdout | tr "," "\n" | 
sort -n], [0], [dnl
+1
+2
+100
+])
+
+# send report packets
+AT_CHECK([
+ovs-appctl netdev-dummy/receive p1  \
+
'01005E010101000C29A027A10800451C00014002CBAEAC10221EE001010112140CE9E0010101'
+], [0])
+AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
+ port  VLAN  GROUPAge
+1 0  224.1.1.1   0
+])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1
+])
+
+AT_CHECK([ovs-vsctl set port p2 other_config:mcast-snooping-flood=true])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+ -> forwarding to mcast flood port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
+AT_CHECK([ovs-vsctl set port p3 other_config:mcast-snooping-flood=true])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:01:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:01:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-

Re: [ovs-dev] [PATCH 1/2] mcast-snooping: Test per port explicit flooding.

2023-11-10 Thread David Marchand
On Thu, Nov 9, 2023 at 4:33 PM Paolo Valerio  wrote:
>
> David Marchand  writes:
>
> > When multicast snooping is enabled and a reporter is known, it is still
> > possible to flood associated packets to some other port via the
> > mcast-snooping-flood option.
> >
> > Test this combination.
> >
> > Signed-off-by: David Marchand 
> > ---
> >  tests/mcast-snooping.at | 88 +
> >  1 file changed, 88 insertions(+)
> >
> > diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
> > index d5b7c4774c..21c806ef63 100644
> > --- a/tests/mcast-snooping.at
> > +++ b/tests/mcast-snooping.at
> > @@ -105,6 +105,94 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
> >  OVS_VSWITCHD_STOP
> >  AT_CLEANUP
> >
> > +
> > +AT_SETUP([mcast - check flooding on ports])
> > +OVS_VSWITCHD_START([])
> > +
> > +AT_CHECK([
> > +ovs-vsctl set bridge br0 \
> > +datapath_type=dummy \
> > +mcast_snooping_enable=true \
> > +other-config:mcast-snooping-disable-flood-unregistered=false
> > +], [0])
> > +
>
> in the case flood unregistered is disabled packets are supposed to
> be sent to flood ports. While at it, it might also be worth testing that
> like in the quick example at the end I used to test it.
> WDYT?

It sounds reasonable yes.

I was also considering testing reports flooding.
WDYT?


>
> > +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> > +
> > +AT_CHECK([
> > +ovs-vsctl add-port br0 p1 \
> > +-- set Interface p1 type=dummy other-config:hwaddr=aa:55:aa:55:00:01 
> > ofport_request=1 \
> > +-- add-port br0 p2 \
> > +-- set Interface p2 type=dummy other-config:hwaddr=aa:55:aa:55:00:02 
> > ofport_request=2 \
> > +-- add-port br0 p3 \
> > +-- set Interface p3 type=dummy other-config:hwaddr=aa:55:aa:55:00:03 
> > ofport_request=3 \
> > +], [0])
> > +
> > +ovs-appctl time/stop
> > +
> > +# send report packets
> > +AT_CHECK([
> > +ovs-appctl netdev-dummy/receive p1  \
> > +
> > '01005E010101000C29A027A10800451C00014002CBAEAC10221EE001010112140CE9E0010101'
> > +], [0])
> > +
> > +AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
> > + port  VLAN  GROUPAge
> > +1 0  224.1.1.1   0
> > +])
> > +
> > +AT_CHECK([ovs-appctl ofproto/trace 
> > "in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
> >  [0], [dnl
> > +Flow: 
> > udp,in_port=3,vlan_tci=0x0000,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
> > +
>
> I think the mac for 224.1.1.1 maps to 01:00:5e:01:01:01.

Argh.. indeed, wrong copy/paste.
Thanks for the review!
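
For reference, the mapping mentioned above follows RFC 1112: an IPv4
multicast group maps to the fixed prefix 01:00:5e followed by the low 23
bits of the group address. A minimal sketch, assuming the address is given
in host byte order; ipv4_mcast_to_mac is a hypothetical helper name used
for illustration, not an OVS function:

```c
#include <stdint.h>

/* Hypothetical helper (not part of OVS): derive the Ethernet multicast
 * address for an IPv4 multicast group given in host byte order.
 * RFC 1112: prefix 01:00:5e, then the low 23 bits of the group. */
void
ipv4_mcast_to_mac(uint32_t group, uint8_t mac[6])
{
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (group >> 16) & 0x7f;  /* Top bit of this byte is dropped. */
    mac[4] = (group >> 8) & 0xff;
    mac[5] = group & 0xff;
}
```

For 224.1.1.1 (0xE0010101) this gives 01:00:5e:01:01:01, so the
01:00:5e:5e:01:01 destination used in the traces does not correspond to
that group.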

>
> > +bridge("br0")
> > +-------------
> > + 0. priority 32768
> > +NORMAL
> > + -> forwarding to mcast group port


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/2] mcast-snooping: Test per port explicit flooding.

2023-11-09 Thread David Marchand
When multicast snooping is enabled and a reporter is known, it is still
possible to flood associated packets to some other port via the
mcast-snooping-flood option.

Test this combination.

Signed-off-by: David Marchand 
---
 tests/mcast-snooping.at | 88 +
 1 file changed, 88 insertions(+)

diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index d5b7c4774c..21c806ef63 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -105,6 +105,94 @@ AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
+
+AT_SETUP([mcast - check flooding on ports])
+OVS_VSWITCHD_START([])
+
+AT_CHECK([
+ovs-vsctl set bridge br0 \
+datapath_type=dummy \
+mcast_snooping_enable=true \
+other-config:mcast-snooping-disable-flood-unregistered=false
+], [0])
+
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+
+AT_CHECK([
+ovs-vsctl add-port br0 p1 \
+-- set Interface p1 type=dummy other-config:hwaddr=aa:55:aa:55:00:01 
ofport_request=1 \
+-- add-port br0 p2 \
+-- set Interface p2 type=dummy other-config:hwaddr=aa:55:aa:55:00:02 
ofport_request=2 \
+-- add-port br0 p3 \
+-- set Interface p3 type=dummy other-config:hwaddr=aa:55:aa:55:00:03 
ofport_request=3 \
+], [0])
+
+ovs-appctl time/stop
+
+# send report packets
+AT_CHECK([
+ovs-appctl netdev-dummy/receive p1  \
+
'01005E010101000C29A027A10800451C00014002CBAEAC10221EE001010112140CE9E0010101'
+], [0])
+
+AT_CHECK([ovs-appctl mdb/show br0], [0], [dnl
+ port  VLAN  GROUPAge
+1 0  224.1.1.1   0
+])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x0000,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-------------
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1
+])
+
+AT_CHECK([ovs-vsctl set port p2 other_config:mcast-snooping-flood=true])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x0000,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-------------
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+ -> forwarding to mcast flood port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
+AT_CHECK([ovs-vsctl set port p3 other_config:mcast-snooping-flood=true])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x0000,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-------------
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+ -> forwarding to mcast flood port
+ -> mcast flood port is input port, dropping
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+
 AT_SETUP([mcast - delete the port mdb when vlan configuration changed])
 OVS_VSWITCHD_START([])
 
-- 
2.41.0
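
The expectations in the three traces above can be summarised with a small
model. This is only a sketch of the behaviour the test checks, not the OVS
translation code; mcast_output_ports and its parameters are hypothetical
names. It assumes the packet is sent first to ports with a known reporter
for the group, then to ports with mcast-snooping-flood=true, and that the
input port is always skipped.

```c
#include <stddef.h>

/* Hypothetical model (not OVS code): fill 'out' with the ports a
 * multicast packet received on 'in_port' would be sent to.
 * Returns the number of ports written, at most 'out_cap'. */
size_t
mcast_output_ports(int in_port,
                   const int *group_ports, size_t n_group,
                   const int *flood_ports, size_t n_flood,
                   int *out, size_t out_cap)
{
    size_t n = 0;

    /* Ports on which a report for the group was seen. */
    for (size_t i = 0; i < n_group && n < out_cap; i++) {
        if (group_ports[i] != in_port) {
            out[n++] = group_ports[i];
        }
    }

    /* Ports explicitly configured for flooding. */
    for (size_t i = 0; i < n_flood && n < out_cap; i++) {
        if (flood_ports[i] != in_port) {
            out[n++] = flood_ports[i];
        }
    }

    return n;
}
```

With in_port 3, group ports {1} and flood ports {2, 3}, the model returns
{1, 2}, which matches "Datapath actions: 1,2" in the last trace.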



[ovs-dev] [PATCH 2/2] mcast-snooping: Flush flood and report ports when deleting interfaces.

2023-11-09 Thread David Marchand
When a configuration change triggers an interface destruction/creation
(like for example, setting ofport_request), a port object may still be
referenced as a fport or a rport in the mdb:

bridge("br0")
-------------
 0. priority 32768
NORMAL
 -> forwarding to mcast group port
 >> mcast flood port is unknown, dropping
 -> mcast flood port is input port, dropping
 -> forwarding to mcast flood port

Add relevant cleanup and update a unit test.

Fixes: 4fbbf8624868 ("mcast-snooping: Flush ports mdb when VLAN configuration changed.")
Signed-off-by: David Marchand 
---
 lib/mcast-snooping.c| 15 +++
 tests/mcast-snooping.at | 18 ++
 2 files changed, 33 insertions(+)

diff --git a/lib/mcast-snooping.c b/lib/mcast-snooping.c
index 029ca28558..34755447f8 100644
--- a/lib/mcast-snooping.c
+++ b/lib/mcast-snooping.c
@@ -948,6 +948,7 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, void 
*port)
 {
 struct mcast_group *g;
 struct mcast_mrouter_bundle *m;
+struct mcast_port_bundle *p;
 
 if (!mcast_snooping_enabled(ms)) {
 return;
@@ -971,5 +972,19 @@ mcast_snooping_flush_bundle(struct mcast_snooping *ms, 
void *port)
 }
 }
 
+LIST_FOR_EACH_SAFE (p, node, &ms->fport_list) {
+if (p->port == port) {
+mcast_snooping_flush_port(p);
+ms->need_revalidate = true;
+}
+}
+
+LIST_FOR_EACH_SAFE (p, node, &ms->rport_list) {
+if (p->port == port) {
+mcast_snooping_flush_port(p);
+ms->need_revalidate = true;
+}
+}
+
 ovs_rwlock_unlock(&ms->rwlock);
 }
diff --git a/tests/mcast-snooping.at b/tests/mcast-snooping.at
index 21c806ef63..66bb8d0c65 100644
--- a/tests/mcast-snooping.at
+++ b/tests/mcast-snooping.at
@@ -189,6 +189,24 @@ Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e
 Datapath actions: 1,2
 ])
 
+AT_CHECK([ovs-vsctl set interface p2 ofport_request=4])
+
+AT_CHECK([ovs-appctl ofproto/trace 
"in_port(3),eth(src=aa:55:aa:55:00:ff,dst=01:00:5e:5e:01:01),eth_type(0x0800),ipv4(src=10.0.0.1,dst=224.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=0,dst=8000)"],
 [0], [dnl
+Flow: 
udp,in_port=3,vlan_tci=0x0000,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_src=10.0.0.1,nw_dst=224.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=0,tp_dst=8000
+
+bridge("br0")
+-------------
+ 0. priority 32768
+NORMAL
+ -> forwarding to mcast group port
+ -> mcast flood port is input port, dropping
+ -> forwarding to mcast flood port
+
+Final flow: unchanged
+Megaflow: 
recirc_id=0,eth,udp,in_port=3,dl_src=aa:55:aa:55:00:ff,dl_dst=01:00:5e:5e:01:01,nw_dst=224.1.1.1,nw_frag=no
+Datapath actions: 1,2
+])
+
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
-- 
2.41.0
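
The loops added in mcast_snooping_flush_bundle() remove list entries while
walking the list, which is why the _SAFE iterator is used. The same idea
can be sketched outside of OVS with a plain singly linked list. The names
struct port_node, push_port and flush_port are hypothetical and only serve
the illustration; the boolean return stands in for setting
need_revalidate.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical list entry referencing a port object. */
struct port_node {
    void *port;
    struct port_node *next;
};

/* Prepend an entry for 'port'.  Returns false on allocation failure. */
bool
push_port(struct port_node **head, void *port)
{
    struct port_node *node = malloc(sizeof *node);

    if (!node) {
        return false;
    }
    node->port = port;
    node->next = *head;
    *head = node;
    return true;
}

/* Remove every entry referencing 'port'.  The next pointer is saved by
 * unlinking before freeing, so the walk stays valid after a removal.
 * Returns true if at least one entry was removed. */
bool
flush_port(struct port_node **head, const void *port)
{
    struct port_node **pp = head;
    bool removed = false;

    while (*pp) {
        struct port_node *cur = *pp;

        if (cur->port == port) {
            *pp = cur->next;    /* Unlink first... */
            free(cur);          /* ...then release the entry. */
            removed = true;
        } else {
            pp = &cur->next;
        }
    }
    return removed;
}
```

Calling flush_port() for a port that is no longer referenced is harmless
and returns false, so a caller could use the return value to decide
whether a revalidation is needed.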



Re: [ovs-dev] [PATCH v8 2/9] system-dpdk: Don't require hugetlbfs.

2023-10-27 Thread David Marchand
On Fri, Oct 27, 2023 at 5:31 PM David Marchand
 wrote:
>
> dpdk-testpmd does not need hugetlbfs backing as we don't require
> multiprocess support in OVS unit tests.
>
> Switch to --in-memory and remove the (then unneeded) check on
> hugetlbfs presence.
>
> Acked-by: Aaron Conole 
> Acked-by: Eelco Chaudron 
> Signed-off-by: David Marchand 
> ---
> Changes since v7:
> - reverted use of dynamic allocations and kept initial memory
>   reservation: this avoids random failures when testpmd and ovs-dpdk
>   try to allocate memory at the same time,

Just a heads up on this change.

While testing the v8 series, I hit random failures on the mtu unit tests.
testpmd would fail to allocate memory in hugepages.

I suspect this has to do with OVS allocating memory in parallel with testpmd.
Reverting to an initial memory reservation makes the issue disappear, so I
removed this change and kept the existing behavior.
There may still be a race underneath, but at least the situation is no
worse than before.


-- 
David Marchand



[ovs-dev] [PATCH v8 8/9] system-dpdk: Rework cleanup for vhost-user client tests.

2023-10-27 Thread David Marchand
These tests are subject to a race: when the testpmd hosting the vhost-user
server is stopped, OVS may have enough time to detect the vhost-user socket
drop and try to reconnect to this socket.

In such a situation, the tests can fail as the OVS process with the
vhost-user client port complains with a warning log:

2023-09-08T13:15:18.160Z|00163|dpdk|INFO|VHOST_CONFIG:
(.../005/dpdkvhostclient0) vhost peer closed
2023-09-08T13:15:18.160Z|00164|netdev_dpdk|INFO|vHost Device
'.../005/dpdkvhostclient0' connection has been destroyed
2023-09-08T13:15:18.160Z|00165|dpdk|INFO|VHOST_CONFIG:
(.../005/dpdkvhostclient0) vhost-user client: socket created, fd: 24
2023-09-08T13:15:18.160Z|00166|dpdk|WARN|VHOST_CONFIG:
(.../005/dpdkvhostclient0) failed to connect: Connection refused
2023-09-08T13:15:18.160Z|00167|dpdk|INFO|VHOST_CONFIG:
(.../005/dpdkvhostclient0) reconnecting...

Invert the order of the cleanup steps: delete the vhost-user client port
from OVS first, then stop testpmd.

Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
Changes since v6:
- added this fix for spurious failures hit by Eelco,

---
 tests/system-dpdk.at | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index fd4a4b7d73..80277b24b6 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -222,10 +222,9 @@ AT_CHECK([test `ovs-vsctl get interface 
dpdkvhostuserclient0 statistics:tx_bytes
$((`ovs-vsctl get interface dpdkvhostuserclient0 
statistics:tx_q0_good_bytes` + dnl
   `ovs-vsctl get interface dpdkvhostuserclient0 
statistics:tx_q1_good_bytes`))])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) recvmsg failed/d
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d
@@ -649,10 +648,9 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 
mtu_request=9000])
 AT_CHECK([ovs-appctl dpctl/show], [], [stdout])
 AT_CHECK([grep -E 'mtu=9000' stdout], [], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d"])
 AT_CLEANUP
@@ -694,10 +692,9 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 
mtu_request=2000])
 AT_CHECK([ovs-appctl dpctl/show], [], [stdout])
 AT_CHECK([grep -E 'mtu=2000' stdout], [], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d"])
 AT_CLEANUP
@@ -813,10 +810,9 @@ dnl Set MTU value above upper bound and check for error
 AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=9711])
 AT_CHECK([grep "dpdkvhostuserclient0: unsupported MTU 9711" ovs-vswitchd.log], 
[], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d
 /dpdkvhostuserclient0: unsupported MTU 9711/d
@@ -859,10 +855,9 @@ dnl Set MTU value below lower bound and check for error
 AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 mtu_request=67])
 AT_CHECK([grep "dpdkvhostuserclient0: unsupported MTU 67" ovs-vswitchd.log], 
[], [stdout])
 
-OVS_DPDK_STOP_TESTPMD()
-
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+OVS_DPDK_STOP_TESTPMD()
 OVS_DPDK_STOP_VSWITCHD(["dnl
 /VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d
 /dpdkvhostuserclient0: unsupported MTU 67/d
-- 
2.41.0



[ovs-dev] [PATCH v8 9/9] system-dpdk: Run traffic tests.

2023-10-27 Thread David Marchand
Integrate system-traffic.at tests as part of check-dpdk.

Some tests that can't work with the userspace datapath are skipped by
overriding some OVS_CHECK_* macros.

ADD_VETH is implemented using the net/af_xdp DPDK driver.

Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
Changes since v6:
- fixed some checkpatch warning,

Changes since v4:
- switched to net/af_xdp, this removes the tweaking needed for net/tap,
  and it lets existing tools keep relying on kernel netdevs. veth offloading
  still needs some tweaking,

Changes since v3:
- reverted --dummy-numa and opted for configuring a number of rxqs
  relevant to the number of NUMA sockets,

Changes since v2:
- added ADD_VETH_IGNORE_LOGS and moved ignored error logs to
  OVS_TRAFFIC_VSWITCHD_STOP,
- added --no-pci to DPDK options to avoid failing the tests when
  running in a vm with a virtio-net device,
- faked a mono numa/mono core so that OVS requests at max 2 txqs on
  the net/tap port,

---
 .ci/dpdk-build.sh|  3 +-
 .github/workflows/build-and-test.yml |  2 +-
 tests/system-dpdk-macros.at  | 77 
 tests/system-dpdk-testsuite.at   |  2 +
 tests/system-dpdk.at |  3 --
 5 files changed, 82 insertions(+), 5 deletions(-)

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index 35540f0694..aa83e44643 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -38,7 +38,8 @@ function build_dpdk()
 # any DPDK driver.
 # check-dpdk unit tests requires testpmd and some net/ driver.
 DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd"
-DPDK_OPTS="$DPDK_OPTS -Denable_drivers=net/null,net/tap,net/virtio"
+enable_drivers="net/null,net/af_xdp,net/tap,net/virtio"
+DPDK_OPTS="$DPDK_OPTS -Denable_drivers=$enable_drivers"
 
 # Install DPDK using prefix.
 DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build"
diff --git a/.github/workflows/build-and-test.yml 
b/.github/workflows/build-and-test.yml
index 4f62efb7c3..09654205e7 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -5,7 +5,7 @@ on: [push, pull_request]
 jobs:
   build-dpdk:
 env:
-  dependencies: gcc libnuma-dev ninja-build
+  dependencies: gcc libbpf-dev libnuma-dev ninja-build pkgconf
   CC: gcc
   DPDK_GIT: https://dpdk.org/git/dpdk-stable
   DPDK_VER: 22.11.1
diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 7fedfd6515..dcdfa55741 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -127,3 +127,80 @@ m4_define([OVS_DPDK_STOP_TESTPMD],
   [AT_CHECK([kill `cat testpmd.pid`])
OVS_WAIT([kill -0 `cat testpmd.pid`], [kill -9 `cat testpmd.pid`])
 ])
+
+
+# OVS_TRAFFIC_VSWITCHD_START([vsctl-args], [vsctl-output], [dbinit-aux-args])
+#
+# Creates a database and starts ovsdb-server, starts ovs-vswitchd
+# connected to that database, calls ovs-vsctl to create a bridge named
+# br0 with predictable settings, passing 'vsctl-args' as additional
+# commands to ovs-vsctl.  If 'vsctl-args' causes ovs-vsctl to provide
+# output (e.g. because it includes "create" commands) then 'vsctl-output'
+# specifies the expected output after filtering through uuidfilt.
+# 'dbinit-aux-args' are passed as additional commands to 'ovs-vsctl init'
+# before starting ovs-vswitchd.
+m4_define([OVS_TRAFFIC_VSWITCHD_START],
+  [
+   OVS_DPDK_PRE_CHECK()
+   OVS_WAIT_WHILE([ip link show ovs-netdev])
+   dnl For functional tests, no need for DPDK PCI probing.
+   OVS_DPDK_START([--no-pci], [--disable-system], [$3])
+   dnl Add bridges, ports, etc.
+   OVS_WAIT_WHILE([ip link show br0])
+   AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| 
uuidfilt])], [0], [$2])
+])
+
+
+# OVS_TRAFFIC_VSWITCHD_STOP([ALLOWLIST], [extra_cmds])
+#
+# Gracefully stops ovs-vswitchd and ovsdb-server, checking their log files
+# for messages with severity WARN or higher and signaling an error if any
+# is present.  The optional ALLOWLIST may contain shell-quoted "sed"
+# commands to delete any warnings that are actually expected, e.g.:
+#
+#   OVS_TRAFFIC_VSWITCHD_STOP(["/expected error/d"])
+#
+# 'extra_cmds' are shell commands to be executed after OVS_VSWITCHD_STOP() is
+# invoked. They can be used to perform additional cleanups such as name space
+# removal.
+m4_define([OVS_TRAFFIC_VSWITCHD_STOP],
+  [OVS_DPDK_STOP_VSWITCHD([$1])
+   AT_CHECK([:; $2])
+])
+
+
+# Plug a veth into OVS via DPDK net/af_xdp.
+m4_define([ADD_VETH],
+[ AT_CHECK([ip link add $1 type veth peer name ovs-$1 || return 77])
+  CONFIGURE_VETH_OFFLOADS([$1])
+  AT_CHECK([ip link set $1 netns $2])
+  AT_CHECK([ip link set dev ovs-$1 up])
+  AT_CHECK([ovs-vsctl add-port $3 ovs-$1 -- \
+set interface ovs-$1 external-ids:iface-id="$1" -- \
+set interface ovs-$1 type=dpdk -- \
+set interface ovs-$1 
options:dpd

[ovs-dev] [PATCH v8 6/9] netdev-afxdp: Postpone libbpf logging helper registration.

2023-10-27 Thread David Marchand
When using the net/af_xdp DPDK driver alongside OVS native AF_XDP support,
confusing logs are reported, like:

netdev_dpdk|INFO|Device 'net_af_xdpp0,iface=ovs-p0' attached to DPDK
dpif_netdev|INFO|PMD thread on numa_id: 0, core id: 11 created.
dpif_netdev|INFO|There are 1 pmd threads on numa node 0
dpdk|INFO|Device with port_id=0 already stopped
dpdk(pmd-c11/id:22)|INFO|PMD thread uses DPDK lcore 1.
netdev_dpdk|WARN|Rx checksum offload is not supported on port 0
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(6)
.xdp_run_config
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7)
xdp_metadata
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7)
xdp_metadata
netdev_afxdp|INFO|libbpf: elf: skipping unrecognized data section(7)
xdp_metadata

This comes from the fact that netdev-afxdp unconditionally registers a
helper for logging libbpf messages.
Making both net/af_xdp and netdev-afxdp work at the same time seems
difficult, so at least, ensure that netdev-afxdp won't register this
helper unless a netdev is actually allocated.

Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
 lib/netdev-afxdp.c | 12 ++--
 lib/netdev-afxdp.h |  1 -
 lib/netdev-linux.c |  1 -
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
index 16f26bc306..9884ccec4f 100644
--- a/lib/netdev-afxdp.c
+++ b/lib/netdev-afxdp.c
@@ -1195,18 +1195,18 @@ libbpf_print(enum libbpf_print_level level,
 return 0;
 }
 
-int netdev_afxdp_init(void)
-{
-libbpf_set_print(libbpf_print);
-return 0;
-}
-
 int
 netdev_afxdp_construct(struct netdev *netdev)
 {
+static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
 struct netdev_linux *dev = netdev_linux_cast(netdev);
 int ret;
 
+if (ovsthread_once_start(&once)) {
+libbpf_set_print(libbpf_print);
+ovsthread_once_done(&once);
+}
+
 /* Configure common netdev-linux first. */
 ret = netdev_linux_construct(netdev);
 if (ret) {
diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h
index e91cd102d2..6c5459f6e6 100644
--- a/lib/netdev-afxdp.h
+++ b/lib/netdev-afxdp.h
@@ -47,7 +47,6 @@ struct xsk_socket_info;
 
 int netdev_afxdp_rxq_construct(struct netdev_rxq *rxq_);
 void netdev_afxdp_rxq_destruct(struct netdev_rxq *rxq_);
-int netdev_afxdp_init(void);
 int netdev_afxdp_construct(struct netdev *netdev_);
 void netdev_afxdp_destruct(struct netdev *netdev_);
 int netdev_afxdp_verify_mtu_size(const struct netdev *netdev, int mtu);
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index cca3408797..4538cdfe63 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -3754,7 +3754,6 @@ const struct netdev_class netdev_internal_class = {
 
 #ifdef HAVE_AF_XDP
 #define NETDEV_AFXDP_CLASS_COMMON   \
-.init = netdev_afxdp_init,  \
 .construct = netdev_afxdp_construct,\
 .destruct = netdev_afxdp_destruct,  \
 .get_stats = netdev_afxdp_get_stats,\
-- 
2.41.0
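
The change boils down to lazy, one-time registration on first use instead
of registration at class init. A minimal sketch of that pattern, using a
C11 atomic_flag as a simplified stand-in for ovsthread_once. All names
below (register_log_hook, device_construct, registrations) are
hypothetical. Unlike ovsthread_once, this simplified guard does not make
concurrent callers wait until the first registration has completed.

```c
#include <stdatomic.h>

/* Set once the logging hook has been installed. */
static atomic_flag log_hook_registered = ATOMIC_FLAG_INIT;

/* Counts how many times the hook was actually installed. */
static int n_registrations;

/* Hypothetical stand-in for libbpf_set_print(libbpf_print). */
static void
register_log_hook(void)
{
    n_registrations++;
}

/* Hypothetical constructor: installs the hook only when the first
 * device is created, so nothing is registered if no device exists. */
int
device_construct(void)
{
    if (!atomic_flag_test_and_set(&log_hook_registered)) {
        register_log_hook();
    }
    return 0;
}

/* Accessor used to observe the number of registrations. */
int
registrations(void)
{
    return n_registrations;
}
```

As long as device_construct() is never called, the hook stays
unregistered, which is the property the patch relies on to keep libbpf
messages from the DPDK driver out of the netdev_afxdp log module.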



[ovs-dev] [PATCH v8 7/9] system-dpdk: Remove tap interfaces from vport MTU tests.

2023-10-27 Thread David Marchand
The unit tests for changing MTU with vhost-user ports are not using
those tap interfaces.

Signed-off-by: David Marchand 
---
Changes since v7:
- added this patch after getting regressions in Intel CI because of the
  next patch,

---
 tests/system-dpdk.at | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 655e25ff13..fd4a4b7d73 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -636,9 +636,7 @@ AT_CHECK([grep "VHOST_CONFIG: 
($OVS_RUNDIR/dpdkvhostclient0) vhost-user client:
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' 
mode, using client socket" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"
 \
---vdev="net_tap0,iface=tap0"])
-
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 
 dnl Check default MTU value in the datapath
@@ -683,9 +681,7 @@ AT_CHECK([grep "VHOST_CONFIG: 
($OVS_RUNDIR/dpdkvhostclient0) vhost-user client:
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' 
mode, using client socket" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"
 \
---vdev="net_tap0,iface=tap0"])
-
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 
 dnl Check MTU value in the datapath
@@ -805,8 +801,7 @@ AT_CHECK([ovs-vsctl set Interface dpdkvhostuserclient0 
mtu_request=9702])
 AT_CHECK([ovs-vsctl show], [], [stdout])
 sleep 2
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"
 \
---vdev="net_tap0,iface=tap0"])
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 
@@ -852,8 +847,7 @@ AT_CHECK([grep "VHOST_CONFIG: 
($OVS_RUNDIR/dpdkvhostclient0) vhost-user client:
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' 
mode, using client socket" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) reconnecting..." 
ovs-vswitchd.log], [], [stdout])
 
-OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"
 \
---vdev="net_tap0,iface=tap0"])
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1"])
 
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 
-- 
2.41.0



[ovs-dev] [PATCH v8 5/9] system-dpdk: Refactor OVS daemons helpers.

2023-10-27 Thread David Marchand
Align system-dpdk existing helpers to other common OVS helpers so they
can accept some optional arguments.

Introduce an OVS_DPDK_STOP_VSWITCHD wrapper around OVS_VSWITCHD_STOP to
catch DPDK-related logs in a centralised fashion.

Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
Changes since v6:
- did a minor cleanup on vhost-user client test log pattern (one entry
  concerned vhost-user ports, not vhost-user *client* ports),

---
 tests/system-dpdk-macros.at |  21 -
 tests/system-dpdk.at| 158 +++-
 2 files changed, 82 insertions(+), 97 deletions(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 35d14bee8f..7fedfd6515 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -36,12 +36,13 @@ m4_define([OVS_DPDK_PRE_PHY_SKIP],
 #
 m4_define([OVS_DPDK_START],
   [dnl start ovs dpdk
-   OVS_DPDK_START_OVSDB()
+   OVS_DPDK_START_OVSDB($3)
dnl Enable DPDK functionality
AT_CHECK([ovs-vsctl --no-wait set Open_vSwitch . 
other_config:dpdk-init=true])
-   OVS_DPDK_START_VSWITCHD($1)
+   OVS_DPDK_START_VSWITCHD([$1], [$2])
 ])
 
+
 # OVS_DPDK_START_OVSDB()
 #
 # Create an empty database and start ovsdb-server.
@@ -60,9 +61,10 @@ m4_define([OVS_DPDK_START_OVSDB],
AT_CAPTURE_FILE([ovsdb-server.log])
 
dnl Initialize database.
-   AT_CHECK([ovs-vsctl --no-wait init])
+   AT_CHECK([ovs-vsctl --no-wait init $1])
 ])
 
+
 # OVS_DPDK_START_VSWITCHD()
 #
 # Add special configuration for dpdk-init. Start ovs-vswitchd.
@@ -72,12 +74,23 @@ m4_define([OVS_DPDK_START_VSWITCHD],
AT_CHECK([ovs-vsctl --no-wait set Open_vSwitch . 
other_config:dpdk-extra="--log-level=pmd.*:error $1"])
 
dnl Start ovs-vswitchd.
-   AT_CHECK([ovs-vswitchd --detach --no-chdir --pidfile --log-file -vvconn 
-vofproto_dpif -vunixctl], [0], [stdout], [stderr])
+   AT_CHECK([ovs-vswitchd $2 --detach --no-chdir --pidfile --log-file -vvconn 
-vofproto_dpif -vunixctl], [0], [stdout], [stderr])
AT_CAPTURE_FILE([ovs-vswitchd.log])
on_exit "kill_ovs_vswitchd `cat ovs-vswitchd.pid`"
 ])
 
 
+m4_define([OVS_DPDK_STOP_VSWITCHD],
+  [OVS_VSWITCHD_STOP([dnl
+$1";/does not exist. The Open vSwitch kernel module is probably not loaded./d
+/does not support MTU configuration,/d
+/EAL: No \(available\|free\) .*hugepages reported/d
+/Failed to enable flow control/d
+/Rx checksum offload is not supported on/d
+/TELEMETRY: No legacy callbacks, legacy socket not created/d"])
+])
+
+
 # OVS_DPDK_CHECK_TESTPMD()
 #
 # Check dpdk-testpmd availability.
diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 270587e2c0..655e25ff13 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -3,15 +3,6 @@ m4_define([CONFIGURE_VETH_OFFLOADS],
 
 AT_BANNER([OVS-DPDK unit tests])
 
-m4_define([SYSTEM_DPDK_ALLOWED_LOGS],[
-\@does not exist. The Open vSwitch kernel module is probably not loaded.@d
-\@does not support MTU configuration,@d
-\@EAL: No \(available\|free\) .*hugepages reported@d
-\@Failed to enable flow control@d
-\@Rx checksum offload is not supported on@d
-\@TELEMETRY: No legacy callbacks, legacy socket not created@d
-])
-
 dnl CHECK_MEMPOOL_PARAM([mtu], [numa], [+line])
 dnl
 dnl Waits for logs to indicate that the user has configured a mempool
@@ -36,7 +27,7 @@ OVS_DPDK_START([--no-pci])
 AT_CHECK([grep "DPDK Enabled - initializing..." ovs-vswitchd.log], [], 
[stdout])
 AT_CHECK([grep "EAL" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "DPDK Enabled - initialized" ovs-vswitchd.log], [], [stdout])
-OVS_VSWITCHD_STOP("[SYSTEM_DPDK_ALLOWED_LOGS]")
+OVS_DPDK_STOP_VSWITCHD
 AT_CLEANUP
 dnl --
 
@@ -58,7 +49,7 @@ sleep 2
 
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 phy0], [], [stdout], [stderr])
-OVS_VSWITCHD_STOP("[SYSTEM_DPDK_ALLOWED_LOGS]")
+OVS_DPDK_STOP_VSWITCHD
 AT_CLEANUP
 dnl --
 
@@ -84,9 +75,8 @@ AT_CHECK([grep "VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) 
reconnecting..." ov
 
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
-OVS_VSWITCHD_STOP("m4_join([], [SYSTEM_DPDK_ALLOWED_LOGS], [
-\@VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostclient0) failed to connect: No such file 
or directory@d
-])")
+OVS_DPDK_STOP_VSWITCHD(["dnl
+/VHOST_CONFIG: (.*dpdkvhostclient0) failed to connect: No such file or 
directory/d"])
 AT_CLEANUP
 dnl --
 
@@ -150,12 +140,11 @@ OVS_WAIT_UNTIL([grep "vHost Device 
'$OVS_RUNDIR/dpdkvhostuser0' has been removed
 
 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuser0], [], [stdout], [stderr])
-OVS_VSWITCHD_STOP("m4_join([], [SYSTEM_DPDK_ALLOWED_LOGS], [
-\@VHOST_CONFIG: ($OVS_RUNDIR/dpdkvhostuser0) recvmsg

[ovs-dev] [PATCH v8 4/9] tests: Define a macro to skip tc relying tests.

2023-10-27 Thread David Marchand
Some unit tests expect that an OVS port has an associated netdevice on
which they can hook tc.
This will not be possible when testing the userspace datapath with DPDK.
Introduce a helper (which will be overridden in system-dpdk tests) and
use it in the existing tests.

Acked-by: Aaron Conole 
Signed-off-by: David Marchand 
Acked-by: Eelco Chaudron 
---
Changes since v4:
- as the traffic tests now use net/af_xdp, it is not required to skip
  tests relying on tcpdump anymore,

---
 tests/system-common-macros.at| 6 ++
 tests/system-offloads-traffic.at | 6 +++---
 tests/system-traffic.at  | 6 +++---
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 0077a8609c..0113aae8bd 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -297,6 +297,12 @@ m4_define([OVS_START_L7],
 #
 m4_define([OFPROTO_CLEAR_DURATION_IDLE], [[sed -e 
's/duration=.*s,/duration=,/g' -e 
's/idle_age=[0-9]*,/idle_age=,/g']])
 
+# OVS_CHECK_TC_QDISC()
+#
+# Macro to skip tests when tc qdisc can't be applied on a OVS port.
+m4_define([OVS_CHECK_TC_QDISC],
+[AT_SKIP_IF([test $HAVE_TC = no])])
+
 # OVS_CHECK_TUNNEL_TSO()
 #
 # Macro to be used in general tunneling tests that could be also
diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index 5ad6b4bfdf..0bedee7530 100644
--- a/tests/system-offloads-traffic.at
+++ b/tests/system-offloads-traffic.at
@@ -20,7 +20,7 @@ m4_define([OVS_CHECK_ACTIONS], [
 
 m4_define([CHECK_TC_INGRESS_PPS],
 [
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 AT_CHECK([ip link add ovs_tc_pps0 type veth peer name ovs_tc_pps1 dnl
   || exit 77])
 on_exit 'ip link del ovs_tc_pps0'
@@ -95,7 +95,7 @@ AT_CLEANUP
 
 AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - 
offloads disabled])
 AT_KEYWORDS([ingress_policing])
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START()
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-offload=false])
 AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
@@ -118,7 +118,7 @@ AT_CLEANUP
 
 AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - 
offloads enabled])
 AT_KEYWORDS([ingress_policing])
-AT_SKIP_IF([test $HAVE_TC = "no"])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . 
other_config:hw-offload=true])
 AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
 ADD_NAMESPACES(at_ns0)
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 7ea4502028..a7d4ed83bd 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2321,7 +2321,7 @@ AT_CLEANUP
 AT_BANNER([QoS])
 
 AT_SETUP([QoS - basic configuration])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 OVS_TRAFFIC_VSWITCHD_START()
 
 ADD_NAMESPACES(at_ns0, at_ns1)
@@ -2355,7 +2355,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
 AT_SETUP([QoS - 64bit])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 AT_SKIP_IF([test $HAVE_TCA_HTB_RATE64 = no])
 OVS_TRAFFIC_VSWITCHD_START()
 
@@ -2383,7 +2383,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
 AT_SETUP([Ingress Policing - 64-bit])
-AT_SKIP_IF([test $HAVE_TC = no])
+OVS_CHECK_TC_QDISC()
 AT_SKIP_IF([test $HAVE_TCA_POLICE_PKTRATE64 = no])
 OVS_TRAFFIC_VSWITCHD_START()
 ADD_NAMESPACES(ns0)
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 3/9] ci: Run DPDK tests in GitHub Actions.

2023-10-27 Thread David Marchand
Let's enhance our coverage in the CI and run DPDK system tests.

A few DPDK drivers are enabled in DPDK compilation.

Put DPDK build in $PATH for dpdk-testpmd to be available.
sudo drops PATH= updates and -E alone does not seem to preserve this
variable.
Pass PATH=$PATH when running the tests, as a workaround.
Since those tests are run as root, the collection of logs is updated
accordingly.

In GHA, only two cores are available but some tests rely on testpmd using
three lcores.
Add a DPDK_EAL_OPTIONS environment variable and use it to map all
testpmd lcores to core 1 (and leave core 0 alone for OVS main and PMD
threads).

Signed-off-by: David Marchand 
Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
---
Changes since v1:
- rebased after DPDK build has been moved out of linux-build.sh,
- restored running "normal" checks in the DPDK jobs,

---
 .ci/dpdk-build.sh|  7 ---
 .ci/linux-build.sh   | 15 ++-
 .github/workflows/build-and-test.yml |  7 ---
 tests/system-dpdk-macros.at  |  2 +-
 4 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/.ci/dpdk-build.sh b/.ci/dpdk-build.sh
index 02dcefef61..35540f0694 100755
--- a/.ci/dpdk-build.sh
+++ b/.ci/dpdk-build.sh
@@ -35,9 +35,10 @@ function build_dpdk()
 DPDK_OPTS="$DPDK_OPTS -Ddeveloper_mode=disabled"
 
 # OVS compilation and "normal" unit tests (run in the CI) do not depend on
-# any DPDK driver being present.
-# We can disable all drivers to save compilation time.
-DPDK_OPTS="$DPDK_OPTS -Ddisable_drivers=*/*"
+# any DPDK driver.
+# check-dpdk unit tests requires testpmd and some net/ driver.
+DPDK_OPTS="$DPDK_OPTS -Denable_apps=test-pmd"
+DPDK_OPTS="$DPDK_OPTS -Denable_drivers=net/null,net/tap,net/virtio"
 
 # Install DPDK using prefix.
 DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build"
diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 8227a57487..aa2ecc5050 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -22,6 +22,9 @@ function install_dpdk()
 # Export the following path for pkg-config to find the .pc file.
 export PKG_CONFIG_PATH=$DPDK_LIB/pkgconfig/:$PKG_CONFIG_PATH
 
+# Expose dpdk binaries.
+export PATH=$(pwd)/dpdk-dir/build/bin:$PATH
+
 if [ ! -f "${VERSION_FILE}" ]; then
 echo "Could not find DPDK in $(pwd)/dpdk-dir"
 return 1
@@ -113,7 +116,7 @@ fi
 
 OPTS="${EXTRA_OPTS} ${OPTS} $*"
 
-if [ "$TESTSUITE" ]; then
+if [ "$TESTSUITE" = 'test' ]; then
 # 'distcheck' will reconfigure with required options.
 # Now we only need to prepare the Makefile without sparse-wrapped CC.
 configure_ovs
@@ -123,6 +126,16 @@ if [ "$TESTSUITE" ]; then
 TESTSUITEFLAGS=-j4 RECHECK=yes
 else
 build_ovs
+for testsuite in $TESTSUITE; do
+run_as_root=
+if [ "${testsuite##*dpdk}" != "$testsuite" ]; then
+sudo sh -c 'echo 1024 > /proc/sys/vm/nr_hugepages' || true
+[ "$(cat /proc/sys/vm/nr_hugepages)" = '1024' ]
+export DPDK_EAL_OPTIONS="--lcores 0@1,1@1,2@1"
+run_as_root="sudo -E PATH=$PATH"
+fi
+$run_as_root make $testsuite TESTSUITEFLAGS=-j4 RECHECK=yes
+done
 fi
 
 exit 0
diff --git a/.github/workflows/build-and-test.yml 
b/.github/workflows/build-and-test.yml
index bc5494e863..4f62efb7c3 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -123,10 +123,10 @@ jobs:
 opts: --enable-shared
 
   - compiler: gcc
-testsuite:test
+testsuite:check check-dpdk
 dpdk: dpdk
   - compiler: clang
-testsuite:test
+testsuite:check check-dpdk
 dpdk: dpdk
 
   - compiler: gcc
@@ -213,7 +213,8 @@ jobs:
 mkdir logs
 cp config.log ./logs/
 cp -r ./*/_build/sub/tests/testsuite.* ./logs/ || true
-tar -czvf logs.tgz logs/
+sudo cp -r ./tests/*testsuite.* ./logs/ || true
+sudo tar -czvf logs.tgz logs/
 
 - name: upload logs on failure
   if: failure() || cancelled()
diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index a176a57a4b..35d14bee8f 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -94,7 +94,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
 m4_define([OVS_DPDK_START_TESTPMD],
   [AT_CHECK([lscpu], [], [stdout])
AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
-   eal_options="--in-memory --socket-mem="$(cat NUMA_NODE)" 
--single-file-segments --no-pci"
+   eal_options="

[ovs-dev] [PATCH v8 2/9] system-dpdk: Don't require hugetlbfs.

2023-10-27 Thread David Marchand
dpdk-testpmd does not need hugetlbfs backing as we don't require
multiprocess support in OVS unit tests.

Switch to --in-memory and remove the (then unneeded) check on
hugetlbfs presence.

Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
Signed-off-by: David Marchand 
---
Changes since v7:
- reverted use of dynamic allocations and kept initial memory
  reservation: this avoids random failures when testpmd and ovs-dpdk
  try to allocate memory at the same time,

---
 tests/system-dpdk-macros.at | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 2cfd26d840..a176a57a4b 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -7,9 +7,6 @@ m4_define([OVS_DPDK_PRE_CHECK],
   [dnl Check Hugepages
AT_CHECK([cat /proc/meminfo], [], [stdout])
AT_SKIP_IF([grep -E 'HugePages_Free: *0' stdout], [], [stdout])
-   AT_CHECK([mount], [], [stdout])
-   AT_CHECK([grep 'hugetlbfs' stdout], [], [stdout], [])
-
 ])
 
 
@@ -97,7 +94,7 @@ m4_define([OVS_DPDK_CHECK_TESTPMD],
 m4_define([OVS_DPDK_START_TESTPMD],
   [AT_CHECK([lscpu], [], [stdout])
AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
-   eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 
--single-file-segments --no-pci"
+   eal_options="--in-memory --socket-mem="$(cat NUMA_NODE)" 
--single-file-segments --no-pci"
options="$1"
test "$options" != "${options%% -- *}" || options="$options -- "
eal_options="$eal_options ${options%% -- *}"
-- 
2.41.0



[ovs-dev] [PATCH v8 1/9] system-dpdk: Introduce helpers for testpmd.

2023-10-27 Thread David Marchand
Rather than copy/paste everywhere, introduce helpers to control
testpmd runs.
Rely on --stats-period (which outputs port stats every n seconds) so that
testpmd keeps running without expecting any user input.

Acked-by: Aaron Conole 
Acked-by: Eelco Chaudron 
Signed-off-by: David Marchand 
---
Changes since v7:
- fixed -- detection in arguments passed to OVS_DPDK_START_TESTPMD,
- wrote the testpmd command to a debug file,

Changes since v1:
- fixed OVS_DPDK_START_TESTPMD passed arguments evaluation: $@ -> $1,

---
 tests/system-dpdk-macros.at |  38 +
 tests/system-dpdk.at| 103 +---
 2 files changed, 62 insertions(+), 79 deletions(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 3920f08a5e..2cfd26d840 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -79,3 +79,41 @@ m4_define([OVS_DPDK_START_VSWITCHD],
AT_CAPTURE_FILE([ovs-vswitchd.log])
on_exit "kill_ovs_vswitchd `cat ovs-vswitchd.pid`"
 ])
+
+
+# OVS_DPDK_CHECK_TESTPMD()
+#
+# Check dpdk-testpmd availability.
+#
+m4_define([OVS_DPDK_CHECK_TESTPMD],
+  [AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
+])
+
+
+# OVS_DPDK_START_TESTPMD()
+#
+# Start dpdk-testpmd in background.
+#
+m4_define([OVS_DPDK_START_TESTPMD],
+  [AT_CHECK([lscpu], [], [stdout])
+   AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
+   eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 
--single-file-segments --no-pci"
+   options="$1"
+   test "$options" != "${options%% -- *}" || options="$options -- "
+   eal_options="$eal_options ${options%% -- *}"
+   testpmd_options="-a --stats-period 2 ${options#* -- }"
+   echo "dpdk-testpmd $eal_options -- $testpmd_options" >testpmd.cmd
+   dpdk-testpmd $eal_options -- $testpmd_options >testpmd.log 2>&1 & \
+   echo $! > testpmd.pid
+   on_exit "kill -9 `cat testpmd.pid`"
+])
+
+
+# OVS_DPDK_STOP_TESTPMD()
+#
+# Stop background dpdk-testpmd.
+#
+m4_define([OVS_DPDK_STOP_TESTPMD],
+  [AT_CHECK([kill `cat testpmd.pid`])
+   OVS_WAIT([kill -0 `cat testpmd.pid`], [kill -9 `cat testpmd.pid`])
+])
diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 0f58e85742..270587e2c0 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -97,13 +97,9 @@ dnl Ping vhost-user port
 AT_SETUP([OVS-DPDK - ping vhost-user ports])
 AT_KEYWORDS([dpdk])
 OVS_DPDK_PRE_CHECK()
-AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
+OVS_DPDK_CHECK_TESTPMD()
 OVS_DPDK_START([--no-pci])
 
-dnl Find number of sockets
-AT_CHECK([lscpu], [], [stdout])
-AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
-
 dnl Add userspace bridge and attach it to OVS
 AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
 AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuser0 -- set Interface 
dpdkvhostuser0 \
@@ -125,12 +121,8 @@ ADD_NAMESPACES(ns1, ns2)
 dnl Add veth device
 ADD_VETH(tap1, ns2, br10, "172.31.110.12/24")
 
-dnl Execute testpmd in background
-on_exit "pkill -f -x -9 'tail -f /dev/null'"
-tail -f /dev/null | dpdk-testpmd --socket-mem="$(cat NUMA_NODE)" --no-pci\
-   --vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostuser0" \
-   --vdev="net_tap0,iface=tap0" --file-prefix page0 \
-   --single-file-segments -- -a 
>$OVS_RUNDIR/testpmd-dpdkvhostuser0.log 2>&1 &
+OVS_DPDK_START_TESTPMD([--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostuser0"
 \
+--vdev="net_tap0,iface=tap0"])
 
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 OVS_WAIT_UNTIL([ip link show dev tap0 | grep -qw LOWER_UP])
@@ -151,8 +143,7 @@ AT_CHECK([ip netns exec ns2 ip link show], [], [stdout], 
[stderr])
 AT_CHECK([ip netns exec ns1 ping -c 4 -I tap0 172.31.110.12], [], [stdout],
  [stderr])
 
-dnl Clean up the testpmd now
-pkill -f -x -9 'tail -f /dev/null'
+OVS_DPDK_STOP_TESTPMD()
 
 dnl Wait for vhost-user handling the socket disconnect.
 OVS_WAIT_UNTIL([grep "vHost Device '$OVS_RUNDIR/dpdkvhostuser0' has been 
removed" ovs-vswitchd.log])
@@ -173,13 +164,9 @@ dnl Ping vhost-user-client port
 AT_SETUP([OVS-DPDK - ping vhost-user-client ports])
 AT_KEYWORDS([dpdk])
 OVS_DPDK_PRE_CHECK()
-AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
+OVS_DPDK_CHECK_TESTPMD()
 OVS_DPDK_START([--no-pci])
 
-dnl Find number of sockets
-AT_CHECK([lscpu], [], [stdout])
-AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' >

Re: [ovs-dev] [PATCH v7 8/8] system-dpdk: Run traffic tests.

2023-10-27 Thread David Marchand
On Fri, Oct 27, 2023 at 1:45 PM Eelco Chaudron  wrote:
> On 23 Oct 2023, at 10:18, David Marchand wrote:
>
> > Integrate system-traffic.at tests as part of check-dpdk.
> >
> > Some tests that can't work with the userspace datapath are skipped by
> > overriding some OVS_CHECK_* macros.
> >
> > ADD_VETH is implemented using the net/af_xdp DPDK driver.
> >
> > Signed-off-by: David Marchand 
>
> This patch looks as good as it did last time, so
>
> Acked-by: Eelco Chaudron 

Thanks.


>
> I assume you will do one more revision based on some feedback on the first 
> patch in the series.

Yes, I will send one more revision.

Apart from the comments on patch1, I have another issue to handle.
We hit some random yet frequent issues because of the previous patch
that reorders del-port and testpmd shutdown: this reordering opens a
little window during which some packets may be received by a pmd while
the port is deleted (see thread
https://mail.openvswitch.org/pipermail/ovs-dev/2023-October/408956.html).


-- 
David Marchand



Re: [ovs-dev] [PATCH] readthedocs: Use dirhtml builder.

2023-10-27 Thread David Marchand
On Thu, Oct 26, 2023 at 7:54 PM Ilya Maximets  wrote:
>
> We used this builder before, but from the project configuration
> on the website.  ReadTheDocs doesn't allow to change it there
> anymore and it doesn't allow to see the full name of the previously
> used builder (!!), so I failed to migrate it to the config file.
>
> The result is that older link like:
>   https://docs.openvswitch.org/en/latest/howto/dpdk/
> Now require .html:
>   https://docs.openvswitch.org/en/latest/howto/dpdk.html
>
> Fixing now by switching the builder back.
>
> Fixes: e388bd73b70d ("readthedocs: Add the configuration file.")
> Reported-by: Antonin Bas 
> Reported-by: David Marchand 
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/310
> Signed-off-by: Ilya Maximets 
> ---
>
> The version of the docs with the change applied can be
> temporarily seen here:
>https://igsilya-ovs.readthedocs.io/en/latest/
>
>  .readthedocs.yaml | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/.readthedocs.yaml b/.readthedocs.yaml
> index e481e64f1..7d505150e 100644
> --- a/.readthedocs.yaml
> +++ b/.readthedocs.yaml
> @@ -14,6 +14,7 @@ build:
>  # Build documentation in the "Documentation/" directory with Sphinx.
>  sphinx:
>configuration: Documentation/conf.py
> +  builder: "dirhtml"
>
>  # Build all formats: HTML, PDF, ePub.
>  formats: all
> --
> 2.41.0
>

Reviewed-by: David Marchand 


-- 
David Marchand



Re: [ovs-dev] [PATCH v7 1/8] system-dpdk: Introduce helpers for testpmd.

2023-10-26 Thread David Marchand
On Thu, Oct 26, 2023 at 10:10 AM Frode Nordahl
 wrote:
>
> On Mon, Oct 23, 2023 at 10:19 AM David Marchand
>  wrote:
> >
> > Rather than copy/paste everywhere, introduce helpers to control
> > testpmd runs.
> > Rely on --stats-period (which outputs port stats every n seconds) so that
> > testpmd keeps running without expecting any user input.
> >
> > Signed-off-by: David Marchand 
> > Acked-by: Aaron Conole 
> > Acked-by: Eelco Chaudron 
> > ---
> > Changes since v1:
> > - fixed OVS_DPDK_START_TESTPMD passed arguments evaluation: $@ -> $1,
>
> Thanks for working on this, is there a plan to backport this?

I did not test extensively with earlier versions of DPDK, but I would
expect it to work.
I am all for getting those unit tests run in previous branches.


>
> It appears that there is currently an issue with dpdk-testpmd that
> makes it exit immediately when run in non-interactive mode [0], while
> this is an upstream DPDK bug, it does affect the system-dpdk testsuite
> for released versions of OVS. I suspect this DPDK commit broke it [1].
>
> 0: https://bugs.launchpad.net/bugs/2040097
> 1: 
> https://github.com/DPDK/dpdk/commit/0fd1386c30c3ad9365d7fdd2829bf7cb2e1b9dff

Yes [1] changed testpmd behavior.
The issue seems related to the tap driver as I don't reproduce the
early quit when using some other virtual driver.
I opened an upstream bug: https://bugs.dpdk.org/show_bug.cgi?id=1305.


-- 
David Marchand



Re: [ovs-dev] [PATCH v7 1/8] system-dpdk: Introduce helpers for testpmd.

2023-10-25 Thread David Marchand
On Wed, Oct 25, 2023 at 2:50 PM Aaron Conole  wrote:
>
> David Marchand  writes:
>
> > On Mon, Oct 23, 2023 at 10:20 AM David Marchand
> >  wrote:
> >> +# OVS_DPDK_CHECK_TESTPMD()
> >> +#
> >> +# Check dpdk-testpmd availability.
> >> +#
> >> +m4_define([OVS_DPDK_CHECK_TESTPMD],
> >> +  [AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
> >> +])
> >> +
> >> +
> >> +# OVS_DPDK_START_TESTPMD()
> >> +#
> >> +# Start dpdk-testpmd in background.
> >> +#
> >> +m4_define([OVS_DPDK_START_TESTPMD],
> >> +  [AT_CHECK([lscpu], [], [stdout])
> >> + AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while
> >> (c++<$(3)) {printf "512,"}; print "512"}' > NUMA_NODE])
> >> +   eal_options="--socket-mem="$(cat NUMA_NODE)" --file-prefix page0 
> >> --single-file-segments --no-pci"
> >> +   options="$1"
> >> +   [ "$options" != "${options%% -- *}" ] || options="$options -- "
> >
> > I realised, looking at a generated dpdk testsuite file (while trying
> > to understand a Intel CI failure), that this syntax above is wrong.
> >
> > It is not a big problem, since testpmd (/getopt) does not complain
> > about such a trailing --.
> > Yet, better to avoid [] and instead use a "if test ...; then ...; fi" 
> > construct.
>
> I think we need to use [[...]] to get the behavior you intend under m4,
> but also we can use test (since '[' is usually either an alias or a
> reimplementation of 'test').

I don't mind.
I see both uses of "test " and some [[]] in tests/.
But this file only used test so far... so I would tend to go with it.
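
For reference, the -- detection discussed above can be exercised outside
of autotest. This is only a sketch: split_testpmd_options, eal_part and
app_part are made-up names for illustration, they do not exist in the
tree. It sticks to the plain "test" form, which m4 leaves untouched,
whereas a bare [ ... ] is consumed as m4 quotes unless written [[ ... ]].

```shell
#!/bin/sh
# Hypothetical helper (illustration only): split one argument string into
# the part before " -- " (EAL options) and the part after it (testpmd
# options), the same way the macro under review does.
split_testpmd_options() {
    options="$1"
    # If stripping the longest " -- *" suffix changes nothing, there is
    # no separator: append one so both expansions below stay well-defined.
    test "$options" != "${options%% -- *}" || options="$options -- "
    eal_part="${options%% -- *}"   # everything before the first " -- "
    app_part="${options#* -- }"    # everything after the first " -- "
}

split_testpmd_options '--vdev=net_tap0,iface=tap0'
echo "eal=[$eal_part] app=[$app_part]"
# prints: eal=[--vdev=net_tap0,iface=tap0] app=[]

split_testpmd_options '--vdev=net_tap0,iface=tap0 -- --nb-cores=2'
echo "eal=[$eal_part] app=[$app_part]"
# prints: eal=[--vdev=net_tap0,iface=tap0] app=[--nb-cores=2]
```

With no separator in the input, the testpmd part simply comes out empty,
so the caller can always emit "eal options -- testpmd options".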


-- 
David Marchand



Re: [ovs-dev] [ovs-build] |fail| pw1853561 [ovs-dev, v7, 8/8] system-dpdk: Run traffic tests.

2023-10-25 Thread David Marchand
On Wed, Oct 25, 2023 at 1:18 PM Ilya Maximets  wrote:
> On 10/25/23 12:09, David Marchand wrote:
> >>> 2023-10-23T15:02:13.756Z|00082|bridge|INFO|bridge br10: deleted interface 
> >>> dpdkvhostuserclient0 on port 1
> >>> 2023-10-23T15:02:13.756Z|00083|dpif_netdev|INFO|PMD thread on numa_id: 1, 
> >>> core id: 88 destroyed.
> >>> 2023-10-23T15:02:13.772Z|2|dpdk(pmd-c88/id:103)|INFO|PMD thread 
> >>> released DPDK lcore 2.
> >>> 2023-10-23T15:02:13.778Z|00084|dpif_netdev|INFO|PMD thread on numa_id: 0, 
> >>> core id: 21 destroyed.
> >>> 2023-10-23T15:02:13.778Z|2|ofproto_dpif_xlate(pmd-c21/id:102)|WARN|received
> >>>  packet on unknown port 1 on bridge br10 while processing 
> >>> icmp6,in_port=1,vlan_tci=0x,dl_src=ca:76:e9:ff:a2:09,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::c876:e9ff:feff:a209,ipv6_dst=ff02::2,ipv6_label=0x0,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=133,icmp_code=0
> >>> 2023-10-23T15:02:13.791Z|3|dpdk(pmd-c21/id:102)|INFO|PMD thread 
> >>> released DPDK lcore 1.
> >>> 2023-10-23T15:02:13.801Z|00085|dpdk|INFO|VHOST_CONFIG: 
> >>> (/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0) free 
> >>> connfd 95
> >>> 2023-10-23T15:02:13.801Z|00086|netdev_dpdk|INFO|vHost Device 
> >>> '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' not 
> >>> found
> >
> > I am a bit puzzled at this report.
> > It is similar to
> > https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/396325.html.
> >
> > I understand this shows a race in the OVS cleanup sequence, with some
> > packet (triggering an upcall) received by a pmd on a port that is not
> > referenced in the ofproto bridge anymore.
> > Why did it show up again? This is probably due to my patch 7 in the v7
> > series which lets testpmd send packets while deleting the vhu port.
> >
> > The easiest (laziest?) for me is probably to drop this patch 7 and
> > instead waive warnings about a vhu socket reconnection...
>
> The packets are coming from the kernel interface on the other side
> of testpmd, right?  In that case, can we just bring that interface
> down before removing OVS port to prevent random ipv6 traffic from
> flowing around?  Another similar option might be to set admin state
> DOWN on the OVS side for the vhost-user port.

Bringing the tap iface down should do the job, yes.
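
To make the ordering concrete, here is a rough sketch, not a tested
change to the testsuite. tap0, br10 and dpdkvhostuserclient0 are the
names seen in the tests and logs of this thread; quiesce_then_delete and
the RUN variable are made up for illustration. By default it only prints
the commands instead of running them.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Export RUN as an empty string to really run them (needs root and a
# running OVS).
RUN=${RUN-echo}

# Hypothetical helper: stop traffic at its source, then delete the port.
quiesce_then_delete() {
    tap="$1"; bridge="$2"; port="$3"
    # 1. Bring the kernel side down so no stray IPv6 packet (router
    #    solicitation and the like) reaches the pmd anymore.
    $RUN ip link set dev "$tap" down
    # 2. Only then remove the OVS port.
    $RUN ovs-vsctl del-port "$bridge" "$port"
}

quiesce_then_delete tap0 br10 dpdkvhostuserclient0
```

The point is only the order of the two steps; in the testsuite each
command would go through AT_CHECK as usual.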

But now I wonder why we need such a setup with testpmd + a tap in the
mtu unit tests: no packet is being actively injected by the unit tests
themselves.

I get that testpmd will make sure that the vhost-user client port is
running in a "nominal" situation when changing the mtu, so ok to keep
it.
But can we remove those tap ifaces from testpmd (for those MTU tests)?


>
> > But I find it strange that there is a window in which OVS pmd threads
> > still poll packets (and complain) while the ports are being removed.
>
> OpenFlow ports are getting removed before their backing datapath ports,
> so there is always a small window where packets can arrive on datapath
> ports that do not have associated OpenFlow port numbers anymore.
> Reversing this might not be an option due to reference counting, but I
> don't remember exactly.
>
> Same applies to upcalls in kernel datapath, because packets can be queued
> for upcall while the port is getting removed.  And it's even trickier to
> fix that for a kernel, because it's done fully asynchronously.

Ok, thanks for the context / explanations.


-- 
David Marchand



Re: [ovs-dev] [ovs-build] |fail| pw1853561 [ovs-dev, v7, 8/8] system-dpdk: Run traffic tests.

2023-10-25 Thread David Marchand
Forwarding to dev@

On Mon, Oct 23, 2023 at 6:05 PM  wrote:
> > 2023-10-23T15:02:12.622Z|00063|dpdk|INFO|VHOST_CONFIG: 
> > (/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0) virtio 
> > is now ready for processing.
> > 2023-10-23T15:02:12.622Z|00064|netdev_dpdk|INFO|vHost Device 
> > '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' has 
> > been added on numa node 0
> > 2023-10-23T15:02:13.592Z|00074|dpif_netdev|INFO|Performing pmd to rx queue 
> > assignment using cycles algorithm.
> > 2023-10-23T15:02:13.592Z|00075|dpif_netdev|INFO|Core 21 on numa node 0 
> > assigned port 'dpdkvhostuserclient0' rx queue 0 (measured processing cycles 
> > 0).
> > 2023-10-23T15:02:13.592Z|1|netdev_dpdk(ovs_vhost2)|INFO|State of queue 
> > 0 ( tx_qid 0 ) of vhost device 
> > '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' 
> > changed to 'enabled'
> > 2023-10-23T15:02:13.592Z|2|netdev_dpdk(ovs_vhost2)|INFO|State of queue 
> > 1 ( rx_qid 0 ) of vhost device 
> > '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' 
> > changed to 'enabled'
> > 2023-10-23T15:02:13.595Z|00076|unixctl|DBG|received request dpctl/show[], 
> > id=0
> > 2023-10-23T15:02:13.596Z|00077|unixctl|DBG|replying with success, id=0: 
> > "netdev@ovs-netdev:
> >   lookups: hit:0 missed:2 lost:0
> >   flows: 2
> >   port 0: ovs-netdev (tap)
> >   port 1: br10 (tap)
> >   port 2: dpdkvhostuserclient0 (dpdkvhostuserclient: 
> > configured_rx_queues=1, configured_tx_queues=1, mtu=9000, 
> > requested_rx_queues=1, requested_tx_queues=1)
> > "
> > 2023-10-23T15:02:13.715Z|00078|dpif_netdev|INFO|Performing pmd to rx queue 
> > assignment using cycles algorithm.
> > 2023-10-23T15:02:13.715Z|00079|dpif_netdev|INFO|Core 21 on numa node 0 
> > assigned port 'dpdkvhostuserclient0' rx queue 0 (measured processing cycles 
> > 0).
> > 2023-10-23T15:02:13.728Z|00080|unixctl|DBG|received request dpctl/show[], 
> > id=0
> > 2023-10-23T15:02:13.728Z|00081|unixctl|DBG|replying with success, id=0: 
> > "netdev@ovs-netdev:
> >   lookups: hit:0 missed:2 lost:0
> >   flows: 2
> >   port 0: ovs-netdev (tap)
> >   port 1: br10 (tap)
> >   port 2: dpdkvhostuserclient0 (dpdkvhostuserclient: 
> > configured_rx_queues=1, configured_tx_queues=1, mtu=2000, 
> > requested_rx_queues=1, requested_tx_queues=1)
> > "
> > 2023-10-23T15:02:13.756Z|00082|bridge|INFO|bridge br10: deleted interface 
> > dpdkvhostuserclient0 on port 1
> > 2023-10-23T15:02:13.756Z|00083|dpif_netdev|INFO|PMD thread on numa_id: 1, 
> > core id: 88 destroyed.
> > 2023-10-23T15:02:13.772Z|2|dpdk(pmd-c88/id:103)|INFO|PMD thread 
> > released DPDK lcore 2.
> > 2023-10-23T15:02:13.778Z|00084|dpif_netdev|INFO|PMD thread on numa_id: 0, 
> > core id: 21 destroyed.
> > 2023-10-23T15:02:13.778Z|2|ofproto_dpif_xlate(pmd-c21/id:102)|WARN|received
> >  packet on unknown port 1 on bridge br10 while processing 
> > icmp6,in_port=1,vlan_tci=0x,dl_src=ca:76:e9:ff:a2:09,dl_dst=33:33:00:00:00:02,ipv6_src=fe80::c876:e9ff:feff:a209,ipv6_dst=ff02::2,ipv6_label=0x0,nw_tos=0,nw_ecn=0,nw_ttl=255,nw_frag=no,icmp_type=133,icmp_code=0
> > 2023-10-23T15:02:13.791Z|3|dpdk(pmd-c21/id:102)|INFO|PMD thread 
> > released DPDK lcore 1.
> > 2023-10-23T15:02:13.801Z|00085|dpdk|INFO|VHOST_CONFIG: 
> > (/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0) free 
> > connfd 95
> > 2023-10-23T15:02:13.801Z|00086|netdev_dpdk|INFO|vHost Device 
> > '/root/ovs-dev/tests/system-dpdk-testsuite.dir/017/dpdkvhostclient0' not 
> > found

I am a bit puzzled at this report.
It is similar to
https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/396325.html.

I understand this shows a race in the OVS cleanup sequence, with some
packet (triggering an upcall) received by a pmd on a port that is not
referenced in the ofproto bridge anymore.
Why did it show up again? This is probably due to my patch 7 in the v7
series which lets testpmd send packets while deleting the vhu port.

The easiest (laziest?) for me is probably to drop this patch 7 and
instead waive warnings about a vhu socket reconnection...
But I find it strange that there is a window in which OVS pmd threads
still poll packets (and complain) while the ports are being removed.

Opinions?

-- 
David Marchand


