date:20211207

[ovs-dev] [PATCH v3] acinclude: Provide better error info when linking fails with DPDK.

2021-12-07 Thread Sunil Pai G

Currently, on failure to link with DPDK, the configure script provides
an error message to update the PKG_CONFIG_PATH even though the cause of
failure was missing dependencies. Improve the error message to include this
scenario.

Signed-off-by: Sunil Pai G 
---
v2-> v3: Fix sentence.
v1-> v2: Improve logging instead of printing contents from config.log
---
 acinclude.m4 | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/acinclude.m4 b/acinclude.m4
index dba365ea1..15391e68d 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -486,7 +486,8 @@ AC_DEFUN([OVS_CHECK_DPDK], [
DPDKLIB_FOUND=true],
   [AC_MSG_RESULT([no])
AC_MSG_ERROR(m4_normalize([
-  Could not find DPDK library in default search path, update
+  Failed to link with DPDK, check config.log for more details.
+  Could not find working DPDK library in default search path, update
   PKG_CONFIG_PATH for pkg-config to find the .pc file in
   non-standard location]))
   ])
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v2] acinclude: Provide better error info when linking fails with DPDK.

2021-12-07 Thread Sunil Pai G

Currently, on failure to link with DPDK, the configure script provides
an error message to update the PKG_CONFIG_PATH even though the cause of
failure was missing dependencies. Improve the error message to include this
scenario.

Signed-off-by: Sunil Pai G 
---
 acinclude.m4 | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/acinclude.m4 b/acinclude.m4
index dba365ea1..cadf71cc1 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -486,7 +486,8 @@ AC_DEFUN([OVS_CHECK_DPDK], [
DPDKLIB_FOUND=true],
   [AC_MSG_RESULT([no])
AC_MSG_ERROR(m4_normalize([
-  Could not find DPDK library in default search path, update
+  Failed to link to with DPDK, check config.log for more details.
+  Could not find working DPDK library in default search path, update
   PKG_CONFIG_PATH for pkg-config to find the .pc file in
   non-standard location]))
   ])
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v4] netdev-vport : Fix userspace tunnel ioctl(SIOCGIFINDEX) info logs.

2021-12-07 Thread 0-day Robot

Bleep bloop.  Greetings lin huang, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Author linhuang  needs to sign off.
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Lin Huang 
Lines checked: 59, Warnings: 1, Errors: 1


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v4] netdev-vport : Fix userspace tunnel ioctl(SIOCGIFINDEX) info logs.

2021-12-07 Thread lin huang

From: linhuang 

Userspace tunnel doesn't have a valid device in the kernel. So
get_ifindex() function (ioctl) always get error during
adding a port, deleting a port or updating a port status.

The info log is
"2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX)
on vxlan_sys_4789 device failed: No such device"

If there are a lot of userspace tunnel ports on a bridge, the
iface_refresh_netdev_status() function will spend a lot of time.

So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just
return -ENODEV.

Signed-off-by: Lin Huang 
Test-by: Mike Pattrick 
---
 lib/netdev-vport.c | 6 ++
 vswitchd/bridge.c  | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 499c0291c9..64331f74bf 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -1151,6 +1151,12 @@ netdev_vport_get_ifindex(const struct netdev *netdev_)
 {
 char buf[NETDEV_VPORT_NAME_BUFSIZE];
 const char *name = netdev_vport_get_dpif_port(netdev_, buf, sizeof(buf));
+const char *dpif_type = netdev_get_dpif_type(netdev_);
+
+if (dpif_type && strcmp(dpif_type, "system")) {
+/* Not a system device. */
+return -ENODEV;
+}

 return linux_get_ifindex(name);
 }
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 5223aa8970..513ef7ea9c 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -2052,6 +2052,8 @@ iface_do_create(const struct bridge *br,
 goto error;
 }

+netdev_set_dpif_type(netdev, br->ofproto->type);
+
 error = iface_set_netdev_config(iface_cfg, netdev, errp);
 if (error) {
 goto error;
--
2.27.0
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 0/4] dpif-netdev: Hash-based Tx packet steering

2021-12-07 Thread Ilya Maximets

On 11/24/21 22:23, Maxime Coquelin wrote:
> This series introduces a new HXPS Tx mode alognside existing
> XPS and static modes. The goal is to provide a mode where all
> the transmit queues are used, whatever the number of PMD
> threads. This may be used with Vhost-user ports, where the
> guest application driving the Virtio device expects packets
> to be distributed on all the queues.
> 
> As a preliminary step, in order to be able to validate the
> feature at OVS level, the first patch introduces per-queue
> basic statistics for Vhost-user ports. This patch is
> complementary to David's patch [0] adding per-queue
> statistics to DPDK ports using xstats.
> 
> The series also introduces two DPDK tests for Vhost-user
> multiqueue, with and without HXPS enabled.
> 
> Maxime Coquelin (4):
>   netdev-dpdk: Introduce per rxq/txq Vhost-user statistics
>   dpif-netdev: Introduce Tx queue mode
>   dpif-netdev: Add HXPS Tx queue mode
>   system-dpdk: Add tests for HXPS
> 
>  Documentation/automake.mk   |   1 +
>  Documentation/topics/dpdk/hxps.rst  |  51 ++
>  Documentation/topics/dpdk/index.rst |   1 +
>  lib/dpif-netdev.c   |  95 ++
>  lib/netdev-dpdk.c   | 143 ++--
>  tests/system-dpdk.at| 135 ++
>  6 files changed, 399 insertions(+), 27 deletions(-)
>  create mode 100644 Documentation/topics/dpdk/hxps.rst
> 

Hi, Maxime.  Thanks for working on this.
I agree that this feature might be very useful for some deployments.

I didn't read the code any carefully, just glanced at it.  But I have
a couple of high level comments:

1. I don't think that the test for the actual XPS mode implementation
   should be part of a system-dpdk testsuite, as it's not actually
   related to DPDK and doesn't require any system HW/ports/non-OVS
   applications running in order to test it.

   So, I think, that we should be able to do practically the same
   testing, but with dummy interfaces, with a test placed in the
   dpif-netdev.at or pmd.at (probably, latter).

   To achieve that you'll need per-queue stats in netdev-dummy, but
   implementation of these will be practically the same or even a
   bit simpler as you did for vhost-user ports.

   Per-queue stats for vhost-user ports might be good to have in
   general, so that patch along with some simple test in system-dpdk.at
   for them could be split from this patch set and sent separately.
   Or dropped, if you think they are not valuable (?).

2. The test itself doesn't need a packet generator script, AFAICT.
   You may just send some number of packets via dummy port changing
   source or destination udp/tcp port affecting the packet hash this
   way.  For example, see the SEND_TCP_BOND_PKTS macro in the
   tests/ofproto-dpif.at and how bonding rebalancing tests are using it.

3. Might be better instead of introduction of a specialized config
   knob (other_config:hxps=true), to have a multi-choice knob like
   other_config:xps-mode with 2 options 'default' and 'hash', where
   'default' is a current way of tx queue distribution and it will
   be a default value.  'hash' will be a new mode that uses packet
   hash to choose the tx queue (what's implemented in this patch set).

What do you think?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 1/4] netdev-dpdk: Introduce per rxq/txq Vhost-user statistics

2021-12-07 Thread Maxime Coquelin


Hi David,

On 12/7/21 21:37, David Marchand wrote:

Hey Maxime,

On Wed, Nov 24, 2021 at 10:24 PM Maxime Coquelin
 wrote:


HXPS feature will enable steering Tx packets on transmist


transmit*


queues based on their hashes. In order to test the feature,


"their hashes" is ambiguous.


s/their hashes/the packets hashes/ ?


it is needed to be able to get the per-queue statistics for
Vhost-user ports.


This is a general comment, for consistency, I'd write vhost, all lowercase.


Ok, I will make it consistent in next revision.





This patch introduces "bytes", "packets" and "error"
per-queue custom statistics for Vhost-user ports.

Suggested-by David Marchand 
Signed-off-by: Maxime Coquelin 
---
  lib/netdev-dpdk.c | 143 +++---
  1 file changed, 135 insertions(+), 8 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ca92c947a..e80d5b4ab 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -192,6 +192,13 @@ static const struct vhost_device_ops virtio_net_device_ops 
=
  .guest_notified = vhost_guest_notified,
  };

+/* Custome software per-queue stats for Vhost ports */


Custom*


+struct netdev_dpdk_vhost_q_stats {
+uint64_t bytes;
+uint64_t packets;
+uint64_t errors;
+};
+
  /* Custom software stats for dpdk ports */
  struct netdev_dpdk_sw_stats {
  /* No. of retries when unable to transmit. */
@@ -206,6 +213,10 @@ struct netdev_dpdk_sw_stats {
  uint64_t rx_qos_drops;
  /* Packet drops in HWOL processing. */
  uint64_t tx_invalid_hwol_drops;
+/* Per-queue Vhost Tx stats */
+struct netdev_dpdk_vhost_q_stats *txq;
+/* Per-queue Vhost Rx stats */
+struct netdev_dpdk_vhost_q_stats *rxq;


Here, we add "driver" specific stats, while netdev_dpdk_sw_stats
struct carries OVS "own" stats.
This netdev_dpdk_sw_stats struct is converted by
netdev_dpdk_get_sw_custom_stats and there is a small framework on
adding custom OVS stats (using some macros "trick").

I'd rather leave netdev_dpdk_sw_stats struct untouched for consistency.
Pointers to vhost specific stats can be added to the netdev_dpdk
struct (we have some spare space after the pointer to
netdev_dpdk_sw_stats).


That makes sense, it is indeed better to have it out of
netdev_dpdk_sw_stats struct.




  };

  enum dpdk_dev_type {
@@ -1276,6 +1287,13 @@ common_construct(struct netdev *netdev, dpdk_port_t 
port_no,
  dev->sw_stats = xzalloc(sizeof *dev->sw_stats);
  dev->sw_stats->tx_retries = (dev->type == DPDK_DEV_VHOST) ? 0 : 
UINT64_MAX;

+if (dev->type == DPDK_DEV_VHOST) {
+dev->sw_stats->txq = xcalloc(netdev->n_txq,
+ sizeof *dev->sw_stats->txq);
+dev->sw_stats->rxq = xcalloc(netdev->n_rxq,
+ sizeof *dev->sw_stats->rxq);
+}
+
  return 0;
  }

@@ -2353,17 +2371,21 @@ netdev_dpdk_vhost_update_rx_size_counters(struct 
netdev_stats *stats,
  }

  static inline void
-netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk *dev,
+netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk *dev, int qid,
   struct dp_packet **packets, int count,
   int qos_drops)
  {
  struct netdev_stats *stats = >stats;
+struct netdev_dpdk_vhost_q_stats *q_stats = >sw_stats->rxq[qid];


reverse xmas tree?


Will fix it here and elsewhere.




  struct dp_packet *packet;
  unsigned int packet_size;
  int i;

  stats->rx_packets += count;
+q_stats->packets += count;
  stats->rx_dropped += qos_drops;
+q_stats->errors += qos_drops;
+
  for (i = 0; i < count; i++) {
  packet = packets[i];
  packet_size = dp_packet_size(packet);
@@ -2374,6 +2396,7 @@ netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk 
*dev,
   * further processing. */
  stats->rx_errors++;
  stats->rx_length_errors++;
+q_stats->errors++;
  continue;
  }

@@ -2385,6 +2408,7 @@ netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk 
*dev,
  }

  stats->rx_bytes += packet_size;
+q_stats->bytes += packet_size;
  }

  if (OVS_UNLIKELY(qos_drops)) {
@@ -2437,7 +2461,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
  }

  rte_spinlock_lock(>stats_lock);
-netdev_dpdk_vhost_update_rx_counters(dev, batch->packets,
+netdev_dpdk_vhost_update_rx_counters(dev, qid, batch->packets,
   nb_rx, qos_drops);
  rte_spinlock_unlock(>stats_lock);

@@ -2551,7 +2575,7 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, 
struct rte_mbuf **pkts,
  }

  static inline void
-netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev,
+netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev, int qid,
   struct dp_packet **packets,
   int attempted,

Re: [ovs-dev] [PATCH 1/4] netdev-dpdk: Introduce per rxq/txq Vhost-user statistics

2021-12-07 Thread David Marchand

Hey Maxime,

On Wed, Nov 24, 2021 at 10:24 PM Maxime Coquelin
 wrote:
>
> HXPS feature will enable steering Tx packets on transmist

transmit*

> queues based on their hashes. In order to test the feature,

"their hashes" is ambiguous.

> it is needed to be able to get the per-queue statistics for
> Vhost-user ports.

This is a general comment, for consistency, I'd write vhost, all lowercase.


>
> This patch introduces "bytes", "packets" and "error"
> per-queue custom statistics for Vhost-user ports.
>
> Suggested-by David Marchand 
> Signed-off-by: Maxime Coquelin 
> ---
>  lib/netdev-dpdk.c | 143 +++---
>  1 file changed, 135 insertions(+), 8 deletions(-)
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index ca92c947a..e80d5b4ab 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -192,6 +192,13 @@ static const struct vhost_device_ops 
> virtio_net_device_ops =
>  .guest_notified = vhost_guest_notified,
>  };
>
> +/* Custome software per-queue stats for Vhost ports */

Custom*

> +struct netdev_dpdk_vhost_q_stats {
> +uint64_t bytes;
> +uint64_t packets;
> +uint64_t errors;
> +};
> +
>  /* Custom software stats for dpdk ports */
>  struct netdev_dpdk_sw_stats {
>  /* No. of retries when unable to transmit. */
> @@ -206,6 +213,10 @@ struct netdev_dpdk_sw_stats {
>  uint64_t rx_qos_drops;
>  /* Packet drops in HWOL processing. */
>  uint64_t tx_invalid_hwol_drops;
> +/* Per-queue Vhost Tx stats */
> +struct netdev_dpdk_vhost_q_stats *txq;
> +/* Per-queue Vhost Rx stats */
> +struct netdev_dpdk_vhost_q_stats *rxq;

Here, we add "driver" specific stats, while netdev_dpdk_sw_stats
struct carries OVS "own" stats.
This netdev_dpdk_sw_stats struct is converted by
netdev_dpdk_get_sw_custom_stats and there is a small framework on
adding custom OVS stats (using some macros "trick").

I'd rather leave netdev_dpdk_sw_stats struct untouched for consistency.
Pointers to vhost specific stats can be added to the netdev_dpdk
struct (we have some spare space after the pointer to
netdev_dpdk_sw_stats).


>  };
>
>  enum dpdk_dev_type {
> @@ -1276,6 +1287,13 @@ common_construct(struct netdev *netdev, dpdk_port_t 
> port_no,
>  dev->sw_stats = xzalloc(sizeof *dev->sw_stats);
>  dev->sw_stats->tx_retries = (dev->type == DPDK_DEV_VHOST) ? 0 : 
> UINT64_MAX;
>
> +if (dev->type == DPDK_DEV_VHOST) {
> +dev->sw_stats->txq = xcalloc(netdev->n_txq,
> + sizeof *dev->sw_stats->txq);
> +dev->sw_stats->rxq = xcalloc(netdev->n_rxq,
> + sizeof *dev->sw_stats->rxq);
> +}
> +
>  return 0;
>  }
>
> @@ -2353,17 +2371,21 @@ netdev_dpdk_vhost_update_rx_size_counters(struct 
> netdev_stats *stats,
>  }
>
>  static inline void
> -netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk *dev,
> +netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk *dev, int qid,
>   struct dp_packet **packets, int count,
>   int qos_drops)
>  {
>  struct netdev_stats *stats = >stats;
> +struct netdev_dpdk_vhost_q_stats *q_stats = >sw_stats->rxq[qid];

reverse xmas tree?


>  struct dp_packet *packet;
>  unsigned int packet_size;
>  int i;
>
>  stats->rx_packets += count;
> +q_stats->packets += count;
>  stats->rx_dropped += qos_drops;
> +q_stats->errors += qos_drops;
> +
>  for (i = 0; i < count; i++) {
>  packet = packets[i];
>  packet_size = dp_packet_size(packet);
> @@ -2374,6 +2396,7 @@ netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk 
> *dev,
>   * further processing. */
>  stats->rx_errors++;
>  stats->rx_length_errors++;
> +q_stats->errors++;
>  continue;
>  }
>
> @@ -2385,6 +2408,7 @@ netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk 
> *dev,
>  }
>
>  stats->rx_bytes += packet_size;
> +q_stats->bytes += packet_size;
>  }
>
>  if (OVS_UNLIKELY(qos_drops)) {
> @@ -2437,7 +2461,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
>  }
>
>  rte_spinlock_lock(>stats_lock);
> -netdev_dpdk_vhost_update_rx_counters(dev, batch->packets,
> +netdev_dpdk_vhost_update_rx_counters(dev, qid, batch->packets,
>   nb_rx, qos_drops);
>  rte_spinlock_unlock(>stats_lock);
>
> @@ -2551,7 +2575,7 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, 
> struct rte_mbuf **pkts,
>  }
>
>  static inline void
> -netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev,
> +netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev, int qid,
>   struct dp_packet **packets,
>   int attempted,
>   struct netdev_dpdk_sw_stats 
> *sw_stats_add)
> @@ -2561,14

[ovs-dev] [PATCH v4] tests: Fix endianness in netlink policy test fixtures.

2021-12-07 Thread Frode Nordahl

The netlink policy unit test contains test fixture data that is
subject to endianness and currently fails on big endian systems.

Store the fixture data in a struct to ensure proper byte order for
the header data.

Also fix improper style for sizeof with expressions.

Fixes: bfee9f6c0115 ("netlink: Add support for parsing link layer address.")
Signed-off-by: Frode Nordahl 
---
 tests/test-netlink-policy.c | 67 +++--
 1 file changed, 42 insertions(+), 25 deletions(-)

diff --git a/tests/test-netlink-policy.c b/tests/test-netlink-policy.c
index 5f2bf7101..55083935a 100644
--- a/tests/test-netlink-policy.c
+++ b/tests/test-netlink-policy.c
@@ -24,6 +24,11 @@
 #include "ovstest.h"
 #include "util.h"
 
+struct nlattr_fixture {
+struct nlattr nlattr;
+uint8_t data[32];
+};
+
 /* nla_len is an inline function in the kernel net/netlink header, which we
  * don't necessarilly have at build time, so provide our own with
  * non-conflicting name. */
@@ -32,66 +37,78 @@ _nla_len(const struct nlattr *nla) {
 return nla->nla_len - NLA_HDRLEN;
 }
 
+#define TEST_POLICY_ATTR 42
+
 static void
 test_nl_policy_parse_ll_addr(struct ovs_cmdl_context *ctx OVS_UNUSED) {
 struct nl_policy policy[] = {
-[42] = { .type = NL_A_LL_ADDR,
- .optional = false, },
+[TEST_POLICY_ATTR] = { .type = NL_A_LL_ADDR,
+   .optional = false, },
 };
 struct nlattr *attrs[ARRAY_SIZE(policy)];
-uint8_t fixture_nl_data_policy_short[] = {
+struct nlattr_fixture fixture_nl_data_policy_short = {
 /* too short according to policy */
-0x04, 0x00, 0x2a, 0x00,
+.nlattr.nla_len = 5,
+.nlattr.nla_type = TEST_POLICY_ATTR,
+.data = { 0x00 },
 };
-uint8_t fixture_nl_data_policy_long[] = {
+struct nlattr_fixture fixture_nl_data_policy_long = {
 /* too long according to policy */
-0x19, 0x00, 0x2a, 0x00, 0x00, 0x00, 0x67, 0xfe, 0x80, 0x00, 0x00, 0x00,
-0x00, 0x00, 0x00, 0xe4, 0x1d, 0x2d, 0x03, 0x00, 0xa5, 0xf0, 0x2f, 0x00,
-0x00,
+.nlattr.nla_len = 25,
+.nlattr.nla_type = TEST_POLICY_ATTR,
+.data = { 0x00, 0x00, 0x67, 0xfe, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0xe4, 0x1d, 0x2d, 0x03, 0x00, 0xa5, 0xf0, 0x2f, 0x00,
+  0x00 },
 };
-uint8_t fixture_nl_data_eth[] = {
+struct nlattr_fixture fixture_nl_data_eth = {
 /* valid policy and eth_addr length */
-0x0a, 0x00, 0x2a, 0x00, 0x00, 0x53, 0x00, 0x00, 0x00, 0x2a,
+.nlattr.nla_len = 10,
+.nlattr.nla_type = TEST_POLICY_ATTR,
+.data = { 0x00, 0x53, 0x00, 0x00, 0x00, 0x2a },
 };
-uint8_t fixture_nl_data_ib[] = {
+struct nlattr_fixture fixture_nl_data_ib = {
 /* valid policy and ib_addr length */
-0x18, 0x00, 0x2a, 0x00, 0x00, 0x00, 0x00, 0x67, 0xfe, 0x80, 0x00, 0x00,
-0x00, 0x00, 0x00, 0x00, 0xe4, 0x1d, 0x2d, 0x03, 0x00, 0xa5, 0xf0, 0x2f,
+.nlattr.nla_len = 24,
+.nlattr.nla_type = TEST_POLICY_ATTR,
+.data = { 0x00, 0x00, 0x00, 0x67, 0xfe, 0x80, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0xe4, 0x1d, 0x2d, 0x03, 0x00, 0xa5, 0xf0, 0x2f },
 };
-uint8_t fixture_nl_data_invalid[] = {
+struct nlattr_fixture fixture_nl_data_invalid = {
 /* valid policy but data neither eth_addr nor ib_addr */
-0x0b, 0x00, 0x2a, 0x00, 0x00, 0x53, 0x00, 0x00, 0x00, 0x2a, 0x00,
+.nlattr.nla_len = 11,
+.nlattr.nla_type = TEST_POLICY_ATTR,
+.data = { 0x00, 0x53, 0x00, 0x00, 0x00, 0x2a, 0x00 },
 };
 struct ofpbuf *buf;
 
 /* confirm policy fails with too short data */
 buf = ofpbuf_clone_data(_nl_data_policy_short,
-sizeof(fixture_nl_data_policy_short));
+fixture_nl_data_policy_short.nlattr.nla_len);
 ovs_assert(!nl_policy_parse(buf, 0, policy, attrs, ARRAY_SIZE(policy)));
 ofpbuf_delete(buf);
-memset(, 0, sizeof(*attrs));
+memset(, 0, sizeof *attrs);
 
 /* confirm policy fails with too long data */
 buf = ofpbuf_clone_data(_nl_data_policy_long,
-sizeof(fixture_nl_data_policy_long));
+fixture_nl_data_policy_long.nlattr.nla_len);
 ovs_assert(!nl_policy_parse(buf, 0, policy, attrs, ARRAY_SIZE(policy)));
 ofpbuf_delete(buf);
-memset(, 0, sizeof(*attrs));
+memset(, 0, sizeof *attrs);
 
 /* confirm policy passes and interpret valid ethernet lladdr */
 buf = ofpbuf_clone_data(_nl_data_eth,
-sizeof(fixture_nl_data_eth));
+fixture_nl_data_eth.nlattr.nla_len);
 ovs_assert(nl_policy_parse(buf, 0, policy, attrs, ARRAY_SIZE(policy)));
 ovs_assert((_nla_len(attrs[42]) == sizeof(struct eth_addr)));
 struct eth_addr eth_expect = ETH_ADDR_C(00,53,00,00,00,2a);
 struct eth_addr

Re: [ovs-dev] [PATCH v3] tests: Fix endianness in netlink policy test fixtures.

2021-12-07 Thread Frode Nordahl

On Tue, Dec 7, 2021 at 8:01 AM Frode Nordahl
 wrote:
>
> The netlink policy unit test contains test fixture data that is
> subject to endianness and currently fails on big endian systems.
>
> Store the fixture data in a struct to ensure proper byte order for
> the header data.
>
> Also fix improper style for sizeof with expressions.
>
> Fixes: bfee9f6c0115 ("netlink: Add support for parsing link layer address.")
> Signed-off-by: Frode Nordahl 
> ---
>  tests/test-netlink-policy.c | 75 -
>  1 file changed, 50 insertions(+), 25 deletions(-)
>
> diff --git a/tests/test-netlink-policy.c b/tests/test-netlink-policy.c
> index 5f2bf7101..e908d3100 100644
> --- a/tests/test-netlink-policy.c
> +++ b/tests/test-netlink-policy.c
> @@ -24,6 +24,12 @@
>  #include "ovstest.h"
>  #include "util.h"
>
> +struct nlattr_fixture {
> +struct nlattr nlattr;
> +uint8_t data[32];
> +size_t data_len;

When I wrote this I had a very good use case for handling the whole of
the nlattr and data size separately in my head, but when revisiting it
becomes clear that the unit tests currently present in this file are
focusing on testing OVS netlink library's ability to parse a single
data type.  We will not be testing the OVS netlink library's ability
to detect and handle unexpected or malformed data being transmitted
over the netlink socket, so let's scrap it as there is no real need
for it.

As a consequence a v4 will follow, sorry about the noise :-)

-- 
Frode Nordahl

> +};
> +
>  /* nla_len is an inline function in the kernel net/netlink header, which we
>   * don't necessarilly have at build time, so provide our own with
>   * non-conflicting name. */
> @@ -32,66 +38,85 @@ _nla_len(const struct nlattr *nla) {
>  return nla->nla_len - NLA_HDRLEN;
>  }
>
> +#define TEST_POLICY_ATTR 42
> +
>  static void
>  test_nl_policy_parse_ll_addr(struct ovs_cmdl_context *ctx OVS_UNUSED) {
>  struct nl_policy policy[] = {
> -[42] = { .type = NL_A_LL_ADDR,
> - .optional = false, },
> +[TEST_POLICY_ATTR] = { .type = NL_A_LL_ADDR,
> +   .optional = false, },
>  };
>  struct nlattr *attrs[ARRAY_SIZE(policy)];
> -uint8_t fixture_nl_data_policy_short[] = {
> +struct nlattr_fixture fixture_nl_data_policy_short = {
>  /* too short according to policy */
> -0x04, 0x00, 0x2a, 0x00,
> +.nlattr.nla_len = 4,
> +.nlattr.nla_type = TEST_POLICY_ATTR,
> +.data = { 0x00 },
> +.data_len = 1,
>  };
> -uint8_t fixture_nl_data_policy_long[] = {
> +struct nlattr_fixture fixture_nl_data_policy_long = {
>  /* too long according to policy */
> -0x19, 0x00, 0x2a, 0x00, 0x00, 0x00, 0x67, 0xfe, 0x80, 0x00, 0x00, 
> 0x00,
> -0x00, 0x00, 0x00, 0xe4, 0x1d, 0x2d, 0x03, 0x00, 0xa5, 0xf0, 0x2f, 
> 0x00,
> -0x00,
> +.nlattr.nla_len = 25,
> +.nlattr.nla_type = TEST_POLICY_ATTR,
> +.data = { 0x00, 0x00, 0x67, 0xfe, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
> +  0x00, 0xe4, 0x1d, 0x2d, 0x03, 0x00, 0xa5, 0xf0, 0x2f, 0x00,
> +  0x00 },
> +.data_len = 21,
>  };
> -uint8_t fixture_nl_data_eth[] = {
> +struct nlattr_fixture fixture_nl_data_eth = {
>  /* valid policy and eth_addr length */
> -0x0a, 0x00, 0x2a, 0x00, 0x00, 0x53, 0x00, 0x00, 0x00, 0x2a,
> +.nlattr.nla_len = 10,
> +.nlattr.nla_type = TEST_POLICY_ATTR,
> +.data = { 0x00, 0x53, 0x00, 0x00, 0x00, 0x2a },
> +.data_len = 6,
>  };
> -uint8_t fixture_nl_data_ib[] = {
> +struct nlattr_fixture fixture_nl_data_ib = {
>  /* valid policy and ib_addr length */
> -0x18, 0x00, 0x2a, 0x00, 0x00, 0x00, 0x00, 0x67, 0xfe, 0x80, 0x00, 
> 0x00,
> -0x00, 0x00, 0x00, 0x00, 0xe4, 0x1d, 0x2d, 0x03, 0x00, 0xa5, 0xf0, 
> 0x2f,
> +.nlattr.nla_len = 24,
> +.nlattr.nla_type = TEST_POLICY_ATTR,
> +.data = { 0x00, 0x00, 0x00, 0x67, 0xfe, 0x80, 0x00, 0x00, 0x00, 0x00,
> +  0x00, 0x00, 0xe4, 0x1d, 0x2d, 0x03, 0x00, 0xa5, 0xf0, 0x2f 
> },
> +.data_len = 20,
>  };
> -uint8_t fixture_nl_data_invalid[] = {
> +struct nlattr_fixture fixture_nl_data_invalid = {
>  /* valid policy but data neither eth_addr nor ib_addr */
> -0x0b, 0x00, 0x2a, 0x00, 0x00, 0x53, 0x00, 0x00, 0x00, 0x2a, 0x00,
> +.nlattr.nla_len = 11,
> +.nlattr.nla_type = TEST_POLICY_ATTR,
> +.data = { 0x00, 0x53, 0x00, 0x00, 0x00, 0x2a, 0x00 },
> +.data_len = 7,
>  };
>  struct ofpbuf *buf;
>
>  /* confirm policy fails with too short data */
>  buf = ofpbuf_clone_data(_nl_data_policy_short,
> -sizeof(fixture_nl_data_policy_short));
> +NLA_HDRLEN +
> +fixture_nl_data_policy_short.data_len);
>  ovs_assert(!nl_policy_parse(buf,

[ovs-dev] [[PATCH RFC] 17/17] Enable TSO if available.

2021-12-07 Thread Flavio Leitner

Now that there is a segmentation in software as a fall back in
case a netdev doesn't support TCP segmentation offloading (TSO),
enable it by default on all possible netdevs.

This patch showcase the idea, but it can't really be applied
because it doesn't support encapsulated packets yet. Either it
would have to enable that support first or provide a switch to
turn on/off globally depending on the use case.

This patch is good to also measure performance with P2P and PVP
and check if there are regressions before continue the work.

The encapsulated traffic is challenging because DPDK ports
require pointers to inner headers [1] and OVS doesn't support
them at the moment. We could store the pointers when the packet
is encapsulated, but then any further change in the packet
headers may or may not cause the inner pointers to change too.

Another requirement not present here is the control of the
features (csum, TSO) per port. That can be done, but for
example if a vhost-user port has TSO turned off, then the
software segmentation is used. Currently that allocates
packets from normal memory, so DPDK would have to copy
(dpdk_do_tx_copy) each packet to send out on another DPDK port.

[1]
https://doc.dpdk.org/guides/prog_guide/mbuf_lib.html#meta-information

Signed-off-by: Flavio Leitner 
---
 Documentation/topics/userspace-tso.rst |  12 --
 lib/automake.mk|   2 -
 lib/netdev-dpdk.c  |  56 +++--
 lib/netdev-linux.c | 155 -
 lib/netdev.c   |   4 +-
 lib/userspace-tso.c|  48 
 lib/userspace-tso.h|  23 
 vswitchd/bridge.c  |   2 -
 vswitchd/vswitch.xml   |  20 
 9 files changed, 68 insertions(+), 254 deletions(-)
 delete mode 100644 lib/userspace-tso.c
 delete mode 100644 lib/userspace-tso.h

diff --git a/Documentation/topics/userspace-tso.rst 
b/Documentation/topics/userspace-tso.rst
index bd64e7ed3..a574ae9e3 100644
--- a/Documentation/topics/userspace-tso.rst
+++ b/Documentation/topics/userspace-tso.rst
@@ -27,8 +27,6 @@
 Userspace Datapath - TSO
 
 
-**Note:** This feature is considered experimental.
-
 TCP Segmentation Offload (TSO) enables a network stack to delegate segmentation
 of an oversized TCP segment to the underlying physical NIC. Offload of frame
 segmentation achieves computational savings in the core, freeing up CPU cycles
@@ -51,16 +49,6 @@ __ https://doc.dpdk.org/guides-20.11/nics/overview.html
 Enabling TSO
 
 
-The TSO support may be enabled via a global config value
-``userspace-tso-enable``.  Setting this to ``true`` enables TSO support for
-all ports.::
-
-$ ovs-vsctl set Open_vSwitch . other_config:userspace-tso-enable=true
-
-The default value is ``false``.
-
-Changing ``userspace-tso-enable`` requires restarting the daemon.
-
 When using :doc:`vHost User ports `, TSO may be enabled
 as follows.
 
diff --git a/lib/automake.mk b/lib/automake.mk
index 2ca94e13c..f11c10d9a 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -363,8 +363,6 @@ lib_libopenvswitch_la_SOURCES = \
lib/unicode.h \
lib/unixctl.c \
lib/unixctl.h \
-   lib/userspace-tso.c \
-   lib/userspace-tso.h \
lib/util.c \
lib/util.h \
lib/uuid.c \
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 0d370bda3..1f7443028 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -65,7 +65,6 @@
 #include "timeval.h"
 #include "unaligned.h"
 #include "unixctl.h"
-#include "userspace-tso.h"
 #include "util.h"
 #include "uuid.h"
 
@@ -1180,16 +1179,13 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 dev->hw_ol_features &= ~NETDEV_TX_SCTP_CKSUM_OFFLOAD;
 }
 
-dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
-if (userspace_tso_enabled()) {
-if (info.tx_offload_capa & DEV_TX_OFFLOAD_TCP_TSO) {
-dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
-} else {
-VLOG_WARN("%s: Tx TSO offload is not supported.",
-  netdev_get_name(>up));
-}
+if (info.tx_offload_capa & DEV_TX_OFFLOAD_TCP_TSO) {
+dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
+} else {
+dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD;
 }
 
+
 n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
 n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
 
@@ -1419,16 +1415,13 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
 goto out;
 }
 
-if (!userspace_tso_enabled()) {
-err = rte_vhost_driver_disable_features(dev->vhost_id,
-1ULL << VIRTIO_NET_F_HOST_TSO4
-| 1ULL << VIRTIO_NET_F_HOST_TSO6
-| 1ULL << VIRTIO_NET_F_CSUM);
-if (err) {
-VLOG_ERR("rte_vhost_driver_disable_features failed for vhost user "
- "port:

[ovs-dev] [[PATCH RFC] 16/17] Add Generic Segmentation Offloading.

2021-12-07 Thread Flavio Leitner

This provides a software implementation in the case
the egress netdev doesn't support segmentation in hardware.

This is an _untested_ patch to showcase the proposed solution.

The challenge here is to guarantee packet ordering in the
original batch that may be full of TSO packets. Each TSO
packet can go up to ~64kB, so with segment size of 1440
that means about 44 packets for each TSO. Each batch has
32 packets, so the total batch amounts to 1408 normal
packets.

The segmentation estimates the total number of packets
and then the total number of batches. Then allocate
enough memory and finally do the work.

Finally each batch is sent in order to the netdev.

Signed-off-by: Flavio Leitner 
---
 lib/automake.mk |   2 +
 lib/dp-packet-gso.c | 153 
 lib/dp-packet-gso.h |  24 +++
 lib/dp-packet.h |   7 ++
 lib/netdev.c| 122 +--
 5 files changed, 259 insertions(+), 49 deletions(-)
 create mode 100644 lib/dp-packet-gso.c
 create mode 100644 lib/dp-packet-gso.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 46f869a33..2ca94e13c 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -107,6 +107,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/dpctl.h \
lib/dp-packet.h \
lib/dp-packet.c \
+   lib/dp-packet-gso.c \
+   lib/dp-packet-gso.h \
lib/dpdk.h \
lib/dpif-netdev-extract-study.c \
lib/dpif-netdev-lookup.h \
diff --git a/lib/dp-packet-gso.c b/lib/dp-packet-gso.c
new file mode 100644
index 0..fcc35b100
--- /dev/null
+++ b/lib/dp-packet-gso.c
@@ -0,0 +1,153 @@
+/*
+ * Copyright (c) 2021 Red Hat, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include 
+#include 
+
+#include "coverage.h"
+#include "dp-packet.h"
+#include "dp-packet-gso.h"
+#include "netdev-provider.h"
+
+COVERAGE_DEFINE(soft_seg_good);
+
+/* Retuns a new packet that is a segment of packet 'p'.
+ *
+ * The new packet is initialized with 'hdr_len' bytes from the
+ * start of packet 'p' and then appended with 'data_len' bytes
+ * from the 'data' buffer.
+ *
+ * Note: The packet headers are not updated. */
+static struct dp_packet *
+dp_packet_gso_seg_new(const struct dp_packet *p, size_t hdr_len,
+  const char *data, size_t data_len)
+{
+struct dp_packet *seg = dp_packet_new_with_headroom(hdr_len + data_len,
+dp_packet_headroom(p));
+
+/* Append the original packet headers and then the payload. */
+dp_packet_put(seg, dp_packet_data(p), hdr_len);
+dp_packet_put(seg, data, data_len);
+
+/* The new segment should have the same offsets. */
+seg->l2_5_ofs = p->l2_5_ofs;
+seg->l3_ofs = p->l3_ofs;
+seg->l4_ofs = p->l4_ofs;
+
+/* The protocol headers remain the same, so preserve hash and mark. */
+*dp_packet_rss_ptr(seg) = dp_packet_get_rss_hash(p);
+*dp_packet_flow_mark_ptr(seg) = *dp_packet_flow_mark_ptr(p);
+
+/* The segment should inherit all the offloading flags from the
+ * original packet, except for the TCP segmentation flag. */
+*dp_packet_ol_flags_ptr(seg) =  *dp_packet_ol_flags_ptr(p);
+dp_packet_ol_reset_tcp_seg(seg);
+
+return seg;
+}
+
+/* Returns the calculated number of TCP segments in packet 'p'. */
+int
+dp_packet_gso_nr_segs(struct dp_packet *p)
+{
+uint16_t segsz = dp_packet_get_tso_segsz(p);
+const char *data_tail;
+const char *data_pos;
+int n_segs;
+
+data_pos = dp_packet_get_tcp_payload(p);
+data_tail = (char *) dp_packet_tail(p) - dp_packet_l2_pad_size(p);
+data_pos = dp_packet_get_tcp_payload(p);
+n_segs = DIV_ROUND_UP((data_tail - data_pos), segsz);
+
+return n_segs;
+
+}
+
+/* Perform software segmentation on packet 'p'.
+ *
+ * Returns all the segments added to the array of preallocated
+ * batches in 'batches' starting at batch position 'batch_pos'. */
+void
+dp_packet_gso(struct dp_packet *p, struct dp_packet_batch *batches,
+  size_t *batch_pos)
+{
+struct tcp_header *tcp_hdr;
+struct ip_header *ip_hdr;
+struct dp_packet *seg;
+uint32_t tcp_seq;
+uint16_t ip_id;
+int hdr_len;
+
+tcp_hdr = dp_packet_l4(p);
+tcp_seq = ntohl(get_16aligned_be32(_hdr->tcp_seq));
+hdr_len = ((char *)dp_packet_l4(p) - (char *)dp_packet_eth(p))
+  + TCP_OFFSET(tcp_hdr->tcp_ctl) * 4;
+ip_id = 0;
+if

[ovs-dev] [[PATCH RFC] 15/17] Respect tso/gso segment size.

2021-12-07 Thread Flavio Leitner

Currently OVS will calculate the segment size based on the
MTU of the egress port. That usually happens to be correct
when the ports share the same MTU, but that is not always true.

Therefore, if the segment size is provided, then use that and
make sure the over sized packets are dropped.

Signed-off-by: Flavio Leitner 
---
 lib/dp-packet.c|  1 +
 lib/dp-packet.h| 27 
 lib/netdev-dpdk.c  | 13 ++--
 lib/netdev-linux.c | 78 +++---
 4 files changed, 98 insertions(+), 21 deletions(-)

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 8a1bf221a..0cfc295b1 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -34,6 +34,7 @@ dp_packet_init__(struct dp_packet *p, size_t allocated, enum 
dp_packet_source so
 pkt_metadata_init(>md, 0);
 dp_packet_reset_cutlen(p);
 dp_packet_ol_reset(p);
+dp_packet_set_tso_segsz(p, 0);
 /* Initialize implementation-specific fields of dp_packet. */
 dp_packet_init_specific(p);
 /* By default assume the packet type to be Ethernet. */
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 51f98ab9a..27529ca87 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -124,6 +124,7 @@ struct dp_packet {
 uint32_t ol_flags;  /* Offloading flags. */
 uint32_t rss_hash;  /* Packet hash. */
 uint32_t flow_mark; /* Packet flow mark. */
+uint16_t tso_segsz;  /* TCP TSO segment size. */
 #endif
 enum dp_packet_source source;  /* Source of memory allocated as 'base'. */
 
@@ -164,6 +165,9 @@ static inline void dp_packet_set_size(struct dp_packet *, 
uint32_t);
 static inline uint16_t dp_packet_get_allocated(const struct dp_packet *);
 static inline void dp_packet_set_allocated(struct dp_packet *, uint16_t);
 
+static inline uint16_t dp_packet_get_tso_segsz(const struct dp_packet *);
+static inline void dp_packet_set_tso_segsz(struct dp_packet *, uint16_t);
+
 void *dp_packet_resize_l2(struct dp_packet *, int increment);
 void *dp_packet_resize_l2_5(struct dp_packet *, int increment);
 static inline void *dp_packet_eth(const struct dp_packet *);
@@ -635,6 +639,18 @@ dp_packet_set_allocated(struct dp_packet *p, uint16_t s)
 p->mbuf.buf_len = s;
 }
 
+static inline uint16_t
+dp_packet_get_tso_segsz(const struct dp_packet *p)
+{
+return p->mbuf.tso_segsz;
+}
+
+static inline void
+dp_packet_set_tso_segsz(struct dp_packet *p, uint16_t s)
+{
+p->mbuf.tso_segsz = s;
+}
+
 #else /* DPDK_NETDEV */
 
 static inline void
@@ -691,6 +707,17 @@ dp_packet_set_allocated(struct dp_packet *p, uint16_t s)
 p->allocated_ = s;
 }
 
+static inline uint16_t
+dp_packet_get_tso_segsz(const struct dp_packet *p)
+{
+return p->tso_segsz;
+}
+
+static inline void
+dp_packet_set_tso_segsz(struct dp_packet *p, uint16_t s)
+{
+p->tso_segsz = s;
+}
 #endif /* DPDK_NETDEV */
 
 static inline void
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index c7e09b973..0d370bda3 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -,6 +,7 @@ netdev_dpdk_prep_ol_packet(struct netdev_dpdk *dev, 
struct rte_mbuf *mbuf)
 
 if (mbuf->ol_flags & PKT_TX_TCP_SEG) {
 struct tcp_header *th = dp_packet_l4(pkt);
+int hdr_len;
 
 if (!th) {
 VLOG_WARN_RL(, "%s: TCP Segmentation without L4 header"
@@ -2231,7 +2232,14 @@ netdev_dpdk_prep_ol_packet(struct netdev_dpdk *dev, 
struct rte_mbuf *mbuf)
 
 mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
 mbuf->ol_flags |= PKT_TX_TCP_CKSUM;
-mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
+hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
+if (OVS_UNLIKELY((hdr_len + mbuf->tso_segsz) > dev->max_packet_len)) {
+VLOG_WARN_RL(, "%s: Oversized TSO packet. "
+ "hdr: %"PRIu32", gso: %"PRIu32", max len: %"PRIu32"",
+ dev->up.name, hdr_len, mbuf->tso_segsz,
+ dev->max_packet_len);
+return false;
+}
 
 if (mbuf->ol_flags & PKT_TX_IPV4) {
 mbuf->ol_flags |= PKT_TX_IP_CKSUM;
@@ -2597,7 +2605,8 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, 
struct rte_mbuf **pkts,
 int cnt = 0;
 struct rte_mbuf *pkt;
 
-/* Filter oversized packets, unless are marked for TSO. */
+/* Filter oversized packets. The TSO packets are filtered out
+ * during the offloading preparation for performance reasons. */
 for (i = 0; i < pkt_cnt; i++) {
 pkt = pkts[i];
 if (OVS_UNLIKELY((pkt->pkt_len > dev->max_packet_len)
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 48a3cf7d7..8a6f4592b 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -523,7 +523,7 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 
20);
 static atomic_count miimon_cnt = ATOMIC_COUNT_INIT(0);
 
 static int netdev_linux_parse_vnet_hdr(struct dp_packet *b);
-static void netdev_linux_prepend_vnet_hdr(struct

[ovs-dev] [[PATCH RFC] 14/17] Enable L4 csum offloading by default.

2021-12-07 Thread Flavio Leitner

The netdev receiving packets is supposed to provide the flags
indicating if the L4 csum was verified and it is OK or BAD,
otherwise the stack will check when appropriate by software.

If the packet comes with good checksum, then postpone the
checksum calculation to the egress device if needed.

When encapsulate a packet with that flag, set the checksum
of the inner L4 header since that is not yet supported.

Calculate the L4 csum when the packet is going to be sent over
a device that doesn't support the feature.

Linux tap devices allows enabling L3 and L4 offload, so this
patch enables the feature. However, Linux socket interface
remains disabled because the API doesn't allow enabling
those those features without enabling TSO too.

Signed-off-by: Flavio Leitner 
---
 lib/conntrack.c |  16 +--
 lib/dp-packet.c |  23 +++-
 lib/dp-packet.h |  56 
 lib/flow.c  |  21 +++
 lib/netdev-dpdk.c   | 157 ++---
 lib/netdev-linux.c  | 295 +---
 lib/netdev-native-tnl.c |  32 +
 lib/netdev.c|  40 ++
 lib/packets.c   | 174 +++-
 lib/packets.h   |   3 +
 10 files changed, 527 insertions(+), 290 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 5b4ca4dfc..c12b03538 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2103,14 +2103,10 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 }
 
 if (ok) {
-bool hwol_bad_l4_csum = dp_packet_ol_l4_csum_bad(pkt);
-if (!hwol_bad_l4_csum) {
-bool  hwol_good_l4_csum = dp_packet_ol_l4_csum_good(pkt)
-  || dp_packet_ol_tx_l4_csum(pkt);
-/* Validate the checksum only when hwol is not supported. */
+if (!dp_packet_ol_l4_csum_bad(pkt)) {
 if (extract_l4(>key, l4, dp_packet_l4_size(pkt),
-   >icmp_related, l3, !hwol_good_l4_csum,
-   NULL)) {
+   >icmp_related, l3,
+   !dp_packet_ol_l4_csum_good(pkt), NULL)) {
 ctx->hash = conn_key_hash(>key, ct->hash_basis);
 return true;
 }
@@ -3421,8 +3417,10 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
conn_lookup_ctx *ctx,
 adj_seqnum(>tcp_seq, ec->seq_skew);
 }
 
-th->tcp_csum = 0;
-if (!dp_packet_ol_tx_l4_csum(pkt)) {
+if (dp_packet_ol_tx_tcp_csum(pkt)) {
+dp_packet_ol_reset_l4_csum_good(pkt);
+} else {
+th->tcp_csum = 0;
 if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
 th->tcp_csum = packet_csum_upperlayer6(nh6, th, ctx->key.nw_proto,
dp_packet_l4_size(pkt));
diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 369f3561e..8a1bf221a 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -38,6 +38,9 @@ dp_packet_init__(struct dp_packet *p, size_t allocated, enum 
dp_packet_source so
 dp_packet_init_specific(p);
 /* By default assume the packet type to be Ethernet. */
 p->packet_type = htonl(PT_ETH);
+/* Reset csum start and offset. */
+p->csum_start = 0;
+p->csum_offset = 0;
 }
 
 static void
@@ -188,7 +191,7 @@ dp_packet_clone_with_headroom(const struct dp_packet *p, 
size_t headroom)
 dp_packet_size(p),
 headroom);
 /* Copy the following fields into the returned buffer: l2_pad_size,
- * l2_5_ofs, l3_ofs, l4_ofs, cutlen, packet_type and md. */
+ * l2_5_ofs, l3_ofs, ..., cutlen, packet_type and md. */
 memcpy(_buffer->l2_pad_size, >l2_pad_size,
 sizeof(struct dp_packet) -
 offsetof(struct dp_packet, l2_pad_size));
@@ -517,4 +520,22 @@ dp_packet_ol_send_prepare(struct dp_packet *p, const 
uint64_t flags) {
 dp_packet_ip_set_header_csum(p);
 dp_packet_ol_set_ip_csum_good(p);
 }
+
+if (dp_packet_ol_l4_csum_good(p) || !dp_packet_ol_tx_l4_csum(p)) {
+return;
+}
+
+if (dp_packet_ol_tx_tcp_csum(p)
+&& !(flags & NETDEV_OFFLOAD_TX_TCP_CSUM)) {
+packet_tcp_complete_csum(p);
+dp_packet_ol_set_l4_csum_good(p);
+} else if (dp_packet_ol_tx_udp_csum(p)
+&& !(flags & NETDEV_OFFLOAD_TX_UDP_CSUM)) {
+packet_udp_complete_csum(p);
+dp_packet_ol_set_l4_csum_good(p);
+} else if (!(flags & NETDEV_OFFLOAD_TX_SCTP_CSUM)
+&& dp_packet_ol_tx_sctp_csum(p)) {
+packet_sctp_complete_csum(p);
+dp_packet_ol_set_l4_csum_good(p);
+}
 }
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 278be172e..51f98ab9a 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -138,6 +138,8 @@ struct dp_packet {
   or UINT16_MAX. */
 uint32_t cutlen;   /* length in bytes to cut from the end. */
 ovs_be32

[ovs-dev] [[PATCH RFC] 13/17] Enable IP checksum offloading by default.

2021-12-07 Thread Flavio Leitner

The netdev receiving packets is supposed to provide the flags
indicating if the IP csum was verified and it is OK or BAD,
otherwise the stack will check when appropriate by software.

If the packet comes with good checksum, then postpone the
checksum calculation to the egress device if needed.

When encapsulate a packet with that flag, set the checksum
of the inner IP header since that is not yet supported.

Calculate the IP csum when the packet is going to be sent over
a device that doesn't support the feature.

Linux devices don't support IP csum offload alone, so the
support is not enabled.

Signed-off-by: Flavio Leitner 
---
 lib/conntrack.c | 12 ++---
 lib/dp-packet.c | 12 +
 lib/dp-packet.h | 63 ---
 lib/dpif.h  |  2 +-
 lib/flow.c  | 16 --
 lib/ipf.c   |  9 ++--
 lib/netdev-dpdk.c   | 78 ++--
 lib/netdev-dummy.c  | 21 
 lib/netdev-native-tnl.c | 19 +--
 lib/netdev.c| 22 
 lib/odp-execute.c   | 21 ++--
 lib/packets.c   | 34 ++---
 ofproto/ofproto-dpif-upcall.c   | 14 +++--
 tests/automake.mk   |  1 +
 tests/system-userspace-offload.at   | 79 +
 tests/system-userspace-testsuite.at |  1 +
 16 files changed, 322 insertions(+), 82 deletions(-)
 create mode 100644 tests/system-userspace-offload.at

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 2392a2ea4..5b4ca4dfc 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2089,16 +2089,12 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 ctx->key.dl_type = dl_type;
 
 if (ctx->key.dl_type == htons(ETH_TYPE_IP)) {
-bool hwol_bad_l3_csum = dp_packet_ol_ip_csum_bad(pkt);
-if (hwol_bad_l3_csum) {
+if (dp_packet_ol_ip_csum_bad(pkt)) {
 ok = false;
 COVERAGE_INC(conntrack_l3csum_err);
 } else {
-bool hwol_good_l3_csum = dp_packet_ol_ip_csum_good(pkt)
- || dp_packet_ol_tx_ipv4(pkt);
-/* Validate the checksum only when hwol is not supported. */
 ok = extract_l3_ipv4(>key, l3, dp_packet_l3_size(pkt), NULL,
- !hwol_good_l3_csum);
+ !dp_packet_ol_ip_csum_good(pkt));
 }
 } else if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
 ok = extract_l3_ipv6(>key, l3, dp_packet_l3_size(pkt), NULL);
@@ -3402,7 +3398,9 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
conn_lookup_ctx *ctx,
 }
 if (seq_skew) {
 ip_len = ntohs(l3_hdr->ip_tot_len) + seq_skew;
-if (!dp_packet_ol_tx_ipv4(pkt)) {
+if (dp_packet_ol_tx_ip_csum(pkt)) {
+dp_packet_ol_reset_ip_csum_good(pkt);
+} else {
 l3_hdr->ip_csum = recalc_csum16(l3_hdr->ip_csum,
 l3_hdr->ip_tot_len,
 htons(ip_len));
diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index a4ca5a052..369f3561e 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -21,6 +21,7 @@
 #include "dp-packet.h"
 #include "netdev-afxdp.h"
 #include "netdev-dpdk.h"
+#include "netdev-provider.h"
 #include "openvswitch/dynamic-string.h"
 #include "util.h"
 
@@ -506,3 +507,14 @@ dp_packet_resize_l2(struct dp_packet *p, int increment)
 dp_packet_adjust_layer_offset(>l2_5_ofs, increment);
 return dp_packet_data(p);
 }
+
+/* Checks if the packet 'p' is compatible with netdev_ol_flags 'flags'
+ * and if not, update the packet with the software fall back. */
+void
+dp_packet_ol_send_prepare(struct dp_packet *p, const uint64_t flags) {
+if (!dp_packet_ol_ip_csum_good(p) && dp_packet_ol_tx_ip_csum(p)
+&& !(flags & NETDEV_OFFLOAD_TX_IPV4_CSUM)) {
+dp_packet_ip_set_header_csum(p);
+dp_packet_ol_set_ip_csum_good(p);
+}
+}
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index ac160985d..278be172e 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -25,6 +25,7 @@
 #include 
 #endif
 
+#include "csum.h"
 #include "netdev-afxdp.h"
 #include "netdev-dpdk.h"
 #include "openvswitch/list.h"
@@ -75,12 +76,14 @@ enum dp_packet_offload_mask {
 DEF_OL_FLAG(DP_PACKET_OL_TX_IPV4, PKT_TX_IPV4, 0x80),
 /* Offloaded packet is IPv6. */
 DEF_OL_FLAG(DP_PACKET_OL_TX_IPV6, PKT_TX_IPV6, 0x100),
+/* Offload IP checksum. */
+DEF_OL_FLAG(DP_PACKET_OL_TX_IP_CSUM, PKT_TX_IP_CKSUM, 0x200),
 /* Offload TCP checksum. */
-DEF_OL_FLAG(DP_PACKET_OL_TX_TCP_CSUM, PKT_TX_TCP_CKSUM, 0x200),
+DEF_OL_FLAG(DP_PACKET_OL_TX_TCP_CSUM, PKT_TX_TCP_CKSUM, 0x400),

[ovs-dev] [[PATCH RFC] 12/17] Show netdev offloading flags.

2021-12-07 Thread Flavio Leitner

Add a new command to show the offloading features of
each data path port.

Signed-off-by: Flavio Leitner 
---
 lib/dpif-netdev-unixctl.man |  5 
 lib/dpif-netdev.c   | 58 +
 lib/netdev-provider.h   |  3 ++
 lib/netdev.c| 35 ++
 tests/dpif-netdev.at| 21 ++
 5 files changed, 122 insertions(+)

diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man
index 607750bad..da64f89d6 100644
--- a/lib/dpif-netdev-unixctl.man
+++ b/lib/dpif-netdev-unixctl.man
@@ -260,3 +260,8 @@ PMDs in the case where no value is specified.  By default 
"scalar" is used.
 \fIstudy_cnt\fR defaults to 128 and indicates the number of packets that the
 "study" miniflow implementation must parse before choosing an optimal
 implementation.
+.IP "\fBdpif-netdev/offload-show\fR [\fIdp\fR] [\fInetdev\fR]"
+Prints the hardware offloading features enabled in netdev \fInetdev\fR
+attached to datapath \fIdp\fR. The datapath \fIdp\fR parameter can be
+omitted if there is only one. All netdev ports are printed if the
+parameter \fInetdev\fR is omitted.
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 69d7ec26e..a525ab1e9 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -1491,6 +1491,61 @@ dpif_netdev_bond_show(struct unixctl_conn *conn, int 
argc,
 ds_destroy();
 }
 
+static void
+dpif_netdev_offload_show(struct unixctl_conn *conn, int argc,
+ const char *argv[], void *aux OVS_UNUSED)
+{
+struct ds reply = DS_EMPTY_INITIALIZER;
+const char *netdev_name = NULL;
+struct dp_netdev *dp = NULL;
+struct dp_netdev_port *port;
+
+ovs_mutex_lock(_netdev_mutex);
+if (argc == 3) {
+dp = shash_find_data(_netdevs, argv[1]);
+netdev_name = argv[2];
+} else if (argc == 2) {
+dp = shash_find_data(_netdevs, argv[1]);
+if (!dp && shash_count(_netdevs) == 1) {
+/* There's only one datapath. */
+dp = shash_first(_netdevs)->data;
+netdev_name = argv[1];
+}
+} else if (shash_count(_netdevs) == 1) {
+/* There's only one datapath. */
+dp = shash_first(_netdevs)->data;
+}
+
+if (!dp) {
+ovs_mutex_unlock(_netdev_mutex);
+unixctl_command_reply_error(conn,
+"please specify an existing datapath");
+return;
+}
+
+ovs_mutex_lock(>port_mutex);
+HMAP_FOR_EACH (port, node, >ports) {
+if (netdev_name) {
+/* find the port and dump the info */
+if (!strcmp(netdev_get_name(port->netdev), netdev_name)) {
+ds_put_format(, "%s: ", netdev_get_name(port->netdev));
+netdev_ol_flags_to_string(, port->netdev);
+ds_put_format(, "\n");
+break;
+}
+} else {
+ds_put_format(, "%s: ", netdev_get_name(port->netdev));
+netdev_ol_flags_to_string(, port->netdev);
+ds_put_format(, "\n");
+}
+}
+
+ovs_mutex_unlock(>port_mutex);
+ovs_mutex_unlock(_netdev_mutex);
+unixctl_command_reply(conn, ds_cstr());
+ds_destroy();
+}
+
 
 static int
 dpif_netdev_init(void)
@@ -1547,6 +1602,9 @@ dpif_netdev_init(void)
 unixctl_command_register("dpif-netdev/miniflow-parser-get", "",
  0, 0, dpif_miniflow_extract_impl_get,
  NULL);
+unixctl_command_register("dpif-netdev/offload-show", "[dp] [netdev]",
+ 0, 2, dpif_netdev_offload_show,
+ NULL);
 return 0;
 }
 
diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
index 0a8538615..5489ebbb8 100644
--- a/lib/netdev-provider.h
+++ b/lib/netdev-provider.h
@@ -37,6 +37,7 @@ extern "C" {
 struct netdev_tnl_build_header_params;
 #define NETDEV_NUMA_UNSPEC OVS_NUMA_UNSPEC
 
+/* Keep this enum updated with translation to string below. */
 enum netdev_ol_flags {
 NETDEV_OFFLOAD_TX_IPV4_CSUM = 1 << 0,
 NETDEV_OFFLOAD_TX_TCP_CSUM = 1 << 1,
@@ -45,6 +46,8 @@ enum netdev_ol_flags {
 NETDEV_OFFLOAD_TX_TCP_TSO = 1 << 4,
 };
 
+void netdev_ol_flags_to_string(struct ds *, const struct netdev *);
+
 /* A network device (e.g. an Ethernet device).
  *
  * Network device implementations may read these members but should not modify
diff --git a/lib/netdev.c b/lib/netdev.c
index 9043d5aaf..5bde9c1c9 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -2298,3 +2298,38 @@ netdev_free_custom_stats_counters(struct 
netdev_custom_stats *custom_stats)
 }
 }
 }
+
+void
+netdev_ol_flags_to_string(struct ds *string, const struct netdev *netdev)
+{
+/* Sort by dependency, if any. */
+if (netdev->ol_flags & NETDEV_OFFLOAD_TX_IPV4_CSUM) {
+ds_put_format(string, "ip_csum: on, ");
+} else {
+ds_put_format(string, "ip_csum: off, ");
+}
+
+if (netdev->ol_flags & NETDEV_OFFLOAD_TX_TCP_CSUM)

[ovs-dev] [[PATCH RFC] 10/17] dp-packet: Add _ol_ to functions using OL flags.

2021-12-07 Thread Flavio Leitner

This helps to identify when it is about the flags or
the packet itself.

Signed-off-by: Flavio Leitner 
---
 lib/conntrack.c |  8 
 lib/dp-packet.c |  2 +-
 lib/dp-packet.h | 10 +-
 lib/ipf.c   |  4 ++--
 lib/netdev-native-tnl.c |  4 ++--
 lib/netdev.c|  2 +-
 lib/packets.c   |  2 +-
 7 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 2f9b17670..2392a2ea4 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2089,12 +2089,12 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 ctx->key.dl_type = dl_type;
 
 if (ctx->key.dl_type == htons(ETH_TYPE_IP)) {
-bool hwol_bad_l3_csum = dp_packet_ip_csum_bad(pkt);
+bool hwol_bad_l3_csum = dp_packet_ol_ip_csum_bad(pkt);
 if (hwol_bad_l3_csum) {
 ok = false;
 COVERAGE_INC(conntrack_l3csum_err);
 } else {
-bool hwol_good_l3_csum = dp_packet_ip_csum_good(pkt)
+bool hwol_good_l3_csum = dp_packet_ol_ip_csum_good(pkt)
  || dp_packet_ol_tx_ipv4(pkt);
 /* Validate the checksum only when hwol is not supported. */
 ok = extract_l3_ipv4(>key, l3, dp_packet_l3_size(pkt), NULL,
@@ -2107,9 +2107,9 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 }
 
 if (ok) {
-bool hwol_bad_l4_csum = dp_packet_l4_csum_bad(pkt);
+bool hwol_bad_l4_csum = dp_packet_ol_l4_csum_bad(pkt);
 if (!hwol_bad_l4_csum) {
-bool  hwol_good_l4_csum = dp_packet_l4_csum_good(pkt)
+bool  hwol_good_l4_csum = dp_packet_ol_l4_csum_good(pkt)
   || dp_packet_ol_tx_l4_csum(pkt);
 /* Validate the checksum only when hwol is not supported. */
 if (extract_l4(>key, l4, dp_packet_l4_size(pkt),
diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index b4ee8c33c..a4ca5a052 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -32,7 +32,7 @@ dp_packet_init__(struct dp_packet *p, size_t allocated, enum 
dp_packet_source so
 dp_packet_reset_offsets(p);
 pkt_metadata_init(>md, 0);
 dp_packet_reset_cutlen(p);
-dp_packet_reset_offload(p);
+dp_packet_ol_reset(p);
 /* Initialize implementation-specific fields of dp_packet. */
 dp_packet_init_specific(p);
 /* By default assume the packet type to be Ethernet. */
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index acb236a7d..ac160985d 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -933,7 +933,7 @@ dp_packet_rss_valid(const struct dp_packet *p)
 }
 
 static inline void
-dp_packet_reset_offload(struct dp_packet *p)
+dp_packet_ol_reset(struct dp_packet *p)
 {
 *dp_packet_ol_flags_ptr(p) &= ~DP_PACKET_OL_SUPPORTED_MASK;
 }
@@ -1049,28 +1049,28 @@ dp_packet_ol_set_tcp_seg(struct dp_packet *p)
 }
 
 static inline bool
-dp_packet_ip_csum_good(const struct dp_packet *p)
+dp_packet_ol_ip_csum_good(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_IP_CSUM_MASK) ==
 DP_PACKET_OL_RX_IP_CSUM_GOOD;
 }
 
 static inline bool
-dp_packet_ip_csum_bad(const struct dp_packet *p)
+dp_packet_ol_ip_csum_bad(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_IP_CSUM_MASK) ==
 DP_PACKET_OL_RX_IP_CSUM_BAD;
 }
 
 static inline bool
-dp_packet_l4_csum_good(const struct dp_packet *p)
+dp_packet_ol_l4_csum_good(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_L4_CSUM_MASK) ==
 DP_PACKET_OL_RX_L4_CSUM_GOOD;
 }
 
 static inline bool
-dp_packet_l4_csum_bad(const struct dp_packet *p)
+dp_packet_ol_l4_csum_bad(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_L4_CSUM_MASK) ==
 DP_PACKET_OL_RX_L4_CSUM_BAD;
diff --git a/lib/ipf.c b/lib/ipf.c
index fd40e32c4..e78559491 100644
--- a/lib/ipf.c
+++ b/lib/ipf.c
@@ -574,7 +574,7 @@ ipf_list_state_transition(struct ipf *ipf, struct ipf_list 
*ipf_list,
 static bool
 ipf_is_valid_v4_frag(struct ipf *ipf, struct dp_packet *pkt)
 {
-if (OVS_UNLIKELY(dp_packet_ip_csum_bad(pkt))) {
+if (OVS_UNLIKELY(dp_packet_ol_ip_csum_bad(pkt))) {
 COVERAGE_INC(ipf_l3csum_err);
 goto invalid_pkt;
 }
@@ -608,7 +608,7 @@ ipf_is_valid_v4_frag(struct ipf *ipf, struct dp_packet *pkt)
 goto invalid_pkt;
 }
 
-if (OVS_UNLIKELY(!dp_packet_ip_csum_good(pkt)
+if (OVS_UNLIKELY(!dp_packet_ol_ip_csum_good(pkt)
  && !dp_packet_ol_tx_ipv4(pkt)
  && csum(l3, ip_hdr_len) != 0)) {
 COVERAGE_INC(ipf_l3csum_err);
diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index 40705e190..48f13b4bd 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -88,7 +88,7 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet,

[ovs-dev] [[PATCH RFC] 11/17] Document netdev offload.

2021-12-07 Thread Flavio Leitner

Document the implementation of netdev hardware offloading
in userspace datapath.

Signed-off-by: Flavio Leitner 
---
 Documentation/automake.mk |  1 +
 Documentation/topics/index.rst|  1 +
 Documentation/topics/nic-offloads.rst | 95 +++
 3 files changed, 97 insertions(+)
 create mode 100644 Documentation/topics/nic-offloads.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index 137cc57c5..b3da74d4d 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -50,6 +50,7 @@ DOC_SOURCE = \
Documentation/topics/integration.rst \
Documentation/topics/language-bindings.rst \
Documentation/topics/networking-namespaces.rst \
+   Documentation/topics/nic-offloads.rst \
Documentation/topics/openflow.rst \
Documentation/topics/ovs-extensions.rst \
Documentation/topics/ovsdb-relay.rst \
diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst
index d8ccbd757..0e402d978 100644
--- a/Documentation/topics/index.rst
+++ b/Documentation/topics/index.rst
@@ -44,6 +44,7 @@ OVS
openflow
bonding
networking-namespaces
+   nic-offloads
ovsdb-relay
ovsdb-replication
dpdk/index
diff --git a/Documentation/topics/nic-offloads.rst 
b/Documentation/topics/nic-offloads.rst
new file mode 100644
index 0..5959c65ad
--- /dev/null
+++ b/Documentation/topics/nic-offloads.rst
@@ -0,0 +1,95 @@
+..
+  Licensed under the Apache License, Version 2.0 (the "License"); you may
+  not use this file except in compliance with the License. You may obtain
+  a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+  License for the specific language governing permissions and limitations
+  under the License.
+
+  Convention for heading levels in Open vSwitch documentation:
+
+  ===  Heading 0 (reserved for the title in a document)
+  ---  Heading 1
+  ~~~  Heading 2
+  +++  Heading 3
+  '''  Heading 4
+
+  Avoid deeper levels because they do not render well.
+
+
+NIC Offloads
+
+
+This document explains the internals of Open vSwitch support for NIC offloads.
+
+Design
+--
+
+The Open vSwitch should strive to forward packets as they arrive regardless
+if the checksum is correct, for example. However, it cannot fix existing
+problems. Therefore, when the packet has the checksum verified or it the
+packet is known to be good, the checksum calculation can be offloaded to
+the NIC, otherwise updates can be made as long as the previous situation
+doesn't change. For example, a packet has corrupted IP checksum can be
+accepted, a flow rule can change the IP destination address to another
+address. In that case, OVS needs to partially recompute the checksum
+instead of offloading or calculate all of it again which would fix the
+existing issue.
+
+The drivers can set flags indicating if the checksum is good or bad.
+The checksum is considered unverified if no flag is set.
+
+When a packet ingress the data path with good checksum, OVS should
+enable checksum offload by default. This allows the data path to
+postpone checksum updates until the packet egress the data path.
+
+When a packet egress the data path, the packet flags and the egress
+port flags are verified to make sure all required NIC offload
+features to send out the packet are available. If not, the data
+path will fall back to equivalent software implementation.
+
+
+Drivers
+---
+
+When the driver initiates, it should set the flags to tell the data path
+which offload features are supported. For example, if the driver supports
+IP checksum offloading, then netdev->ol_flags should set the flag
+NETDEV_OFFLOAD_TX_IPV4_CSUM.
+
+
+Rules
+-
+1) OVS should strive to forward all packets regardless of checksum.
+
+2) OVS must not correct a bad packet/checksum.
+
+3) Packet with flag DP_PACKET_OL_RX_IP_CSUM_GOOD means that the
+   IP checksum is present in the packet and it is good.
+
+4) Packet with flag DP_PACKET_OL_RX_IP_CSUM_BAD means that the
+   IP checksum is present in the packet and it is BAD. Extra care
+   should be taken to not fix the packet during data path processing.
+
+5) The ingress packet parser can only set DP_PACKET_OL_TX_IP_CSUM
+   if the packet has DP_PACKET_OL_RX_L4_CKSUM_GOOD to not violate
+   rule #2.
+
+6) Packet with flag DP_PACKET_OL_TX_IPV4 is a IPv4 packet.
+
+7) Packet with flag DP_PACKET_OL_TX_IPV6 is a IPv6 packet.
+
+8) Packet with flag DP_PACKET_OL_TX_IP_CSUM tells the data path
+   to skip updating the IP checksum if the packet is modified. The
+   IP checksum will be calculated by the egress port if that
+   supports IP

[ovs-dev] [[PATCH RFC] 09/17] dp-packet: Rename dp_packet_ol l4 functions.

2021-12-07 Thread Flavio Leitner

Rename to better represent their flags.

Signed-off-by: Flavio Leitner 
---
 lib/dp-packet.h| 21 +++--
 lib/netdev-linux.c | 14 +++---
 lib/netdev.c   | 18 --
 3 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index dfa25e095..acb236a7d 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -956,18 +956,11 @@ dp_packet_set_flow_mark(struct dp_packet *p, uint32_t 
mark)
 *dp_packet_ol_flags_ptr(p) |= DP_PACKET_OL_FLOW_MARK;
 }
 
-/* Returns the L4 cksum offload bitmask. */
-static inline uint64_t
-dp_packet_ol_l4_mask(const struct dp_packet *p)
-{
-return *dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_TX_L4_MASK;
-}
-
 /* Return true if the packet 'p' requested L4 checksum offload. */
 static inline bool
 dp_packet_ol_tx_l4_csum(const struct dp_packet *p)
 {
-return !!dp_packet_ol_l4_mask(p);
+return !!(*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_TX_L4_MASK);
 }
 
 /* Returns 'true' if packet 'p' is marked for TCP segmentation offloading. */
@@ -986,7 +979,7 @@ dp_packet_ol_tx_ipv4(const struct dp_packet *p)
 
 /* Returns 'true' if packet 'p' is marked for TCP checksum offloading. */
 static inline bool
-dp_packet_ol_l4_is_tcp(const struct dp_packet *p)
+dp_packet_ol_tx_tcp_csum(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_TX_L4_MASK) ==
 DP_PACKET_OL_TX_TCP_CSUM;
@@ -994,7 +987,7 @@ dp_packet_ol_l4_is_tcp(const struct dp_packet *p)
 
 /* Returns 'true' if packet 'p' is marked for UDP checksum offloading. */
 static inline bool
-dp_packet_ol_l4_is_udp(struct dp_packet *p)
+dp_packet_ol_tx_udp_csum(struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_TX_L4_MASK) ==
 DP_PACKET_OL_TX_UDP_CSUM;
@@ -1002,7 +995,7 @@ dp_packet_ol_l4_is_udp(struct dp_packet *p)
 
 /* Returns 'true' if packet 'p' is marked for SCTP checksum offloading. */
 static inline bool
-dp_packet_ol_l4_is_sctp(struct dp_packet *p)
+dp_packet_ol_tx_sctp_csum(struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_TX_L4_MASK) ==
 DP_PACKET_OL_TX_SCTP_CSUM;
@@ -1025,7 +1018,7 @@ dp_packet_ol_set_tx_ipv6(struct dp_packet *p)
 /* Mark packet 'p' for TCP checksum offloading.  It implies that either
  * the packet 'p' is marked for IPv4 or IPv6 checksum offloading. */
 static inline void
-dp_packet_ol_set_csum_tcp(struct dp_packet *p)
+dp_packet_ol_set_tx_tcp_csum(struct dp_packet *p)
 {
 *dp_packet_ol_flags_ptr(p) |= DP_PACKET_OL_TX_TCP_CSUM;
 }
@@ -1033,7 +1026,7 @@ dp_packet_ol_set_csum_tcp(struct dp_packet *p)
 /* Mark packet 'p' for UDP checksum offloading.  It implies that either
  * the packet 'p' is marked for IPv4 or IPv6 checksum offloading. */
 static inline void
-dp_packet_ol_set_csum_udp(struct dp_packet *p)
+dp_packet_ol_set_tx_udp_csum(struct dp_packet *p)
 {
 *dp_packet_ol_flags_ptr(p) |= DP_PACKET_OL_TX_UDP_CSUM;
 }
@@ -1041,7 +1034,7 @@ dp_packet_ol_set_csum_udp(struct dp_packet *p)
 /* Mark packet 'p' for SCTP checksum offloading.  It implies that either
  * the packet 'p' is marked for IPv4 or IPv6 checksum offloading. */
 static inline void
-dp_packet_ol_set_csum_sctp(struct dp_packet *p)
+dp_packet_ol_set_tx_sctp_csum(struct dp_packet *p)
 {
 *dp_packet_ol_flags_ptr(p) |= DP_PACKET_OL_TX_SCTP_CSUM;
 }
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 696a86db2..82f9a0758 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -6637,11 +6637,11 @@ netdev_linux_parse_vnet_hdr(struct dp_packet *b)
 
 if (vnet->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
 if (l4proto == IPPROTO_TCP) {
-dp_packet_ol_set_csum_tcp(b);
+dp_packet_ol_set_tx_tcp_csum(b);
 } else if (l4proto == IPPROTO_UDP) {
-dp_packet_ol_set_csum_udp(b);
+dp_packet_ol_set_tx_udp_csum(b);
 } else if (l4proto == IPPROTO_SCTP) {
-dp_packet_ol_set_csum_sctp(b);
+dp_packet_ol_set_tx_sctp_csum(b);
 }
 }
 
@@ -6681,18 +6681,18 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int 
mtu)
 vnet->flags = VIRTIO_NET_HDR_GSO_NONE;
 }
 
-if (dp_packet_ol_l4_mask(b)) {
+if (dp_packet_ol_tx_l4_csum(b)) {
 vnet->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
 vnet->csum_start = (OVS_FORCE __virtio16)((char *)dp_packet_l4(b)
   - (char *)dp_packet_eth(b));
 
-if (dp_packet_ol_l4_is_tcp(b)) {
+if (dp_packet_ol_tx_tcp_csum(b)) {
 vnet->csum_offset = (OVS_FORCE __virtio16) __builtin_offsetof(
 struct tcp_header, tcp_csum);
-} else if (dp_packet_ol_l4_is_udp(b)) {
+} else if (dp_packet_ol_tx_udp_csum(b)) {
 vnet->csum_offset = (OVS_FORCE __virtio16) __builtin_offsetof(
 struct udp_header, udp_csum);
-} else if

[ovs-dev] [[PATCH RFC] 08/17] dp-packet: Rename dp_packet_ol_is_ipv4.

2021-12-07 Thread Flavio Leitner

Rename to dp_packet_ol_tx_ipv4 to align the flag.

Signed-off-by: Flavio Leitner 
---
 lib/conntrack.c| 4 ++--
 lib/dp-packet.h| 2 +-
 lib/ipf.c  | 6 +++---
 lib/netdev-linux.c | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 24234e672..2f9b17670 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2095,7 +2095,7 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 COVERAGE_INC(conntrack_l3csum_err);
 } else {
 bool hwol_good_l3_csum = dp_packet_ip_csum_good(pkt)
- || dp_packet_ol_is_ipv4(pkt);
+ || dp_packet_ol_tx_ipv4(pkt);
 /* Validate the checksum only when hwol is not supported. */
 ok = extract_l3_ipv4(>key, l3, dp_packet_l3_size(pkt), NULL,
  !hwol_good_l3_csum);
@@ -3402,7 +3402,7 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
conn_lookup_ctx *ctx,
 }
 if (seq_skew) {
 ip_len = ntohs(l3_hdr->ip_tot_len) + seq_skew;
-if (!dp_packet_ol_is_ipv4(pkt)) {
+if (!dp_packet_ol_tx_ipv4(pkt)) {
 l3_hdr->ip_csum = recalc_csum16(l3_hdr->ip_csum,
 l3_hdr->ip_tot_len,
 htons(ip_len));
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 614ebbb4d..dfa25e095 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -979,7 +979,7 @@ dp_packet_ol_tcp_seg(const struct dp_packet *p)
 
 /* Returns 'true' if packet 'p' is marked for IPv4 checksum offloading. */
 static inline bool
-dp_packet_ol_is_ipv4(const struct dp_packet *p)
+dp_packet_ol_tx_ipv4(const struct dp_packet *p)
 {
 return !!(*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_TX_IPV4);
 }
diff --git a/lib/ipf.c b/lib/ipf.c
index f290d5d23..fd40e32c4 100644
--- a/lib/ipf.c
+++ b/lib/ipf.c
@@ -433,7 +433,7 @@ ipf_reassemble_v4_frags(struct ipf_list *ipf_list)
 len += rest_len;
 l3 = dp_packet_l3(pkt);
 ovs_be16 new_ip_frag_off = l3->ip_frag_off & ~htons(IP_MORE_FRAGMENTS);
-if (!dp_packet_ol_is_ipv4(pkt)) {
+if (!dp_packet_ol_tx_ipv4(pkt)) {
 l3->ip_csum = recalc_csum16(l3->ip_csum, l3->ip_frag_off,
 new_ip_frag_off);
 l3->ip_csum = recalc_csum16(l3->ip_csum, l3->ip_tot_len, htons(len));
@@ -609,7 +609,7 @@ ipf_is_valid_v4_frag(struct ipf *ipf, struct dp_packet *pkt)
 }
 
 if (OVS_UNLIKELY(!dp_packet_ip_csum_good(pkt)
- && !dp_packet_ol_is_ipv4(pkt)
+ && !dp_packet_ol_tx_ipv4(pkt)
  && csum(l3, ip_hdr_len) != 0)) {
 COVERAGE_INC(ipf_l3csum_err);
 goto invalid_pkt;
@@ -1185,7 +1185,7 @@ ipf_post_execute_reass_pkts(struct ipf *ipf,
 } else {
 struct ip_header *l3_frag = dp_packet_l3(frag_i->pkt);
 struct ip_header *l3_reass = dp_packet_l3(pkt);
-if (!dp_packet_ol_is_ipv4(frag_i->pkt)) {
+if (!dp_packet_ol_tx_ipv4(frag_i->pkt)) {
 ovs_be32 reass_ip =
 get_16aligned_be32(_reass->ip_src);
 ovs_be32 frag_ip =
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 5d0af5a40..696a86db2 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -6671,7 +6671,7 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int 
mtu)
 
 vnet->hdr_len = (OVS_FORCE __virtio16)hdr_len;
 vnet->gso_size = (OVS_FORCE __virtio16)(mtu - hdr_len);
-if (dp_packet_ol_is_ipv4(b)) {
+if (dp_packet_ol_tx_ipv4(b)) {
 vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
 } else {
 vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [[PATCH RFC] 06/17] dp-packet: Use p for packet and b for batch.

2021-12-07 Thread Flavio Leitner

Currently 'p' and 'b' and used for packets, so use
a convention that struct dp_packet is 'p' and
struct dp_packet_batch is 'b'.

Some comments needed new formatting to not pass the
80 column.

Some variables were using 'p' or 'b' were renamed
as well.

There should be no functional change with this patch.

Signed-off-by: Flavio Leitner 
---
 lib/dp-packet.c | 342 
 lib/dp-packet.h | 506 
 2 files changed, 424 insertions(+), 424 deletions(-)

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 72f6d09ac..b4ee8c33c 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -25,58 +25,58 @@
 #include "util.h"
 
 static void
-dp_packet_init__(struct dp_packet *b, size_t allocated, enum dp_packet_source 
source)
-{
-dp_packet_set_allocated(b, allocated);
-b->source = source;
-dp_packet_reset_offsets(b);
-pkt_metadata_init(>md, 0);
-dp_packet_reset_cutlen(b);
-dp_packet_reset_offload(b);
+dp_packet_init__(struct dp_packet *p, size_t allocated, enum dp_packet_source 
source)
+{
+dp_packet_set_allocated(p, allocated);
+p->source = source;
+dp_packet_reset_offsets(p);
+pkt_metadata_init(>md, 0);
+dp_packet_reset_cutlen(p);
+dp_packet_reset_offload(p);
 /* Initialize implementation-specific fields of dp_packet. */
-dp_packet_init_specific(b);
+dp_packet_init_specific(p);
 /* By default assume the packet type to be Ethernet. */
-b->packet_type = htonl(PT_ETH);
+p->packet_type = htonl(PT_ETH);
 }
 
 static void
-dp_packet_use__(struct dp_packet *b, void *base, size_t allocated,
+dp_packet_use__(struct dp_packet *p, void *base, size_t allocated,
  enum dp_packet_source source)
 {
-dp_packet_set_base(b, base);
-dp_packet_set_data(b, base);
-dp_packet_set_size(b, 0);
+dp_packet_set_base(p, base);
+dp_packet_set_data(p, base);
+dp_packet_set_size(p, 0);
 
-dp_packet_init__(b, allocated, source);
+dp_packet_init__(p, allocated, source);
 }
 
-/* Initializes 'b' as an empty dp_packet that contains the 'allocated' bytes of
+/* Initializes 'p' as an empty dp_packet that contains the 'allocated' bytes of
  * memory starting at 'base'.  'base' should be the first byte of a region
- * obtained from malloc().  It will be freed (with free()) if 'b' is resized or
+ * obtained from malloc().  It will be freed (with free()) if 'p' is resized or
  * freed. */
 void
-dp_packet_use(struct dp_packet *b, void *base, size_t allocated)
+dp_packet_use(struct dp_packet *p, void *base, size_t allocated)
 {
-dp_packet_use__(b, base, allocated, DPBUF_MALLOC);
+dp_packet_use__(p, base, allocated, DPBUF_MALLOC);
 }
 
 #if HAVE_AF_XDP
-/* Initialize 'b' as an empty dp_packet that contains
+/* Initialize 'p' as an empty dp_packet that contains
  * memory starting at AF_XDP umem base.
  */
 void
-dp_packet_use_afxdp(struct dp_packet *b, void *data, size_t allocated,
+dp_packet_use_afxdp(struct dp_packet *p, void *data, size_t allocated,
 size_t headroom)
 {
-dp_packet_set_base(b, (char *)data - headroom);
-dp_packet_set_data(b, data);
-dp_packet_set_size(b, 0);
+dp_packet_set_base(p, (char *)data - headroom);
+dp_packet_set_data(p, data);
+dp_packet_set_size(p, 0);
 
-dp_packet_init__(b, allocated, DPBUF_AFXDP);
+dp_packet_init__(p, allocated, DPBUF_AFXDP);
 }
 #endif
 
-/* Initializes 'b' as an empty dp_packet that contains the 'allocated' bytes of
+/* Initializes 'p' as an empty dp_packet that contains the 'allocated' bytes of
  * memory starting at 'base'.  'base' should point to a buffer on the stack.
  * (Nothing actually relies on 'base' being allocated on the stack.  It could
  * be static or malloc()'d memory.  But stack space is the most common use
@@ -91,12 +91,12 @@ dp_packet_use_afxdp(struct dp_packet *b, void *data, size_t 
allocated,
  * on an dp_packet initialized by this function, so that if it expanded into 
the
  * heap, that memory is freed. */
 void
-dp_packet_use_stub(struct dp_packet *b, void *base, size_t allocated)
+dp_packet_use_stub(struct dp_packet *p, void *base, size_t allocated)
 {
-dp_packet_use__(b, base, allocated, DPBUF_STUB);
+dp_packet_use__(p, base, allocated, DPBUF_STUB);
 }
 
-/* Initializes 'b' as an dp_packet whose data starts at 'data' and continues 
for
+/* Initializes 'p' as an dp_packet whose data starts at 'data' and continues 
for
  * 'size' bytes.  This is appropriate for an dp_packet that will be used to
  * inspect existing data, without moving it around or reallocating it, and
  * generally without modifying it at all.
@@ -104,43 +104,43 @@ dp_packet_use_stub(struct dp_packet *b, void *base, 
size_t allocated)
  * An dp_packet operation that requires reallocating data will assert-fail if 
this
  * function was used to initialize it. */
 void
-dp_packet_use_const(struct dp_packet *b, const void *data, size_t size)
+dp_packet_use_const(struct

[ovs-dev] [[PATCH RFC] 07/17] dp-packet: Rename dp_packet_ol_tcp_seg

2021-12-07 Thread Flavio Leitner

Rename to dp_packet_ol_tcp_seg, because that is less
redundant and allows other protocols.

Signed-off-by: Flavio Leitner 
---
 lib/dp-packet.h| 2 +-
 lib/netdev-linux.c | 2 +-
 lib/netdev.c   | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 8b06e457b..614ebbb4d 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -972,7 +972,7 @@ dp_packet_ol_tx_l4_csum(const struct dp_packet *p)
 
 /* Returns 'true' if packet 'p' is marked for TCP segmentation offloading. */
 static inline bool
-dp_packet_ol_is_tso(const struct dp_packet *p)
+dp_packet_ol_tcp_seg(const struct dp_packet *p)
 {
 return !!(*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_TX_TCP_SEG);
 }
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 35e3e1e79..5d0af5a40 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -6665,7 +6665,7 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int 
mtu)
 {
 struct virtio_net_hdr *vnet = dp_packet_push_zeros(b, sizeof *vnet);
 
-if (dp_packet_ol_is_tso(b)) {
+if (dp_packet_ol_tcp_seg(b)) {
 uint16_t hdr_len = ((char *)dp_packet_l4(b) - (char *)dp_packet_eth(b))
 + TCP_HEADER_LEN;
 
diff --git a/lib/netdev.c b/lib/netdev.c
index d087929e5..fb535ed7c 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -794,7 +794,7 @@ netdev_send_prepare_packet(const uint64_t netdev_flags,
 {
 uint64_t l4_mask;
 
-if (dp_packet_ol_is_tso(packet)
+if (dp_packet_ol_tcp_seg(packet)
 && !(netdev_flags & NETDEV_OFFLOAD_TX_TCP_TSO)) {
 /* Fall back to GSO in software. */
 VLOG_ERR_BUF(errormsg, "No TSO support");
@@ -960,7 +960,7 @@ netdev_push_header(const struct netdev *netdev,
 size_t i, size = dp_packet_batch_size(batch);
 
 DP_PACKET_BATCH_REFILL_FOR_EACH (i, size, packet, batch) {
-if (OVS_UNLIKELY(dp_packet_ol_is_tso(packet)
+if (OVS_UNLIKELY(dp_packet_ol_tcp_seg(packet)
  || dp_packet_ol_l4_mask(packet))) {
 COVERAGE_INC(netdev_push_header_drops);
 dp_packet_delete(packet);
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [[PATCH RFC] 05/17] Rename dp_packet_hwol to dp_packet_ol.

2021-12-07 Thread Flavio Leitner

The name correlates better with the flag names.

Signed-off-by: Flavio Leitner 
---
 lib/conntrack.c|  8 
 lib/dp-packet.h| 28 ++--
 lib/ipf.c  |  6 +++---
 lib/netdev-dpdk.c  | 24 
 lib/netdev-linux.c | 24 
 lib/netdev.c   | 14 +++---
 6 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 907c5ed30..24234e672 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2095,7 +2095,7 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 COVERAGE_INC(conntrack_l3csum_err);
 } else {
 bool hwol_good_l3_csum = dp_packet_ip_csum_good(pkt)
- || dp_packet_hwol_is_ipv4(pkt);
+ || dp_packet_ol_is_ipv4(pkt);
 /* Validate the checksum only when hwol is not supported. */
 ok = extract_l3_ipv4(>key, l3, dp_packet_l3_size(pkt), NULL,
  !hwol_good_l3_csum);
@@ -2110,7 +2110,7 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 bool hwol_bad_l4_csum = dp_packet_l4_csum_bad(pkt);
 if (!hwol_bad_l4_csum) {
 bool  hwol_good_l4_csum = dp_packet_l4_csum_good(pkt)
-  || dp_packet_hwol_tx_l4_csum(pkt);
+  || dp_packet_ol_tx_l4_csum(pkt);
 /* Validate the checksum only when hwol is not supported. */
 if (extract_l4(>key, l4, dp_packet_l4_size(pkt),
>icmp_related, l3, !hwol_good_l4_csum,
@@ -3402,7 +3402,7 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
conn_lookup_ctx *ctx,
 }
 if (seq_skew) {
 ip_len = ntohs(l3_hdr->ip_tot_len) + seq_skew;
-if (!dp_packet_hwol_is_ipv4(pkt)) {
+if (!dp_packet_ol_is_ipv4(pkt)) {
 l3_hdr->ip_csum = recalc_csum16(l3_hdr->ip_csum,
 l3_hdr->ip_tot_len,
 htons(ip_len));
@@ -3424,7 +3424,7 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
conn_lookup_ctx *ctx,
 }
 
 th->tcp_csum = 0;
-if (!dp_packet_hwol_tx_l4_csum(pkt)) {
+if (!dp_packet_ol_tx_l4_csum(pkt)) {
 if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
 th->tcp_csum = packet_csum_upperlayer6(nh6, th, ctx->key.nw_proto,
dp_packet_l4_size(pkt));
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 5540680cf..82eae87b6 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -958,35 +958,35 @@ dp_packet_set_flow_mark(struct dp_packet *p, uint32_t 
mark)
 
 /* Returns the L4 cksum offload bitmask. */
 static inline uint64_t
-dp_packet_hwol_l4_mask(const struct dp_packet *b)
+dp_packet_ol_l4_mask(const struct dp_packet *b)
 {
 return *dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_L4_MASK;
 }
 
 /* Return true if the packet 'b' requested L4 checksum offload. */
 static inline bool
-dp_packet_hwol_tx_l4_csum(const struct dp_packet *b)
+dp_packet_ol_tx_l4_csum(const struct dp_packet *b)
 {
-return !!dp_packet_hwol_l4_mask(b);
+return !!dp_packet_ol_l4_mask(b);
 }
 
 /* Returns 'true' if packet 'b' is marked for TCP segmentation offloading. */
 static inline bool
-dp_packet_hwol_is_tso(const struct dp_packet *b)
+dp_packet_ol_is_tso(const struct dp_packet *b)
 {
 return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_TCP_SEG);
 }
 
 /* Returns 'true' if packet 'b' is marked for IPv4 checksum offloading. */
 static inline bool
-dp_packet_hwol_is_ipv4(const struct dp_packet *b)
+dp_packet_ol_is_ipv4(const struct dp_packet *b)
 {
 return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_IPV4);
 }
 
 /* Returns 'true' if packet 'b' is marked for TCP checksum offloading. */
 static inline bool
-dp_packet_hwol_l4_is_tcp(const struct dp_packet *b)
+dp_packet_ol_l4_is_tcp(const struct dp_packet *b)
 {
 return (*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_L4_MASK) ==
 DP_PACKET_OL_TX_TCP_CSUM;
@@ -994,7 +994,7 @@ dp_packet_hwol_l4_is_tcp(const struct dp_packet *b)
 
 /* Returns 'true' if packet 'b' is marked for UDP checksum offloading. */
 static inline bool
-dp_packet_hwol_l4_is_udp(struct dp_packet *b)
+dp_packet_ol_l4_is_udp(struct dp_packet *b)
 {
 return (*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_L4_MASK) ==
 DP_PACKET_OL_TX_UDP_CSUM;
@@ -1002,7 +1002,7 @@ dp_packet_hwol_l4_is_udp(struct dp_packet *b)
 
 /* Returns 'true' if packet 'b' is marked for SCTP checksum offloading. */
 static inline bool
-dp_packet_hwol_l4_is_sctp(struct dp_packet *b)
+dp_packet_ol_l4_is_sctp(struct dp_packet *b)
 {
 return (*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_L4_MASK) ==

[ovs-dev] [[PATCH RFC] 04/17] Rename hwol csum valid to good.

2021-12-07 Thread Flavio Leitner

This represents better the state and use the same
convention as the flags.

Signed-off-by: Flavio Leitner 
---
 lib/conntrack.c | 4 ++--
 lib/dp-packet.h | 4 ++--
 lib/ipf.c   | 2 +-
 lib/netdev-native-tnl.c | 4 ++--
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index e84ec4aee..907c5ed30 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2094,7 +2094,7 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 ok = false;
 COVERAGE_INC(conntrack_l3csum_err);
 } else {
-bool hwol_good_l3_csum = dp_packet_ip_csum_valid(pkt)
+bool hwol_good_l3_csum = dp_packet_ip_csum_good(pkt)
  || dp_packet_hwol_is_ipv4(pkt);
 /* Validate the checksum only when hwol is not supported. */
 ok = extract_l3_ipv4(>key, l3, dp_packet_l3_size(pkt), NULL,
@@ -2109,7 +2109,7 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 if (ok) {
 bool hwol_bad_l4_csum = dp_packet_l4_csum_bad(pkt);
 if (!hwol_bad_l4_csum) {
-bool  hwol_good_l4_csum = dp_packet_l4_csum_valid(pkt)
+bool  hwol_good_l4_csum = dp_packet_l4_csum_good(pkt)
   || dp_packet_hwol_tx_l4_csum(pkt);
 /* Validate the checksum only when hwol is not supported. */
 if (extract_l4(>key, l4, dp_packet_l4_size(pkt),
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 18faa79c0..5540680cf 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -1056,7 +1056,7 @@ dp_packet_hwol_set_tcp_seg(struct dp_packet *b)
 }
 
 static inline bool
-dp_packet_ip_csum_valid(const struct dp_packet *p)
+dp_packet_ip_csum_good(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_IP_CSUM_MASK) ==
 DP_PACKET_OL_RX_IP_CSUM_GOOD;
@@ -1070,7 +1070,7 @@ dp_packet_ip_csum_bad(const struct dp_packet *p)
 }
 
 static inline bool
-dp_packet_l4_csum_valid(const struct dp_packet *p)
+dp_packet_l4_csum_good(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_L4_CSUM_MASK) ==
 DP_PACKET_OL_RX_L4_CSUM_GOOD;
diff --git a/lib/ipf.c b/lib/ipf.c
index 013c4cfba..390fbe312 100644
--- a/lib/ipf.c
+++ b/lib/ipf.c
@@ -608,7 +608,7 @@ ipf_is_valid_v4_frag(struct ipf *ipf, struct dp_packet *pkt)
 goto invalid_pkt;
 }
 
-if (OVS_UNLIKELY(!dp_packet_ip_csum_valid(pkt)
+if (OVS_UNLIKELY(!dp_packet_ip_csum_good(pkt)
  && !dp_packet_hwol_is_ipv4(pkt)
  && csum(l3, ip_hdr_len) != 0)) {
 COVERAGE_INC(ipf_l3csum_err);
diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index 2de424105..40705e190 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -88,7 +88,7 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet, struct 
flow_tnl *tnl,
 
 ovs_be32 ip_src, ip_dst;
 
-if (OVS_UNLIKELY(!dp_packet_ip_csum_valid(packet))) {
+if (OVS_UNLIKELY(!dp_packet_ip_csum_good(packet))) {
 if (csum(ip, IP_IHL(ip->ip_ihl_ver) * 4)) {
 VLOG_WARN_RL(_rl, "ip packet has invalid checksum");
 return NULL;
@@ -190,7 +190,7 @@ udp_extract_tnl_md(struct dp_packet *packet, struct 
flow_tnl *tnl,
 }
 
 if (udp->udp_csum) {
-if (OVS_UNLIKELY(!dp_packet_l4_csum_valid(packet))) {
+if (OVS_UNLIKELY(!dp_packet_l4_csum_good(packet))) {
 uint32_t csum;
 if (netdev_tnl_is_header_ipv6(dp_packet_data(packet))) {
 csum = packet_csum_pseudoheader6(dp_packet_l3(packet));
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [[PATCH RFC] 03/17] Prefix netdev offload flags with NETDEV_OFFLOAD_.

2021-12-07 Thread Flavio Leitner

Use the 'NETDEV_OFFLOAD_' prefix in the flags to indicate
we are talking about hardware offloading capabilities.

Signed-off-by: Flavio Leitner 
---
 lib/netdev-dpdk.c | 20 ++--
 lib/netdev-linux.c| 10 +-
 lib/netdev-provider.h | 10 +-
 lib/netdev.c  |  8 
 4 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 6fbf19ada..c4618eb22 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -5006,12 +5006,12 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
 
 err = dpdk_eth_dev_init(dev);
 if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_TCP_TSO;
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_TCP_CSUM;
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_UDP_CSUM;
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_IPV4_CSUM;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_TCP_TSO;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_TCP_CSUM;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_UDP_CSUM;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_IPV4_CSUM;
 if (dev->hw_ol_features & NETDEV_TX_SCTP_CHECKSUM_OFFLOAD) {
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_SCTP_CSUM;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_SCTP_CSUM;
 }
 }
 
@@ -5153,11 +5153,11 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev 
*netdev)
 }
 
 if (userspace_tso_enabled()) {
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_TCP_TSO;
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_TCP_CSUM;
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_UDP_CSUM;
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_SCTP_CSUM;
-netdev->ol_flags |= NETDEV_TX_OFFLOAD_IPV4_CSUM;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_TCP_TSO;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_TCP_CSUM;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_UDP_CSUM;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_SCTP_CSUM;
+netdev->ol_flags |= NETDEV_OFFLOAD_TX_IPV4_CSUM;
 vhost_unsup_flags = 1ULL << VIRTIO_NET_F_HOST_ECN
 | 1ULL << VIRTIO_NET_F_HOST_UFO;
 } else {
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index e4b7c72f8..30d552170 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -928,11 +928,11 @@ netdev_linux_common_construct(struct netdev *netdev_)
 ovs_mutex_init(>mutex);
 
 if (userspace_tso_enabled()) {
-netdev_->ol_flags |= NETDEV_TX_OFFLOAD_TCP_TSO;
-netdev_->ol_flags |= NETDEV_TX_OFFLOAD_TCP_CSUM;
-netdev_->ol_flags |= NETDEV_TX_OFFLOAD_UDP_CSUM;
-netdev_->ol_flags |= NETDEV_TX_OFFLOAD_SCTP_CSUM;
-netdev_->ol_flags |= NETDEV_TX_OFFLOAD_IPV4_CSUM;
+netdev_->ol_flags |= NETDEV_OFFLOAD_TX_TCP_TSO;
+netdev_->ol_flags |= NETDEV_OFFLOAD_TX_TCP_CSUM;
+netdev_->ol_flags |= NETDEV_OFFLOAD_TX_UDP_CSUM;
+netdev_->ol_flags |= NETDEV_OFFLOAD_TX_SCTP_CSUM;
+netdev_->ol_flags |= NETDEV_OFFLOAD_TX_IPV4_CSUM;
 }
 
 return 0;
diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
index 08bf8b871..0a8538615 100644
--- a/lib/netdev-provider.h
+++ b/lib/netdev-provider.h
@@ -38,11 +38,11 @@ struct netdev_tnl_build_header_params;
 #define NETDEV_NUMA_UNSPEC OVS_NUMA_UNSPEC
 
 enum netdev_ol_flags {
-NETDEV_TX_OFFLOAD_IPV4_CSUM = 1 << 0,
-NETDEV_TX_OFFLOAD_TCP_CSUM = 1 << 1,
-NETDEV_TX_OFFLOAD_UDP_CSUM = 1 << 2,
-NETDEV_TX_OFFLOAD_SCTP_CSUM = 1 << 3,
-NETDEV_TX_OFFLOAD_TCP_TSO = 1 << 4,
+NETDEV_OFFLOAD_TX_IPV4_CSUM = 1 << 0,
+NETDEV_OFFLOAD_TX_TCP_CSUM = 1 << 1,
+NETDEV_OFFLOAD_TX_UDP_CSUM = 1 << 2,
+NETDEV_OFFLOAD_TX_SCTP_CSUM = 1 << 3,
+NETDEV_OFFLOAD_TX_TCP_TSO = 1 << 4,
 };
 
 /* A network device (e.g. an Ethernet device).
diff --git a/lib/netdev.c b/lib/netdev.c
index e9b2bbe83..a06138aca 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -795,7 +795,7 @@ netdev_send_prepare_packet(const uint64_t netdev_flags,
 uint64_t l4_mask;
 
 if (dp_packet_hwol_is_tso(packet)
-&& !(netdev_flags & NETDEV_TX_OFFLOAD_TCP_TSO)) {
+&& !(netdev_flags & NETDEV_OFFLOAD_TX_TCP_TSO)) {
 /* Fall back to GSO in software. */
 VLOG_ERR_BUF(errormsg, "No TSO support");
 return false;
@@ -804,19 +804,19 @@ netdev_send_prepare_packet(const uint64_t netdev_flags,
 l4_mask = dp_packet_hwol_l4_mask(packet);
 if (l4_mask) {
 if (dp_packet_hwol_l4_is_tcp(packet)) {
-if (!(netdev_flags & NETDEV_TX_OFFLOAD_TCP_CSUM)) {
+if (!(netdev_flags & NETDEV_OFFLOAD_TX_TCP_CSUM)) {
 /* Fall back to TCP csum in software. */
 VLOG_ERR_BUF(errormsg, "No TCP checksum support");
 return false;
 }
 } else if (dp_packet_hwol_l4_is_udp(packet)) {
-if (!(netdev_flags & NETDEV_TX_OFFLOAD_UDP_CSUM)) {
+

[ovs-dev] [[PATCH RFC] 02/17] Rename flags with CKSUM to CSUM.

2021-12-07 Thread Flavio Leitner

It seems csum is more common and shorter.

Signed-off-by: Flavio Leitner 
---
 lib/dp-packet.h   | 72 +--
 lib/netdev-dpdk.c | 16 +-
 lib/netdev-linux.c|  8 ++---
 lib/netdev-provider.h |  8 ++---
 lib/netdev.c  |  6 ++--
 5 files changed, 55 insertions(+), 55 deletions(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index ee8451496..18faa79c0 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -62,13 +62,13 @@ enum dp_packet_offload_mask {
 /* Is the 'flow_mark' valid? */
 DEF_OL_FLAG(DP_PACKET_OL_FLOW_MARK, PKT_RX_FDIR_ID, 0x2),
 /* Bad L4 checksum in the packet. */
-DEF_OL_FLAG(DP_PACKET_OL_RX_L4_CKSUM_BAD, PKT_RX_L4_CKSUM_BAD, 0x4),
+DEF_OL_FLAG(DP_PACKET_OL_RX_L4_CSUM_BAD, PKT_RX_L4_CKSUM_BAD, 0x4),
 /* Bad IP checksum in the packet. */
-DEF_OL_FLAG(DP_PACKET_OL_RX_IP_CKSUM_BAD, PKT_RX_IP_CKSUM_BAD, 0x8),
+DEF_OL_FLAG(DP_PACKET_OL_RX_IP_CSUM_BAD, PKT_RX_IP_CKSUM_BAD, 0x8),
 /* Valid L4 checksum in the packet. */
-DEF_OL_FLAG(DP_PACKET_OL_RX_L4_CKSUM_GOOD, PKT_RX_L4_CKSUM_GOOD, 0x10),
+DEF_OL_FLAG(DP_PACKET_OL_RX_L4_CSUM_GOOD, PKT_RX_L4_CKSUM_GOOD, 0x10),
 /* Valid IP checksum in the packet. */
-DEF_OL_FLAG(DP_PACKET_OL_RX_IP_CKSUM_GOOD, PKT_RX_IP_CKSUM_GOOD, 0x20),
+DEF_OL_FLAG(DP_PACKET_OL_RX_IP_CSUM_GOOD, PKT_RX_IP_CKSUM_GOOD, 0x20),
 /* TCP Segmentation Offload. */
 DEF_OL_FLAG(DP_PACKET_OL_TX_TCP_SEG, PKT_TX_TCP_SEG, 0x40),
 /* Offloaded packet is IPv4. */
@@ -76,34 +76,34 @@ enum dp_packet_offload_mask {
 /* Offloaded packet is IPv6. */
 DEF_OL_FLAG(DP_PACKET_OL_TX_IPV6, PKT_TX_IPV6, 0x100),
 /* Offload TCP checksum. */
-DEF_OL_FLAG(DP_PACKET_OL_TX_TCP_CKSUM, PKT_TX_TCP_CKSUM, 0x200),
+DEF_OL_FLAG(DP_PACKET_OL_TX_TCP_CSUM, PKT_TX_TCP_CKSUM, 0x200),
 /* Offload UDP checksum. */
-DEF_OL_FLAG(DP_PACKET_OL_TX_UDP_CKSUM, PKT_TX_UDP_CKSUM, 0x400),
+DEF_OL_FLAG(DP_PACKET_OL_TX_UDP_CSUM, PKT_TX_UDP_CKSUM, 0x400),
 /* Offload SCTP checksum. */
-DEF_OL_FLAG(DP_PACKET_OL_TX_SCTP_CKSUM, PKT_TX_SCTP_CKSUM, 0x800),
+DEF_OL_FLAG(DP_PACKET_OL_TX_SCTP_CSUM, PKT_TX_SCTP_CKSUM, 0x800),
 /* Adding new field requires adding to DP_PACKET_OL_SUPPORTED_MASK. */
 };
 
 #define DP_PACKET_OL_SUPPORTED_MASK (DP_PACKET_OL_RSS_HASH | \
  DP_PACKET_OL_FLOW_MARK| \
- DP_PACKET_OL_RX_L4_CKSUM_BAD  | \
- DP_PACKET_OL_RX_IP_CKSUM_BAD  | \
- DP_PACKET_OL_RX_L4_CKSUM_GOOD | \
- DP_PACKET_OL_RX_IP_CKSUM_GOOD | \
+ DP_PACKET_OL_RX_L4_CSUM_BAD  | \
+ DP_PACKET_OL_RX_IP_CSUM_BAD  | \
+ DP_PACKET_OL_RX_L4_CSUM_GOOD | \
+ DP_PACKET_OL_RX_IP_CSUM_GOOD | \
  DP_PACKET_OL_TX_TCP_SEG   | \
  DP_PACKET_OL_TX_IPV4  | \
  DP_PACKET_OL_TX_IPV6  | \
- DP_PACKET_OL_TX_TCP_CKSUM | \
- DP_PACKET_OL_TX_UDP_CKSUM | \
- DP_PACKET_OL_TX_SCTP_CKSUM)
-
-#define DP_PACKET_OL_TX_L4_MASK (DP_PACKET_OL_TX_TCP_CKSUM | \
- DP_PACKET_OL_TX_UDP_CKSUM | \
- DP_PACKET_OL_TX_SCTP_CKSUM)
-#define DP_PACKET_OL_RX_IP_CKSUM_MASK (DP_PACKET_OL_RX_IP_CKSUM_GOOD | \
-   DP_PACKET_OL_RX_IP_CKSUM_BAD)
-#define DP_PACKET_OL_RX_L4_CKSUM_MASK (DP_PACKET_OL_RX_L4_CKSUM_GOOD | \
-   DP_PACKET_OL_RX_L4_CKSUM_BAD)
+ DP_PACKET_OL_TX_TCP_CSUM | \
+ DP_PACKET_OL_TX_UDP_CSUM | \
+ DP_PACKET_OL_TX_SCTP_CSUM)
+
+#define DP_PACKET_OL_TX_L4_MASK (DP_PACKET_OL_TX_TCP_CSUM | \
+ DP_PACKET_OL_TX_UDP_CSUM | \
+ DP_PACKET_OL_TX_SCTP_CSUM)
+#define DP_PACKET_OL_RX_IP_CSUM_MASK (DP_PACKET_OL_RX_IP_CSUM_GOOD | \
+   DP_PACKET_OL_RX_IP_CSUM_BAD)
+#define DP_PACKET_OL_RX_L4_CSUM_MASK (DP_PACKET_OL_RX_L4_CSUM_GOOD | \
+   DP_PACKET_OL_RX_L4_CSUM_BAD)
 
 /* Buffer for holding packet data.  A dp_packet is automatically reallocated
  * as necessary if it grows too large for the available memory.
@@ -989,7 +989,7 @@ static inline bool
 dp_packet_hwol_l4_is_tcp(const struct dp_packet *b)
 {
 return (*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_L4_MASK) ==
-DP_PACKET_OL_TX_TCP_CKSUM;
+DP_PACKET_OL_TX_TCP_CSUM;
 }
 
 /* Returns 'true' if

[ovs-dev] [[PATCH RFC] 01/17] Rename checksum to csum in hwol functions.

2021-12-07 Thread Flavio Leitner

It seems csum is more common and shorter.

Signed-off-by: Flavio Leitner 
---
 lib/conntrack.c | 12 ++--
 lib/dp-packet.h | 10 +-
 lib/ipf.c   |  4 ++--
 lib/netdev-native-tnl.c |  4 ++--
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 33a1a9295..e84ec4aee 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2089,12 +2089,12 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 ctx->key.dl_type = dl_type;
 
 if (ctx->key.dl_type == htons(ETH_TYPE_IP)) {
-bool hwol_bad_l3_csum = dp_packet_ip_checksum_bad(pkt);
+bool hwol_bad_l3_csum = dp_packet_ip_csum_bad(pkt);
 if (hwol_bad_l3_csum) {
 ok = false;
 COVERAGE_INC(conntrack_l3csum_err);
 } else {
-bool hwol_good_l3_csum = dp_packet_ip_checksum_valid(pkt)
+bool hwol_good_l3_csum = dp_packet_ip_csum_valid(pkt)
  || dp_packet_hwol_is_ipv4(pkt);
 /* Validate the checksum only when hwol is not supported. */
 ok = extract_l3_ipv4(>key, l3, dp_packet_l3_size(pkt), NULL,
@@ -2107,10 +2107,10 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
 }
 
 if (ok) {
-bool hwol_bad_l4_csum = dp_packet_l4_checksum_bad(pkt);
+bool hwol_bad_l4_csum = dp_packet_l4_csum_bad(pkt);
 if (!hwol_bad_l4_csum) {
-bool  hwol_good_l4_csum = dp_packet_l4_checksum_valid(pkt)
-  || dp_packet_hwol_tx_l4_checksum(pkt);
+bool  hwol_good_l4_csum = dp_packet_l4_csum_valid(pkt)
+  || dp_packet_hwol_tx_l4_csum(pkt);
 /* Validate the checksum only when hwol is not supported. */
 if (extract_l4(>key, l4, dp_packet_l4_size(pkt),
>icmp_related, l3, !hwol_good_l4_csum,
@@ -3424,7 +3424,7 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
conn_lookup_ctx *ctx,
 }
 
 th->tcp_csum = 0;
-if (!dp_packet_hwol_tx_l4_checksum(pkt)) {
+if (!dp_packet_hwol_tx_l4_csum(pkt)) {
 if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
 th->tcp_csum = packet_csum_upperlayer6(nh6, th, ctx->key.nw_proto,
dp_packet_l4_size(pkt));
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 3dc582fbf..ee8451496 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -965,7 +965,7 @@ dp_packet_hwol_l4_mask(const struct dp_packet *b)
 
 /* Return true if the packet 'b' requested L4 checksum offload. */
 static inline bool
-dp_packet_hwol_tx_l4_checksum(const struct dp_packet *b)
+dp_packet_hwol_tx_l4_csum(const struct dp_packet *b)
 {
 return !!dp_packet_hwol_l4_mask(b);
 }
@@ -1056,28 +1056,28 @@ dp_packet_hwol_set_tcp_seg(struct dp_packet *b)
 }
 
 static inline bool
-dp_packet_ip_checksum_valid(const struct dp_packet *p)
+dp_packet_ip_csum_valid(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_IP_CKSUM_MASK) ==
 DP_PACKET_OL_RX_IP_CKSUM_GOOD;
 }
 
 static inline bool
-dp_packet_ip_checksum_bad(const struct dp_packet *p)
+dp_packet_ip_csum_bad(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_IP_CKSUM_MASK) ==
 DP_PACKET_OL_RX_IP_CKSUM_BAD;
 }
 
 static inline bool
-dp_packet_l4_checksum_valid(const struct dp_packet *p)
+dp_packet_l4_csum_valid(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_L4_CKSUM_MASK) ==
 DP_PACKET_OL_RX_L4_CKSUM_GOOD;
 }
 
 static inline bool
-dp_packet_l4_checksum_bad(const struct dp_packet *p)
+dp_packet_l4_csum_bad(const struct dp_packet *p)
 {
 return (*dp_packet_ol_flags_ptr(p) & DP_PACKET_OL_RX_L4_CKSUM_MASK) ==
 DP_PACKET_OL_RX_L4_CKSUM_BAD;
diff --git a/lib/ipf.c b/lib/ipf.c
index 507db2aea..013c4cfba 100644
--- a/lib/ipf.c
+++ b/lib/ipf.c
@@ -574,7 +574,7 @@ ipf_list_state_transition(struct ipf *ipf, struct ipf_list 
*ipf_list,
 static bool
 ipf_is_valid_v4_frag(struct ipf *ipf, struct dp_packet *pkt)
 {
-if (OVS_UNLIKELY(dp_packet_ip_checksum_bad(pkt))) {
+if (OVS_UNLIKELY(dp_packet_ip_csum_bad(pkt))) {
 COVERAGE_INC(ipf_l3csum_err);
 goto invalid_pkt;
 }
@@ -608,7 +608,7 @@ ipf_is_valid_v4_frag(struct ipf *ipf, struct dp_packet *pkt)
 goto invalid_pkt;
 }
 
-if (OVS_UNLIKELY(!dp_packet_ip_checksum_valid(pkt)
+if (OVS_UNLIKELY(!dp_packet_ip_csum_valid(pkt)
  && !dp_packet_hwol_is_ipv4(pkt)
  && csum(l3, ip_hdr_len) != 0)) {
 COVERAGE_INC(ipf_l3csum_err);
diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index b89dfdd52..2de424105 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -88,7 +88,7 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet, struct

[ovs-dev] [[PATCH RFC] 00/17] Enable TSO in userspace by default.

2021-12-07 Thread Flavio Leitner

This patch series is at RFC stage, though some of the renaming
changes could go in independently of the rest of the series.

The goal is to enable NIC csum and segmentation offloading by
default in OVS userspace data path with and without DPDK.
Other Linux software devices like tap (br) or socket (veth)
netdevs are supported.

The performance depends on the use case. For example, if OVS
is just forwarding between two ports, then checksum offloading
doesn't offer any gains. However, with more complex flow tables,
like the ones generated by OVN for example, where packets are
changed, then checksum offloading can improve performance.

The TCP Segmentation Offload (TSO) helps regardless of the flow
tables because instead of sending many frames of MSS size, OVS
can process one big packet. This improves throughput performance
in some cases up to 6x.

A brief documentation is added to provide details on how this
is supposed to work.

A segmentation implementation is provided (untested) to see
if the approach is good enough. Some of the challenges are
in the commit message.

The patch series currently misses the knobs to control each
feature per port. Is that desired?


Flavio Leitner (17):
  Rename checksum to csum in hwol functions.
  Rename flags with CKSUM to CSUM.
  Prefix netdev offload flags with NETDEV_OFFLOAD_.
  Rename hwol csum valid to good.
  Rename dp_packet_hwol to dp_packet_ol.
  dp-packet: Use p for packet and b for batch.
  dp-packet: Rename dp_packet_ol_tcp_seg
  dp-packet: Rename dp_packet_ol_is_ipv4.
  dp-packet: Rename dp_packet_ol l4 functions.
  dp-packet: Add _ol_ to functions using OL flags.
  Document netdev offload.
  Show netdev offloading flags.
  Enable IP checksum offloading by default.
  Enable L4 csum offloading by default.
  Respect tso/gso segment size.
  Add Generic Segmentation Offloading.
  Enable TSO if available.

 Documentation/automake.mk|   1 +
 Documentation/topics/index.rst   |   1 +
 Documentation/topics/nic-offloads.rst|  95 +++
 Documentation/topics/userspace-tso.rst   |  12 -
 lib/automake.mk  |   4 +-
 lib/conntrack.c  |  28 +-
 lib/dp-packet-gso.c  | 153 +
 lib/{userspace-tso.h => dp-packet-gso.h} |  13 +-
 lib/dp-packet.c  | 378 ++--
 lib/dp-packet.h  | 720 ++-
 lib/dpif-netdev-unixctl.man  |   5 +
 lib/dpif-netdev.c|  58 ++
 lib/dpif.h   |   2 +-
 lib/flow.c   |  37 +-
 lib/ipf.c|  13 +-
 lib/netdev-dpdk.c| 288 +
 lib/netdev-dummy.c   |  21 +
 lib/netdev-linux.c   | 430 +++---
 lib/netdev-native-tnl.c  |  53 +-
 lib/netdev-provider.h|  13 +-
 lib/netdev.c | 183 --
 lib/odp-execute.c|  21 +-
 lib/packets.c| 210 +--
 lib/packets.h|   3 +
 lib/userspace-tso.c  |  48 --
 ofproto/ofproto-dpif-upcall.c|  14 +-
 tests/automake.mk|   1 +
 tests/dpif-netdev.at |  21 +
 tests/system-userspace-offload.at|  79 +++
 tests/system-userspace-testsuite.at  |   1 +
 vswitchd/bridge.c|   2 -
 vswitchd/vswitch.xml |  20 -
 32 files changed, 1858 insertions(+), 1070 deletions(-)
 create mode 100644 Documentation/topics/nic-offloads.rst
 create mode 100644 lib/dp-packet-gso.c
 rename lib/{userspace-tso.h => dp-packet-gso.h} (67%)
 delete mode 100644 lib/userspace-tso.c
 create mode 100644 tests/system-userspace-offload.at

-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn 0/3] Support mixing stateless and stateful ACLs regardless of their priority

2021-12-07 Thread Vladislav Odintsov

Talking about patches 1 and 2 - they've got totally no negative impact, 
it's an optimization for HW VTEP scenario - I'd like them to be included 
as a part of 21.12.


For patch #3 there is absolutely no affect for users who use either only 
stateless ACLs or only stateful.


For users, who do mix of allow-stateless and allow-related rules it's a 
_possible_ affect, only if priority for allow-related rules is higher, 
than for allow-stateless AND these rules have overlapping meaning. It's 
worth to mention, that if somebody has such rules installed now, they 
don't work as the could be treated.


Let me know your thoughts.

regards,

Vladislav Odintsov

On 07.12.2021 19:14, Numan Siddique wrote:

On Tue, Dec 7, 2021 at 3:58 AM Vladislav Odintsov  wrote:

On 01.12.2021 15:56, Vladislav Odintsov wrote:

Currently if user has a stateless and statetul ACLs (allow-stateless and
allow-related) in one port group or in one logical switch simultaneously,
the stateless rules whould take precedence.
This patch series adds support for mixing all the ACLs types with the
respect to their priority.
This change requires next:

Also, as an optimisation, traffic from HW VTEP switch in ingress datapath
is passed from ls_in_l2_sec directly to ls_in_l2_lkup, as it doesn't need
any processing in ingress pipeline except determining outport in
ls_in_l2_lkup table.

Vladislav Odintsov (3):
Revert "northd: support HW VTEP with stateful datapath"
northd: send ingress packets from HW VTEP directly to L2_LKUP table
northd: support mix of stateless ACL with lower priority than stateful

   northd/northd.c | 113 ++--
   northd/ovn-northd.8.xml |  35 -
   northd/ovn_northd.dl|  47 +
   tests/ovn-northd.at |  50 ++
   4 files changed, 114 insertions(+), 131 deletions(-)


Hi Numan,

is is possible to plan this series to be included in 21.12?

Hi Vladislav,

I was thinking of considering them after branching.  Do you need these
patches for 21.12 ?
I'm trying to understand the risk factor ? Are these patches risky at
this time or will not affect other users who don't use this scenario ?

If it is risk free,  +1 from me for 21.12.

Thanks
Numan


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] OvS+OVN '21 Fall Conference - Day 1 joining information

2021-12-07 Thread Aaron Conole

Greetings,

The fall conference will be starting at 17:00 UTC.  In case the
joining information didn't come through via the EventBrite email, or in
case you lost the joining information, here are the instructions for
joining Day 1.

NOTE: Day 2 will use a different event link.

To join, select from the following options:

1) Web Browser

a) link: https://primetime.bluejeans.com/a2m/live-event/hhykuuqb

b) Choice of client:

  i.   Google Chrome - This will work for sure

  ii.  Chromium - This will work if the DRM plugins are installed.  On
   Fedora, for example, the *chromium-freeworld* RPM will work, but
   not ordinary *chromium*

  iii. Chromium-based browsers such as Brave and Microsoft Edge should
   work, as long as they support DRM.  These have not been tested.

  iv.  Firefox cannot be used as a presenter or moderator.  If DRM
   plugins are installed, it does work as an attendee.

  v.   We have been told that Safari does not work for attendees, but
   have not tested.

  vi.  BlueJeans has an app.  This does appear to work.


2) Room System

a) Dial: bjn.vc or 104.238.247.247 in the room system.

b) Enter Meeting ID: 615135144 and Passcode: 7123


3) Joining via a mobile device?

a) Open this link : https://primetime.bluejeans.com/a2m/live-event/hhykuuqb

b) Download the app if you don’t have it already.

c) Enter event ID : hhykuuqb

4) Phone

Dial one of the following numbers, enter the participant PIN followed by # to 
confirm:

+1 (415) 466-7000 (US)

PIN 5242440 #

+1 (760) 699-0393 (US)

PIN 6041806229 #

Joining from outside the US?

https://www.bluejeans.com/numbers/primetime-attendees/event?id=hhykuuqb

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn 0/3] Support mixing stateless and stateful ACLs regardless of their priority

2021-12-07 Thread Numan Siddique

On Tue, Dec 7, 2021 at 3:58 AM Vladislav Odintsov  wrote:
>
> On 01.12.2021 15:56, Vladislav Odintsov wrote:
> > Currently if user has a stateless and statetul ACLs (allow-stateless and
> > allow-related) in one port group or in one logical switch simultaneously,
> > the stateless rules whould take precedence.
> > This patch series adds support for mixing all the ACLs types with the
> > respect to their priority.
> > This change requires next:
> >
> > Also, as an optimisation, traffic from HW VTEP switch in ingress datapath
> > is passed from ls_in_l2_sec directly to ls_in_l2_lkup, as it doesn't need
> > any processing in ingress pipeline except determining outport in
> > ls_in_l2_lkup table.
> >
> > Vladislav Odintsov (3):
> >Revert "northd: support HW VTEP with stateful datapath"
> >northd: send ingress packets from HW VTEP directly to L2_LKUP table
> >northd: support mix of stateless ACL with lower priority than stateful
> >
> >   northd/northd.c | 113 ++--
> >   northd/ovn-northd.8.xml |  35 -
> >   northd/ovn_northd.dl|  47 +
> >   tests/ovn-northd.at |  50 ++
> >   4 files changed, 114 insertions(+), 131 deletions(-)
> >
> Hi Numan,
>
> is is possible to plan this series to be included in 21.12?

Hi Vladislav,

I was thinking of considering them after branching.  Do you need these
patches for 21.12 ?
I'm trying to understand the risk factor ? Are these patches risky at
this time or will not affect other users who don't use this scenario ?

If it is risk free,  +1 from me for 21.12.

Thanks
Numan

>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [0/2] extend nlmon functionality to support TC flowers monitoring

2021-12-07 Thread Ilya Maximets

On 12/7/21 16:36, Mohammad Heib wrote:
> Hi Ilya,
> 
> Thank you for reviewing the patches.
> 
> i fully agree that is a code duplication but I wasn't sure if I can
> include those files in nlmon code since those are kernel-headers files.
> so I sent an email to OVS-DISCUSS mailing list but didn't get a helpful 
> answer:
> email link:
> https://www.mail-archive.com/ovs-discuss@openvswitch.org/msg08484.html 
> 
> 
> so i decided to submit an initial version of my implementation anyway just to 
> get feedback:)
> and if you think it's a good approach to include one of those files into 
> nlmon? i  will definitely remove all duplication and use the existing code.

include/linux/pkt_cls.h is a file in the OVS's source tree, so there
is no reason to not use it.  If the actual kernel header available and
it is new enough, actual kernel header will be included instead by the
include_next directive.

You may also try to export and re-use some structures/functions from
the lib/tc.c.

> 
> Thanks,
> 
> 
> On Tue, Dec 7, 2021 at 4:07 PM Ilya Maximets  > wrote:
> 
> On 11/22/21 00:39, Mohammad Heib wrote:
> > These two patches aim to make nlmon tool more generic
> > and extend its functionality to support TC flowers/actions
> > monitoring and parsing.
> >
> > This change will improve the visibility of the communication
> > between the OVS and the TC subsystem and can be used for debugging
> > and testing OVS HW offload communication with TC.
> >
> > The patches added basic support for capturing and parsing TC flower
> > create/replace/changes Netlink messages and print those messages after
> > parsing them, example:
> >
> >  $ nlmon -l info -t tc
> >   filter ifindex 10 nsid local protocol 0x806 pref 49148 flower chain 0 
> handle 0x1
> >     eth_type:arp
> >     filter-flags:[not-in-hw]
> >
> >   filter ifindex 10 nsid local protocol 0x806 pref 49147 flower chain 0 
> handle 0x1
> >     dst_mac:10:11:12:13:ff:ff
> >     src_mac:12:13:14:15:16:17
> >     eth_type:arp
> >     filter-flags:[not-in-hw]
> >
> > Also,
> > The first patch adds support for setting the nlmon log level from the 
> CLI.
> >
> > This change is backward compatible and apps that use nlmon
> > can still use it without any change required.
> >
> > Mohammad Heib (2):
> >   utilities/nlmon: extend nlmon design to handle more groups
> >   utilities/nlmon: Add TC flower monitoring support
> >
> >  utilities/nlmon.c | 575 +-
> >  utilities/nlmon.h | 162 +
> >  2 files changed, 727 insertions(+), 10 deletions(-)
> >  create mode 100644 utilities/nlmon.h
> >
> 
> Hi, Mohammad.
> 
> CI fails to build this patch set.  Please, re-check.
> 
> On a brief look through the patches I see that you're adding a lot of 
> things into
> nlmon.{c,h}, which are already defined in include/linux/pkt_cls.h or 
> lib/tc.{c,h}.
> It should be possible to re-use most of them instead of duplicating.
> 
> Best regards, Ilya Maximets.
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v4] utilities: Add another GDB macro for ovs-vswitchd

2021-12-07 Thread Eelco Chaudron




On 7 Dec 2021, at 16:31, Ilya Maximets wrote:

> On 12/7/21 15:34, Eelco Chaudron wrote:
>>
>>
>> On 7 Dec 2021, at 15:17, Mike Pattrick wrote:
>>
>>> On Tue, Dec 7, 2021 at 8:54 AM Ilya Maximets  wrote:

 On 11/19/21 15:35, Mike Pattrick wrote:
> This commit adds a basic packet metadata macro to the already existing
> macros in ovs_gdb.py, ovs_dump_packets will print out information about
> one or more packets. It feeds packets into tcpdump, and the user can
> pass in tcpdump options to modify how packets are parsed or even write
> out packets to a pcap file.
>
> Example usage:
> (gdb) break fast_path_processing
> (gdb) commands
>  ovs_dump_packets packets_
>  continue
>  end
> (gdb) continue
>
> Thread 1 "ovs-vswitchd" hit Breakpoint 2, fast_path_processing ...
> 12:01:05.962485 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 
> 10.1.1.1 tell 10.1.1.2, length 28
> Thread 1 "ovs-vswitchd" hit Breakpoint 1, fast_path_processing ...
> 12:01:05.981214 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.1.1.1 is-at 
> a6:0f:c3:f0:5f:bd (oui Unknown), length 28

 Hey, Mike.  Thanks for working on this!

 Could you, please, wrap some lines here in the commit message and
 in the code?

 I mean, I think, that ideally we need to do this:

 diff --git a/utilities/automake.mk b/utilities/automake.mk
 index e2e22c39a..7808c0ead 100644
 --- a/utilities/automake.mk
 +++ b/utilities/automake.mk
 @@ -126,6 +126,7 @@ endif

  FLAKE8_PYFILES += utilities/ovs-pcap.in \
 utilities/checkpatch.py utilities/ovs-dev.py \
 +   utilities/gdb/ovs_gdb.py \
 utilities/ovs-check-dead-ifs.in \
 utilities/ovs-tcpdump.in \
 utilities/ovs-pipegen.py
 ---

 But the file currently has some pep8 issues, so we can't.
 At least, we can try to not add more warnings in a new code.

 What do you think?
>>>
>>> The change to make it fully PEP8 would be pretty small, I think only
>>> long lines remain. I'll resubmit with that taken care of.
>>
>> I think the long lines in the python script should remain as those are 
>> console outputs which are displayed as part of the GDB help, and wrapping 
>> them does not make it look good inside GDB.
>
> There should be a way to wrap most of them without harming the looks.
> I mean, in some examples where lots of addresses is printed, it's
> enough to shorten these addresses to meet the line length limit.
> E.g. by keeping only 4 hex digits and, probably, changing them to
> something like 0x, 0x.
>
> Usage examples could probably be moved to the next line from the
> 'Usage:' line and maybe split by the arguments to several lines.

Sounds good to me, as long as the output still makes sense.
Maybe do this for the existing output as well so the file is clean ;)

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [0/2] extend nlmon functionality to support TC flowers monitoring

2021-12-07 Thread Mohammad Heib

Hi Ilya,

Thank you for reviewing the patches.

i fully agree that is a code duplication but I wasn't sure if I can
include those files in nlmon code since those are kernel-headers files.
so I sent an email to OVS-DISCUSS mailing list but didn't get a
helpful answer:
email link:
https://www.mail-archive.com/ovs-discuss@openvswitch.org/msg08484.html

so i decided to submit an initial version of my implementation anyway just
to get feedback:)
and if you think it's a good approach to include one of those files into
nlmon? i  will definitely remove all duplication and use the existing code.

Thanks,


On Tue, Dec 7, 2021 at 4:07 PM Ilya Maximets  wrote:

> On 11/22/21 00:39, Mohammad Heib wrote:
> > These two patches aim to make nlmon tool more generic
> > and extend its functionality to support TC flowers/actions
> > monitoring and parsing.
> >
> > This change will improve the visibility of the communication
> > between the OVS and the TC subsystem and can be used for debugging
> > and testing OVS HW offload communication with TC.
> >
> > The patches added basic support for capturing and parsing TC flower
> > create/replace/changes Netlink messages and print those messages after
> > parsing them, example:
> >
> >  $ nlmon -l info -t tc
> >   filter ifindex 10 nsid local protocol 0x806 pref 49148 flower chain 0
> handle 0x1
> > eth_type:arp
> > filter-flags:[not-in-hw]
> >
> >   filter ifindex 10 nsid local protocol 0x806 pref 49147 flower chain 0
> handle 0x1
> > dst_mac:10:11:12:13:ff:ff
> > src_mac:12:13:14:15:16:17
> > eth_type:arp
> > filter-flags:[not-in-hw]
> >
> > Also,
> > The first patch adds support for setting the nlmon log level from the
> CLI.
> >
> > This change is backward compatible and apps that use nlmon
> > can still use it without any change required.
> >
> > Mohammad Heib (2):
> >   utilities/nlmon: extend nlmon design to handle more groups
> >   utilities/nlmon: Add TC flower monitoring support
> >
> >  utilities/nlmon.c | 575 +-
> >  utilities/nlmon.h | 162 +
> >  2 files changed, 727 insertions(+), 10 deletions(-)
> >  create mode 100644 utilities/nlmon.h
> >
>
> Hi, Mohammad.
>
> CI fails to build this patch set.  Please, re-check.
>
> On a brief look through the patches I see that you're adding a lot of
> things into
> nlmon.{c,h}, which are already defined in include/linux/pkt_cls.h or
> lib/tc.{c,h}.
> It should be possible to re-use most of them instead of duplicating.
>
> Best regards, Ilya Maximets.
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v4] utilities: Add another GDB macro for ovs-vswitchd

2021-12-07 Thread Ilya Maximets

On 12/7/21 15:34, Eelco Chaudron wrote:
> 
> 
> On 7 Dec 2021, at 15:17, Mike Pattrick wrote:
> 
>> On Tue, Dec 7, 2021 at 8:54 AM Ilya Maximets  wrote:
>>>
>>> On 11/19/21 15:35, Mike Pattrick wrote:
 This commit adds a basic packet metadata macro to the already existing
 macros in ovs_gdb.py, ovs_dump_packets will print out information about
 one or more packets. It feeds packets into tcpdump, and the user can
 pass in tcpdump options to modify how packets are parsed or even write
 out packets to a pcap file.

 Example usage:
 (gdb) break fast_path_processing
 (gdb) commands
  ovs_dump_packets packets_
  continue
  end
 (gdb) continue

 Thread 1 "ovs-vswitchd" hit Breakpoint 2, fast_path_processing ...
 12:01:05.962485 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 
 10.1.1.1 tell 10.1.1.2, length 28
 Thread 1 "ovs-vswitchd" hit Breakpoint 1, fast_path_processing ...
 12:01:05.981214 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.1.1.1 is-at 
 a6:0f:c3:f0:5f:bd (oui Unknown), length 28
>>>
>>> Hey, Mike.  Thanks for working on this!
>>>
>>> Could you, please, wrap some lines here in the commit message and
>>> in the code?
>>>
>>> I mean, I think, that ideally we need to do this:
>>>
>>> diff --git a/utilities/automake.mk b/utilities/automake.mk
>>> index e2e22c39a..7808c0ead 100644
>>> --- a/utilities/automake.mk
>>> +++ b/utilities/automake.mk
>>> @@ -126,6 +126,7 @@ endif
>>>
>>>  FLAKE8_PYFILES += utilities/ovs-pcap.in \
>>> utilities/checkpatch.py utilities/ovs-dev.py \
>>> +   utilities/gdb/ovs_gdb.py \
>>> utilities/ovs-check-dead-ifs.in \
>>> utilities/ovs-tcpdump.in \
>>> utilities/ovs-pipegen.py
>>> ---
>>>
>>> But the file currently has some pep8 issues, so we can't.
>>> At least, we can try to not add more warnings in a new code.
>>>
>>> What do you think?
>>
>> The change to make it fully PEP8 would be pretty small, I think only
>> long lines remain. I'll resubmit with that taken care of.
> 
> I think the long lines in the python script should remain as those are 
> console outputs which are displayed as part of the GDB help, and wrapping 
> them does not make it look good inside GDB.

There should be a way to wrap most of them without harming the looks.
I mean, in some examples where lots of addresses is printed, it's
enough to shorten these addresses to meet the line length limit.
E.g. by keeping only 4 hex digits and, probably, changing them to
something like 0x, 0x.

Usage examples could probably be moved to the next line from the
'Usage:' line and maybe split by the arguments to several lines.

> 
> This was one of my previous comments on an earlier patchset.
> 
> //Eelco
> 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3] netdev-vport : Fix userspace tunnel ioctl(SIOCGIFINDEX) info logs.

2021-12-07 Thread Ilya Maximets

On 11/24/21 14:32, lin huang wrote:
> From: linhuang 
> 
> Userspace tunnel doesn't have a valid device in the kernel. So
> get_ifindex() function (ioctl) always get error during
> adding a port, deleting a port or updating a port status.
> 
> The info log is
> "2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX)
> on vxlan_sys_4789 device failed: No such device"
> 
> If there are a lot of userspace tunnel ports on a bridge, the
> iface_refresh_netdev_status() function will spend a lot of time.
> 
> So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just
> return -ENODEV.
> 
> Signed-off-by: Lin Huang 
> Test-by: Mike Pattrick 
> Reviewed-by: Aaron Conole 
> Reviewed-by: Ilya Maximets 

Hello.  Thanks for v3.  And sorry for delays.

One comment about the commit message: please, don't add tags
that wasn't actually provided in previous reviews.  Also,
patch changed noticeably between versions, so tags can not
be preserved in this case.

Some code comments inline.

> ---
>  lib/netdev-vport.c | 4 +++-
>  vswitchd/bridge.c  | 2 ++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> index 499c029..f0ff02b 100644
> --- a/lib/netdev-vport.c
> +++ b/lib/netdev-vport.c
> @@ -1151,8 +1151,10 @@ netdev_vport_get_ifindex(const struct netdev *netdev_)
>  {
>  char buf[NETDEV_VPORT_NAME_BUFSIZE];
>  const char *name = netdev_vport_get_dpif_port(netdev_, buf, sizeof(buf));
> +const char *dpif_type = netdev_get_dpif_type(netdev_);
> 
> -return linux_get_ifindex(name);
> +return (dpif_type && !strcmp(dpif_type, "system")
> +? linux_get_ifindex(name) : -ENODEV);

The operator precedence seems tricky here and hard to understand.
Another problem is that if dpif_type is not defined, we should
default to executing linux_get_ifindex() instead of returning
an error.

Suggesting to re-write as an 'if' condition like this:

if (dpif_type && strcmp(dpif_type, "system")) {
/* Not a system device. */
return -ENODEV;
}

return linux_get_ifindex(name);

What do you think?

Another option is to replace the 'dpif_type' NULL check with the
ovs_assert(dpif_type), because we're not expecting it to be NULL
with this change.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v4] utilities: Add another GDB macro for ovs-vswitchd

2021-12-07 Thread Eelco Chaudron




On 7 Dec 2021, at 15:17, Mike Pattrick wrote:

> On Tue, Dec 7, 2021 at 8:54 AM Ilya Maximets  wrote:
>>
>> On 11/19/21 15:35, Mike Pattrick wrote:
>>> This commit adds a basic packet metadata macro to the already existing
>>> macros in ovs_gdb.py, ovs_dump_packets will print out information about
>>> one or more packets. It feeds packets into tcpdump, and the user can
>>> pass in tcpdump options to modify how packets are parsed or even write
>>> out packets to a pcap file.
>>>
>>> Example usage:
>>> (gdb) break fast_path_processing
>>> (gdb) commands
>>>  ovs_dump_packets packets_
>>>  continue
>>>  end
>>> (gdb) continue
>>>
>>> Thread 1 "ovs-vswitchd" hit Breakpoint 2, fast_path_processing ...
>>> 12:01:05.962485 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 
>>> 10.1.1.1 tell 10.1.1.2, length 28
>>> Thread 1 "ovs-vswitchd" hit Breakpoint 1, fast_path_processing ...
>>> 12:01:05.981214 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.1.1.1 is-at 
>>> a6:0f:c3:f0:5f:bd (oui Unknown), length 28
>>
>> Hey, Mike.  Thanks for working on this!
>>
>> Could you, please, wrap some lines here in the commit message and
>> in the code?
>>
>> I mean, I think, that ideally we need to do this:
>>
>> diff --git a/utilities/automake.mk b/utilities/automake.mk
>> index e2e22c39a..7808c0ead 100644
>> --- a/utilities/automake.mk
>> +++ b/utilities/automake.mk
>> @@ -126,6 +126,7 @@ endif
>>
>>  FLAKE8_PYFILES += utilities/ovs-pcap.in \
>> utilities/checkpatch.py utilities/ovs-dev.py \
>> +   utilities/gdb/ovs_gdb.py \
>> utilities/ovs-check-dead-ifs.in \
>> utilities/ovs-tcpdump.in \
>> utilities/ovs-pipegen.py
>> ---
>>
>> But the file currently has some pep8 issues, so we can't.
>> At least, we can try to not add more warnings in a new code.
>>
>> What do you think?
>
> The change to make it fully PEP8 would be pretty small, I think only
> long lines remain. I'll resubmit with that taken care of.

I think the long lines in the python script should remain as those are console 
outputs which are displayed as part of the GDB help, and wrapping them does not 
make it look good inside GDB.

This was one of my previous comments on an earlier patchset.

//Eelco


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/2] alb.at: Increase time/warp.

2021-12-07 Thread Ilya Maximets

On 12/6/21 18:11, David Marchand wrote:
> On Tue, Nov 23, 2021 at 3:01 PM Kevin Traynor  wrote:
>>
>> It seems that on slow system with high concurrency and cpu contention
>> time/warp is not accurate enough for the ALB unit tests with the minimum
>> time/warp that was used to hit an amount of events. This results in some
>> intermittent test failures.
>>
>> As those tests are just waiting for a certain amount of events to occur
>> and there is no functional change during that time let's do the time/warp
>> again with higher values.
>>
>> With this no failures are seen in several hundred runs.
>>
>> Fixes: a83a406096e9 ("dpif-netdev: Sync PMD ALB state with user commands.")
>> Reported-by: Ilya Maximets 
> 
> Fwiw, I managed to reproduce with below commands (test failed in 7
> runs out of 10 on my laptop before patch).
> 
> In separate terminals:
> $ taskset -c 3 sh -c 'while true; do true; done'
> $ taskset -c 3 make -C master check TESTSUITEFLAGS="-d 1026"
> 
>> Signed-off-by: Kevin Traynor 
> 
> Reviewed-by: David Marchand 
> 
> I let the test run ~50 times, no issue with patch.

Thanks, Kevin and David!  Applied.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v4] utilities: Add another GDB macro for ovs-vswitchd

2021-12-07 Thread Mike Pattrick

On Tue, Dec 7, 2021 at 8:54 AM Ilya Maximets  wrote:
>
> On 11/19/21 15:35, Mike Pattrick wrote:
> > This commit adds a basic packet metadata macro to the already existing
> > macros in ovs_gdb.py, ovs_dump_packets will print out information about
> > one or more packets. It feeds packets into tcpdump, and the user can
> > pass in tcpdump options to modify how packets are parsed or even write
> > out packets to a pcap file.
> >
> > Example usage:
> > (gdb) break fast_path_processing
> > (gdb) commands
> >  ovs_dump_packets packets_
> >  continue
> >  end
> > (gdb) continue
> >
> > Thread 1 "ovs-vswitchd" hit Breakpoint 2, fast_path_processing ...
> > 12:01:05.962485 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 
> > 10.1.1.1 tell 10.1.1.2, length 28
> > Thread 1 "ovs-vswitchd" hit Breakpoint 1, fast_path_processing ...
> > 12:01:05.981214 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.1.1.1 is-at 
> > a6:0f:c3:f0:5f:bd (oui Unknown), length 28
>
> Hey, Mike.  Thanks for working on this!
>
> Could you, please, wrap some lines here in the commit message and
> in the code?
>
> I mean, I think, that ideally we need to do this:
>
> diff --git a/utilities/automake.mk b/utilities/automake.mk
> index e2e22c39a..7808c0ead 100644
> --- a/utilities/automake.mk
> +++ b/utilities/automake.mk
> @@ -126,6 +126,7 @@ endif
>
>  FLAKE8_PYFILES += utilities/ovs-pcap.in \
> utilities/checkpatch.py utilities/ovs-dev.py \
> +   utilities/gdb/ovs_gdb.py \
> utilities/ovs-check-dead-ifs.in \
> utilities/ovs-tcpdump.in \
> utilities/ovs-pipegen.py
> ---
>
> But the file currently has some pep8 issues, so we can't.
> At least, we can try to not add more warnings in a new code.
>
> What do you think?

The change to make it fully PEP8 would be pretty small, I think only
long lines remain. I'll resubmit with that taken care of.

-M

>
> Best regards, Ilya Maximets.
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [0/2] extend nlmon functionality to support TC flowers monitoring

2021-12-07 Thread Ilya Maximets

On 11/22/21 00:39, Mohammad Heib wrote:
> These two patches aim to make nlmon tool more generic
> and extend its functionality to support TC flowers/actions
> monitoring and parsing.
> 
> This change will improve the visibility of the communication
> between the OVS and the TC subsystem and can be used for debugging
> and testing OVS HW offload communication with TC.
> 
> The patches added basic support for capturing and parsing TC flower
> create/replace/changes Netlink messages and print those messages after
> parsing them, example:
> 
>  $ nlmon -l info -t tc
>   filter ifindex 10 nsid local protocol 0x806 pref 49148 flower chain 0 
> handle 0x1 
> eth_type:arp
> filter-flags:[not-in-hw]
> 
>   filter ifindex 10 nsid local protocol 0x806 pref 49147 flower chain 0 
> handle 0x1 
> dst_mac:10:11:12:13:ff:ff
> src_mac:12:13:14:15:16:17
> eth_type:arp
> filter-flags:[not-in-hw]
> 
> Also,
> The first patch adds support for setting the nlmon log level from the CLI.
> 
> This change is backward compatible and apps that use nlmon
> can still use it without any change required.
> 
> Mohammad Heib (2):
>   utilities/nlmon: extend nlmon design to handle more groups
>   utilities/nlmon: Add TC flower monitoring support
> 
>  utilities/nlmon.c | 575 +-
>  utilities/nlmon.h | 162 +
>  2 files changed, 727 insertions(+), 10 deletions(-)
>  create mode 100644 utilities/nlmon.h
> 

Hi, Mohammad.

CI fails to build this patch set.  Please, re-check.

On a brief look through the patches I see that you're adding a lot of things 
into
nlmon.{c,h}, which are already defined in include/linux/pkt_cls.h or 
lib/tc.{c,h}.
It should be possible to re-use most of them instead of duplicating.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v4] utilities: Add another GDB macro for ovs-vswitchd

2021-12-07 Thread Ilya Maximets

On 11/19/21 15:35, Mike Pattrick wrote:
> This commit adds a basic packet metadata macro to the already existing
> macros in ovs_gdb.py, ovs_dump_packets will print out information about
> one or more packets. It feeds packets into tcpdump, and the user can
> pass in tcpdump options to modify how packets are parsed or even write
> out packets to a pcap file.
> 
> Example usage:
> (gdb) break fast_path_processing
> (gdb) commands
>  ovs_dump_packets packets_
>  continue
>  end
> (gdb) continue
> 
> Thread 1 "ovs-vswitchd" hit Breakpoint 2, fast_path_processing ...
> 12:01:05.962485 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.1.1.1 
> tell 10.1.1.2, length 28
> Thread 1 "ovs-vswitchd" hit Breakpoint 1, fast_path_processing ...
> 12:01:05.981214 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.1.1.1 is-at 
> a6:0f:c3:f0:5f:bd (oui Unknown), length 28

Hey, Mike.  Thanks for working on this!

Could you, please, wrap some lines here in the commit message and
in the code?

I mean, I think, that ideally we need to do this:

diff --git a/utilities/automake.mk b/utilities/automake.mk
index e2e22c39a..7808c0ead 100644
--- a/utilities/automake.mk
+++ b/utilities/automake.mk
@@ -126,6 +126,7 @@ endif
 
 FLAKE8_PYFILES += utilities/ovs-pcap.in \
utilities/checkpatch.py utilities/ovs-dev.py \
+   utilities/gdb/ovs_gdb.py \
utilities/ovs-check-dead-ifs.in \
utilities/ovs-tcpdump.in \
utilities/ovs-pipegen.py
---

But the file currently has some pep8 issues, so we can't.
At least, we can try to not add more warnings in a new code.

What do you think?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v4 6/6] dpif-netdev/mfex: Avoid hashing when opt mfex called

2021-12-07 Thread Kumar Amber

This patch avoids calculating the software hash of the packet again
if the optimized miniflow-extract hit and has already calculated the
packet hash. In cases of scalar miniflow extract, the normal hashing
calculation is performed.

Signed-off-by: Kumar Amber 
---
 lib/dpif-netdev-avx512.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c
index 544d36903..2188abfd9 100644
--- a/lib/dpif-netdev-avx512.c
+++ b/lib/dpif-netdev-avx512.c
@@ -210,15 +210,15 @@ dp_netdev_input_outer_avx512(struct dp_netdev_pmd_thread 
*pmd,
 if (!mfex_hit) {
 /* Do a scalar miniflow extract into keys. */
 miniflow_extract(packet, >mf);
+key->len = netdev_flow_key_size(miniflow_n_values(>mf));
+key->hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet,
+ >mf);
 }
 
 /* Cache TCP and byte values for all packets. */
 pkt_meta[i].bytes = dp_packet_size(packet);
 pkt_meta[i].tcp_flags = miniflow_get_tcp_flags(>mf);
 
-key->len = netdev_flow_key_size(miniflow_n_values(>mf));
-key->hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet, >mf);
-
 if (emc_enabled) {
 f = emc_lookup(>emc_cache, key);
 
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v4 5/6] dpif-netdev/mfex: Add ipv6 profile based hashing

2021-12-07 Thread Kumar Amber

This commit adds IPv6 profile specific hashing which
uses fixed offsets into the packet to improve hashing
perforamnce.

Hash value is autovalidated by MFEX autovalidator.

Signed-off-by: Kumar Amber 
Signed-off-by: Harry van Haaren 
Co-authored-by: Harry van Haaren 

---
v4:
- Use pre-defined hash length values
v2:
- Fix check-patch sign-offs
---
 NEWS |  1 +
 lib/dpif-netdev-extract-avx512.c | 61 
 2 files changed, 62 insertions(+)

diff --git a/NEWS b/NEWS
index a99c4675b..47b34317f 100644
--- a/NEWS
+++ b/NEWS
@@ -25,6 +25,7 @@ Post-v2.16.0
  * Add AVX512 optimized profiles to miniflow extract for VLAN/IPv6/UDP
and VLAN/IPv6/TCP.
  * Add IPv4 profile based 5tuple hashing optimizations.
+ * Add IPv6 profile based 5tuple hashing optimizations.
 
 
 v2.16.0 - 16 Aug 2021
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index 1088744d0..db2650e05 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -361,6 +361,12 @@ enum MFEX_PROFILES {
 #define HASH_DT1Q_IPV4 \
 30, 34, 27, 38, 0, 0
 
+#define HASH_IPV6 \
+22, 30, 38, 46, 20, 54
+
+#define HASH_DT1Q_IPV6 \
+26, 34, 42, 50, 24, 58
+
 /* Static const instances of profiles. These are compile-time constants,
  * and are specialized into individual miniflow-extract functions.
  * NOTE: Order of the fields is significant, any change in the order must be
@@ -456,6 +462,9 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 0, UINT16_MAX, 14, 54,
 },
 .dp_pkt_min_size = 54,
+
+.hash_pkt_offs = { HASH_IPV6 },
+.hash_len = 96,
 },
 
 [PROFILE_ETH_IPV6_TCP] = {
@@ -470,6 +479,9 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 0, UINT16_MAX, 14, 54,
 },
 .dp_pkt_min_size = 54,
+
+.hash_pkt_offs = { HASH_IPV6 },
+.hash_len = 104,
 },
 
 [PROFILE_ETH_VLAN_IPV6_TCP] = {
@@ -486,6 +498,9 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 14, UINT16_MAX, 18, 58,
 },
 .dp_pkt_min_size = 66,
+
+.hash_pkt_offs = { HASH_DT1Q_IPV6 },
+.hash_len = 112,
 },
 
 [PROFILE_ETH_VLAN_IPV6_UDP] = {
@@ -502,6 +517,9 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 14, UINT16_MAX, 18, 58,
 },
 .dp_pkt_min_size = 66,
+
+.hash_pkt_offs = { HASH_DT1Q_IPV6 },
+.hash_len = 104,
 },
 };
 
@@ -580,6 +598,37 @@ mfex_5tuple_hash_ipv4(struct dp_packet *packet, const 
uint8_t *pkt,
 }
 }
 
+static inline void
+mfex_5tuple_hash_ipv6(struct dp_packet *packet, const uint8_t *pkt,
+  struct netdev_flow_key *key,
+  const uint8_t *pkt_offsets)
+{
+if (!dp_packet_rss_valid(packet)) {
+uint32_t hash = 0;
+void *ipv6_src_lo = (void *) [pkt_offsets[0]];
+void *ipv6_src_hi = (void *) [pkt_offsets[1]];
+void *ipv6_dst_lo = (void *) [pkt_offsets[2]];
+void *ipv6_dst_hi = (void *) [pkt_offsets[3]];
+void *ports_l4 = (void *) [pkt_offsets[5]];
+
+/* IPv6 Src and Dst. */
+hash = hash_add64(hash, *(uint64_t *) ipv6_src_lo);
+hash = hash_add64(hash, *(uint64_t *) ipv6_src_hi);
+hash = hash_add64(hash, *(uint64_t *) ipv6_dst_lo);
+hash = hash_add64(hash, *(uint64_t *) ipv6_dst_hi);
+/* IPv6 proto. */
+hash = hash_add(hash, pkt[pkt_offsets[4]]);
+/* L4 ports. */
+hash = hash_add(hash, *(uint32_t *) ports_l4);
+hash = hash_finish(hash, 42);
+
+dp_packet_set_rss_hash(packet, hash);
+key->hash = hash;
+} else {
+key->hash = dp_packet_get_rss_hash(packet);
+}
+}
+
 /* Protocol specific helper functions, for calculating offsets/lenghts. */
 static int32_t
 mfex_ipv4_set_l2_pad_size(struct dp_packet *pkt, struct ip_header *nh,
@@ -777,6 +826,9 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 /* Process UDP header. */
 mfex_handle_ipv6_l4((void *)[54], [9]);
 
+mfex_5tuple_hash_ipv6(packet, pkt, [i],
+  profile->hash_pkt_offs);
+keys[i].len = profile->hash_len;
 } break;
 
 case PROFILE_ETH_IPV6_TCP: {
@@ -794,6 +846,9 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 const struct tcp_header *tcp = (void *)[54];
 mfex_handle_tcp_flags(tcp, [9]);
 
+mfex_5tuple_hash_ipv6(packet, pkt, [i],
+  profile->hash_pkt_offs);
+keys[i].len = profile->hash_len;
 } break;
 
 case PROFILE_ETH_VLAN_IPV6_TCP: {
@@ -814,6 +869,9 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 const struct tcp_header *tcp = (void *)[58];

[ovs-dev] [PATCH v4 3/6] dpif-netdev/mfex: Add packet hash check to autovalidator

2021-12-07 Thread Kumar Amber

This patch adds the per profile AVX512 opt hashing to autovalidator
for validating the hash values against the scalar hash.

Signed-off-by: Kumar Amber 
---
 lib/dpif-netdev-private-extract.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/lib/dpif-netdev-private-extract.c 
b/lib/dpif-netdev-private-extract.c
index b3d96075c..263629903 100644
--- a/lib/dpif-netdev-private-extract.c
+++ b/lib/dpif-netdev-private-extract.c
@@ -303,6 +303,9 @@ dpif_miniflow_extract_autovalidator(struct dp_packet_batch 
*packets,
 DP_PACKET_BATCH_FOR_EACH (i, packet, packets) {
 pkt_metadata_init(>md, in_port);
 miniflow_extract(packet, [i].mf);
+keys[i].len = netdev_flow_key_size(miniflow_n_values([i].mf));
+keys[i].hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet,
+[i].mf);
 
 /* Store known good metadata to compare with optimized metadata. */
 good_l2_5_ofs[i] = packet->l2_5_ofs;
@@ -352,6 +355,15 @@ dpif_miniflow_extract_autovalidator(struct dp_packet_batch 
*packets,
 failed = 1;
 }
 
+/* Check hashes are equal. */
+if ((keys[i].hash != test_keys[i].hash) ||
+(keys[i].len != test_keys[i].len)) {
+ds_put_format(_msg, "Good hash: %d len: %d\tTest hash:%d"
+  " len:%d\n", keys[i].hash, keys[i].len,
+  test_keys[i].hash, test_keys[i].len);
+failed = 1;
+}
+
 if (!miniflow_equal([i].mf, _keys[i].mf)) {
 uint32_t block_cnt = miniflow_n_values([i].mf);
 ds_put_format(_msg, "Autovalidation blocks failed\n"
-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v4 2/6] dpif-netdev/mfex: Add AVX512 vlan ipv6 traffic profiles

2021-12-07 Thread Kumar Amber

Add AVX512 Ipv6 optimized profile for vlan/IPv6/UDP and
vlan/IPv6/TCP.

MFEX autovalidaton test-case already has the IPv6 support for
validating against the scalar mfex.

Signed-off-by: Kumar Amber 
Signed-off-by: Harry van Haaren 
Co-authored-by: Harry van Haaren 

---
v2:
- Fix check-patch sign-offs
---
 NEWS  |  2 +
 lib/dpif-netdev-extract-avx512.c  | 94 +++
 lib/dpif-netdev-private-extract.c | 23 
 lib/dpif-netdev-private-extract.h |  6 ++
 4 files changed, 125 insertions(+)

diff --git a/NEWS b/NEWS
index 7eb7eb341..7b2b675e3 100644
--- a/NEWS
+++ b/NEWS
@@ -22,6 +22,8 @@ Post-v2.16.0
- Userspace datapath:
  * Add AVX512 optimized profiles to miniflow extract for IPv6/UDP and
IPv6/TCP.
+ * Add AVX512 optimized profiles to miniflow extract for VLAN/IPv6/UDP
+   and VLAN/IPv6/TCP.
 
 
 v2.16.0 - 16 Aug 2021
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index 3384a8dba..11bca0144 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -214,6 +214,21 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, 
__m512i idx, __m512i a)
   38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, /* IPv6 */  \
   NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, /* Unused */
 
+/* VLAN (Dot1Q) patterns and masks. */
+#define PATTERN_DT1Q_MASK \
+  0x00, 0x00, 0xFF, 0xFF,
+#define PATTERN_DT1Q_IPV6 \
+  0x00, 0x00, 0x86, 0xDD,
+
+#define PATTERN_DT1Q_IPV6_SHUFFLE \
+  /* Ether (2 blocks): Note that *VLAN* type is written here. */  \
+  0,  1,  2,  3,  4,  5,  6,  7, 8,  9, 10, 11, 16, 17,  0,  0,   \
+  /* VLAN (1 block): Note that the *EtherHdr->Type* is written here. */   \
+  12, 13, 14, 15, 0, 0, 0, 0, \
+  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, /* IPv6 */  \
+  42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, /* IPv6 */  \
+  NU, NU, NU, NU, NU, NU, NU, NU, /* Unused */
+
 /* Generation of K-mask bitmask values, to zero out data in result. Note that
  * these correspond 1:1 to the above "*_SHUFFLE" values, and bit used must be
  * set in this K-mask, and "NU" values must be zero in the k-mask. Each mask
@@ -228,6 +243,8 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i 
idx, __m512i a)
 #define KMASK_TCP   0x0F00ULL
 #define KMASK_IPV6  0xULL
 #define KMASK_ETHER_IPV6 0x3FFFULL
+#define KMASK_DT1Q_IPV6  0xFF0FULL
+#define KMASK_IPV6_NOHDR 0x00FFULL
 
 #define PATTERN_IPV4_UDP_KMASK \
 (KMASK_ETHER | (KMASK_IPV4 << 16) | (KMASK_UDP << 32))
@@ -244,6 +261,10 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, 
__m512i idx, __m512i a)
 #define PATTERN_IPV6_KMASK \
 (KMASK_ETHER_IPV6 | (KMASK_IPV6 << 16) | (KMASK_IPV6 << 32))
 
+#define PATTERN_DT1Q_IPV6_KMASK \
+(KMASK_ETHER_IPV6 | (KMASK_DT1Q_IPV6 << 16) | (KMASK_IPV6 << 32) | \
+(KMASK_IPV6_NOHDR << 48))
+
 /* This union allows initializing static data as u8, but easily loading it
  * into AVX512 registers too. The union ensures proper alignment for the zmm.
  */
@@ -324,6 +345,8 @@ enum MFEX_PROFILES {
 PROFILE_ETH_VLAN_IPV4_TCP,
 PROFILE_ETH_IPV6_UDP,
 PROFILE_ETH_IPV6_TCP,
+PROFILE_ETH_VLAN_IPV6_TCP,
+PROFILE_ETH_VLAN_IPV6_UDP,
 PROFILE_COUNT,
 };
 
@@ -426,6 +449,37 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 .dp_pkt_min_size = 54,
 },
 
+[PROFILE_ETH_VLAN_IPV6_TCP] = {
+.probe_mask.u8_data = {
+PATTERN_ETHERTYPE_MASK PATTERN_DT1Q_MASK PATTERN_IPV6_MASK },
+.probe_data.u8_data = {
+PATTERN_ETHERTYPE_DT1Q PATTERN_DT1Q_IPV6 PATTERN_IPV6_TCP },
+
+.store_shuf.u8_data = { PATTERN_DT1Q_IPV6_SHUFFLE },
+.store_kmsk = PATTERN_DT1Q_IPV6_KMASK,
+
+.mf_bits = { 0x38a0, 0x0004443c},
+.dp_pkt_offs = {
+14, UINT16_MAX, 18, 58,
+},
+.dp_pkt_min_size = 66,
+},
+
+[PROFILE_ETH_VLAN_IPV6_UDP] = {
+.probe_mask.u8_data = {
+PATTERN_ETHERTYPE_MASK PATTERN_DT1Q_MASK PATTERN_IPV6_MASK },
+.probe_data.u8_data = {
+PATTERN_ETHERTYPE_DT1Q PATTERN_DT1Q_IPV6 PATTERN_IPV6_UDP },
+
+.store_shuf.u8_data = { PATTERN_DT1Q_IPV6_SHUFFLE },
+.store_kmsk = PATTERN_DT1Q_IPV6_KMASK,
+
+.mf_bits = { 0x38a0, 0x0004043c},
+.dp_pkt_offs = {
+14, UINT16_MAX, 18, 58,
+},
+.dp_pkt_min_size = 66,
+},
 };
 
 /* IPv6 header helper function to fix TC, flow label and next header. */
@@ -676,6 +730,44 @@ mfex_avx512_process(struct dp_packet_batch *packets,

[ovs-dev] [PATCH v4 4/6] dpif-netdev/mfex: Add ipv4 profile based hashing

2021-12-07 Thread Kumar Amber

This commit adds IPv4 profile specific hashing which
uses fixed offsets into the packet to improve hashing
perforamnce.

Signed-off-by: Kumar Amber 
Signed-off-by: Harry van Haaren 
Co-authored-by: Harry van Haaren 

---
v4:
- Use pre-defined hash length values
v3:
- Fix check-patch sign-offs
---
 NEWS |  1 +
 lib/dpif-netdev-extract-avx512.c | 65 
 2 files changed, 66 insertions(+)

diff --git a/NEWS b/NEWS
index 7b2b675e3..a99c4675b 100644
--- a/NEWS
+++ b/NEWS
@@ -24,6 +24,7 @@ Post-v2.16.0
IPv6/TCP.
  * Add AVX512 optimized profiles to miniflow extract for VLAN/IPv6/UDP
and VLAN/IPv6/TCP.
+ * Add IPv4 profile based 5tuple hashing optimizations.
 
 
 v2.16.0 - 16 Aug 2021
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index 11bca0144..1088744d0 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -297,6 +297,10 @@ struct mfex_profile {
 uint64_t mf_bits[FLOWMAP_UNITS];
 uint16_t dp_pkt_offs[4];
 uint16_t dp_pkt_min_size;
+
+/* Constant data offsets for Hashing. */
+uint8_t hash_pkt_offs[6];
+uint32_t hash_len;
 };
 
 /* Ensure dp_pkt_offs[4] is the correct size as in struct dp_packet. */
@@ -350,6 +354,13 @@ enum MFEX_PROFILES {
 PROFILE_COUNT,
 };
 
+/* Packet offsets for 5 tuple Hash function. */
+#define HASH_IPV4 \
+26, 30, 23, 34, 0, 0
+
+#define HASH_DT1Q_IPV4 \
+30, 34, 27, 38, 0, 0
+
 /* Static const instances of profiles. These are compile-time constants,
  * and are specialized into individual miniflow-extract functions.
  * NOTE: Order of the fields is significant, any change in the order must be
@@ -369,6 +380,9 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 0, UINT16_MAX, 14, 34,
 },
 .dp_pkt_min_size = 42,
+
+.hash_pkt_offs = { HASH_IPV4 },
+.hash_len = 72,
 },
 
 [PROFILE_ETH_IPV4_TCP] = {
@@ -383,6 +397,9 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 0, UINT16_MAX, 14, 34,
 },
 .dp_pkt_min_size = 54,
+
+.hash_pkt_offs = { HASH_IPV4 },
+.hash_len = 80,
 },
 
 [PROFILE_ETH_VLAN_IPV4_UDP] = {
@@ -401,6 +418,9 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 14, UINT16_MAX, 18, 38,
 },
 .dp_pkt_min_size = 46,
+
+.hash_pkt_offs = { HASH_DT1Q_IPV4 },
+.hash_len = 80,
 },
 
 [PROFILE_ETH_VLAN_IPV4_TCP] = {
@@ -419,6 +439,9 @@ static const struct mfex_profile 
mfex_profiles[PROFILE_COUNT] =
 14, UINT16_MAX, 18, 38,
 },
 .dp_pkt_min_size = 46,
+
+.hash_pkt_offs = { HASH_DT1Q_IPV4 },
+.hash_len = 88,
 },
 
 [PROFILE_ETH_IPV6_UDP] = {
@@ -530,6 +553,33 @@ mfex_ipv6_set_l2_pad_size(struct dp_packet *pkt,
 dp_packet_set_l2_pad_size(pkt, payload_size_ipv6 - p_len);
 }
 
+static inline void
+mfex_5tuple_hash_ipv4(struct dp_packet *packet, const uint8_t *pkt,
+  struct netdev_flow_key *key,
+  const uint8_t *pkt_offsets)
+{
+if (!dp_packet_rss_valid(packet)) {
+uint32_t hash = 0;
+void *ipv4_src = (void *) [pkt_offsets[0]];
+void *ipv4_dst = (void *) [pkt_offsets[1]];
+void *ports_l4 = (void *) [pkt_offsets[3]];
+
+/* IPv4 Src and Dst. */
+hash = hash_add(hash, *(uint32_t *) ipv4_src);
+hash = hash_add(hash, *(uint32_t *) ipv4_dst);
+/* IPv4 proto. */
+hash = hash_add(hash, pkt[pkt_offsets[2]]);
+/* L4 ports. */
+hash = hash_add(hash, *(uint32_t *) ports_l4);
+hash = hash_finish(hash, 42);
+
+dp_packet_set_rss_hash(packet, hash);
+key->hash = hash;
+} else {
+key->hash = dp_packet_get_rss_hash(packet);
+}
+}
+
 /* Protocol specific helper functions, for calculating offsets/lenghts. */
 static int32_t
 mfex_ipv4_set_l2_pad_size(struct dp_packet *pkt, struct ip_header *nh,
@@ -664,6 +714,10 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 /* Process TCP flags, and store to blocks. */
 const struct tcp_header *tcp = (void *)[38];
 mfex_handle_tcp_flags(tcp, [7]);
+
+mfex_5tuple_hash_ipv4(packet, pkt, [i],
+  profile->hash_pkt_offs);
+keys[i].len = profile->hash_len;
 } break;
 
 case PROFILE_ETH_VLAN_IPV4_UDP: {
@@ -674,6 +728,10 @@ mfex_avx512_process(struct dp_packet_batch *packets,
 if (mfex_ipv4_set_l2_pad_size(packet, nh, size_from_ipv4)) {
 continue;
 }
+
+mfex_5tuple_hash_ipv4(packet, pkt, [i],
+  profile->hash_pkt_offs);
+keys[i].len = profile->hash_len;
 } break;
 
 case PROFILE_ETH_IPV4_TCP:

[ovs-dev] [PATCH v4 1/6] dpif-netdev/mfex: Add AVX512 basic ipv6 traffic profiles

2021-12-07 Thread Kumar Amber

Add AVX512 IPv6 optimized profile for IPv6/UDP and
IPv6/TCP.

MFEX autovalidaton test-case already has the IPv6 support for
validating against the scalar mfex.

Signed-off-by: Kumar Amber 
Signed-off-by: Harry van Haaren 
Co-authored-by: Harry van Haaren 

---
v4:
- Rebase to master
v2:
- Fix CI build error
- Fix check-patch sign-offs
---
 NEWS  |   4 +
 lib/automake.mk   |   1 +
 lib/dpif-netdev-extract-avx512.c  | 140 +-
 lib/dpif-netdev-private-extract.c |  28 +-
 lib/dpif-netdev-private-extract.h |   6 ++
 5 files changed, 177 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index dce1aeb2b..7eb7eb341 100644
--- a/NEWS
+++ b/NEWS
@@ -19,6 +19,10 @@ Post-v2.16.0
- ovs-ofctl dump-flows no longer prints "igmp".  Instead the flag
  "ip,nw_proto=2" is used.
 
+   - Userspace datapath:
+ * Add AVX512 optimized profiles to miniflow extract for IPv6/UDP and
+   IPv6/TCP.
+
 
 v2.16.0 - 16 Aug 2021
 -
diff --git a/lib/automake.mk b/lib/automake.mk
index 46f869a33..eeb1fbadd 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -33,6 +33,7 @@ lib_libopenvswitchavx512_la_CFLAGS = \
-mavx512f \
-mavx512bw \
-mavx512dq \
+   -mavx512vl \
-mbmi \
-mbmi2 \
-fPIC \
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index ec64419e3..3384a8dba 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -49,6 +49,8 @@
 #include "dpif-netdev-private-extract.h"
 #include "dpif-netdev-private-flow.h"
 
+#define plen ip6_ctlun.ip6_un1.ip6_un1_plen
+
 /* AVX512-BW level permutex2var_epi8 emulation. */
 static inline __m512i
 __attribute__((target("avx512bw")))
@@ -137,6 +139,7 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i 
idx, __m512i a)
 #define PATTERN_ETHERTYPE_MASK PATTERN_ETHERTYPE_GEN(0xFF, 0xFF)
 #define PATTERN_ETHERTYPE_IPV4 PATTERN_ETHERTYPE_GEN(0x08, 0x00)
 #define PATTERN_ETHERTYPE_DT1Q PATTERN_ETHERTYPE_GEN(0x81, 0x00)
+#define PATTERN_ETHERTYPE_IPV6 PATTERN_ETHERTYPE_GEN(0x86, 0xDD)
 
 /* VLAN (Dot1Q) patterns and masks. */
 #define PATTERN_DT1Q_MASK   \
@@ -192,6 +195,25 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, 
__m512i idx, __m512i a)
   NU, NU, NU, NU, NU, NU, NU, NU, 38, 39, 40, 41, NU, NU, NU, NU, /* TCP */   \
   NU, NU, NU, NU, NU, NU, NU, NU, /* Unused. */
 
+/* Generator for checking IPv6 ver. */
+#define PATTERN_IPV6_GEN(VER_TRC, PROTO)  \
+  VER_TRC, /* Version: 4bits and Traffic class: 4bits. */ \
+  0, 0, 0, /* Traffic class: 4bits and Flow Label: 24bits. */ \
+  0, 0,/* Payload length 16bits. */   \
+  PROTO, 0,/* Next Header 8bits and Hop limit 8bits. */   \
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* Src IP: 128bits. */  \
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* Dst IP: 128bits. */
+
+#define PATTERN_IPV6_MASK PATTERN_IPV6_GEN(0xF0, 0xFF)
+#define PATTERN_IPV6_UDP PATTERN_IPV6_GEN(0x60, 0x11)
+#define PATTERN_IPV6_TCP PATTERN_IPV6_GEN(0x60, 0x06)
+
+#define PATTERN_IPV6_SHUFFLE  \
+   0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, NU, NU, /* Ether */ \
+  22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, /* IPv6 */  \
+  38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, /* IPv6 */  \
+  NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, NU, /* Unused */
+
 /* Generation of K-mask bitmask values, to zero out data in result. Note that
  * these correspond 1:1 to the above "*_SHUFFLE" values, and bit used must be
  * set in this K-mask, and "NU" values must be zero in the k-mask. Each mask
@@ -204,6 +226,8 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i 
idx, __m512i a)
 #define KMASK_IPV4  0xF0FFULL
 #define KMASK_UDP   0x000FULL
 #define KMASK_TCP   0x0F00ULL
+#define KMASK_IPV6  0xULL
+#define KMASK_ETHER_IPV6 0x3FFFULL
 
 #define PATTERN_IPV4_UDP_KMASK \
 (KMASK_ETHER | (KMASK_IPV4 << 16) | (KMASK_UDP << 32))
@@ -217,6 +241,9 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i 
idx, __m512i a)
 #define PATTERN_DT1Q_IPV4_TCP_KMASK \
 (KMASK_ETHER | (KMASK_DT1Q << 16) | (KMASK_IPV4 << 24) | (KMASK_TCP << 40))
 
+#define PATTERN_IPV6_KMASK \
+(KMASK_ETHER_IPV6 | (KMASK_IPV6 << 16) | (KMASK_IPV6 << 32))
+
 /* This union allows initializing static data as u8, but easily loading it
  * into AVX512 registers too. The union ensures proper alignment for the zmm.
  */
@@ -295,6 +322,8 @@ enum MFEX_PROFILES {
 PROFILE_ETH_IPV4_TCP,
 PROFILE_ETH_VLAN_IPV4_UDP,
 PROFILE_ETH_VLAN_IPV4_TCP,
+PROFILE_ETH_IPV6_UDP,
+PROFILE_ETH_IPV6_TCP,
 PROFILE_COUNT,
 };
 
@@ -368,8 +397,84 @@

[ovs-dev] [PATCH v4 0/6] MFEX Optimizations IPv6 + Hashing

2021-12-07 Thread Kumar Amber

---
v4:
- rebase to master.
- use static key lenghts for different packet types.
v3:
- rebase to master.
v2:
- fix the CI build.
- fix check-patch for co-author.
---

The patch-set introduces AVX512 optimizations of IPv6
traffic profiles and hashing improvements for all AVX512
supported traffic profiles for IPv4 and IPv6.

Kumar Amber (6):
  dpif-netdev/mfex: Add AVX512 basic ipv6 traffic profiles
  dpif-netdev/mfex: Add AVX512 vlan ipv6 traffic profiles
  dpif-netdev/mfex: Add packet hash check to autovalidator
  dpif-netdev/mfex: Add ipv4 profile based hashing
  dpif-netdev/mfex: Add ipv6 profile based hashing
  dpif-netdev/mfex: Avoid hashing when opt mfex called

 NEWS  |   8 +
 lib/automake.mk   |   1 +
 lib/dpif-netdev-avx512.c  |   6 +-
 lib/dpif-netdev-extract-avx512.c  | 360 +-
 lib/dpif-netdev-private-extract.c |  63 +-
 lib/dpif-netdev-private-extract.h |  12 +
 6 files changed, 445 insertions(+), 5 deletions(-)

-- 
2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn 0/3] Support mixing stateless and stateful ACLs regardless of their priority

2021-12-07 Thread Vladislav Odintsov


On 01.12.2021 15:56, Vladislav Odintsov wrote:

Currently if user has a stateless and statetul ACLs (allow-stateless and
allow-related) in one port group or in one logical switch simultaneously,
the stateless rules whould take precedence.
This patch series adds support for mixing all the ACLs types with the
respect to their priority.
This change requires next:

Also, as an optimisation, traffic from HW VTEP switch in ingress datapath
is passed from ls_in_l2_sec directly to ls_in_l2_lkup, as it doesn't need
any processing in ingress pipeline except determining outport in
ls_in_l2_lkup table.

Vladislav Odintsov (3):
   Revert "northd: support HW VTEP with stateful datapath"
   northd: send ingress packets from HW VTEP directly to L2_LKUP table
   northd: support mix of stateless ACL with lower priority than stateful

  northd/northd.c | 113 ++--
  northd/ovn-northd.8.xml |  35 -
  northd/ovn_northd.dl|  47 +
  tests/ovn-northd.at |  50 ++
  4 files changed, 114 insertions(+), 131 deletions(-)


Hi Numan,

is is possible to plan this series to be included in 21.12?

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

48 matches

Mail list logo